geoschem / integrated_methane_inversion

Integrated Methane Inversion workflow repository.
https://imi.readthedocs.org
MIT License

Running IMI in docker container #99

Closed: pradhyumna85 closed this issue 5 months ago

pradhyumna85 commented 1 year ago

Hi, I was wondering whether this docker image (which has a compiled gcclassic binary and spack) can be used to run the IMI?

Do we have any examples or a guide related to this?

laestrada commented 1 year ago

Hi @pradhyumna85,

We are planning to create a purpose-built IMI docker container. I currently have a branch feature/imi_dockerfile with a dockerfile that builds spack, conda, and the dependencies for running gcclassic. At this point it is untested, and I still need to update the dockerfile to install the IMI-specific conda environment and slurm, but it is probably better equipped than the gcclassic 13.3.3 image.

If there is enough interest in an IMI docker container we can increase the priority. May I ask what your application for a dockerized IMI is?

pradhyumna85 commented 1 year ago

My thinking and use case is that having a docker image for software like gcclassic and the Integrated Methane Inversion makes it effortless to quickly start running, learning, and testing, without going through the pain of compiling from source and manually installing all the dependencies.

pradhyumna85 commented 1 year ago

Hi @laestrada, How can I pull 753979222379.dkr.ecr.us-east-1.amazonaws.com/imi-docker-repository:latest to my pc?

Docker pull gives an error regarding credentials on my machine: docker pull 753979222379.dkr.ecr.us-east-1.amazonaws.com/imi-docker-repository:latest

Error response from daemon: Head "https://753979222379.dkr.ecr.us-east-1.amazonaws.com/v2/imi-docker-repository/manifests/latest": no basic auth credentials
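
(For reference, pulling from a private ECR registry normally requires logging in with the AWS CLI first, and only works if your AWS identity has been granted access to the repository:

aws ecr get-login-password --region us-east-1 | \
  docker login --username AWS --password-stdin 753979222379.dkr.ecr.us-east-1.amazonaws.com
docker pull 753979222379.dkr.ecr.us-east-1.amazonaws.com/imi-docker-repository:latest

As the next comment explains, the repository is not yet public, so this will fail without granted access.)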

laestrada commented 1 year ago

Hi @pradhyumna85,

The docker image listed here is still in development and not quite ready for public distribution, so the repository for the built image is not yet public.

If you'd like to test it out before official release, you can feel free to build the images yourself by slightly modifying the dockerfiles in my docker container branch. Specifically, first build the dockerfile in integrated_methane_inversion/resources/containers/base-image/ (this installs all the dependencies needed for the IMI) and then modify the dockerfile in integrated_methane_inversion/resources/containers/ to use the base-image as the starting point. The second dockerfile adds the imi and geoschem source code.
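
A build sequence along these lines should work (image tags are illustrative, not official names):

# 1. build the base image with all IMI dependencies
docker build -t imi-base-image integrated_methane_inversion/resources/containers/base-image/
# 2. point the FROM line of the second dockerfile at imi-base-image, then build it
docker build -t imi-docker-image integrated_methane_inversion/resources/containers/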

pradhyumna85 commented 1 year ago

Okay @laestrada, that completely makes sense. Have you tested running the IMI using the latest docker image? I.e., is it running successfully in some simple tests?

laestrada commented 1 year ago

It more or less works, but it does not yet efficiently allocate the resources provided by the host.

pradhyumna85 commented 1 year ago

@laestrada, what kind of problem is currently there? E.g., is it running slowly in some specific parts of the run_imi.sh script?

In my experience using the IMI docker image and running the default entrypoint yml case, most of the time went into downloading data from TROPOMI (about 8-9 GB). After that it took some time and the simulation went fine; imi_output.log was also generated.

I did have to set up my AWS credentials in /root/.aws/credentials before running the entrypoint script; otherwise the script failed at the TROPOMI S3 data download step.
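
For reference, a minimal /root/.aws/credentials file has the standard AWS CLI format (placeholder values, not real keys):

[default]
aws_access_key_id = <your-access-key-id>
aws_secret_access_key = <your-secret-access-key>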

Also, I started the container with the --shm-size=2gb parameter, just to be safe.

I also didn't use sbatch; I just ran the script as ./run_imi.sh .... > log.txt in entrypoint.sh, with UseSlurm set to false in the yml.

Let me know your view.

laestrada commented 1 year ago

Glad to hear you got it working! Mainly the issue is resource allocation with sbatch. Specifically when running the jacobian simulations we use sbatch to run simulations in parallel, but currently we do not have an automated method for specifying the memory allocated to each jacobian simulation, so it only runs a few simulations at a time (default mem is too much). Like you said, it works, but it is slow. Working on a fix.

FYI, you can also pass in your AWS credentials as environment variables by specifying them in your docker run command.
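
Something like this (the image name is illustrative, and the placeholder values are yours to fill in):

docker run --shm-size=2gb \
  -e AWS_ACCESS_KEY_ID=<your-access-key-id> \
  -e AWS_SECRET_ACCESS_KEY=<your-secret-access-key> \
  imi-docker-image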

pradhyumna85 commented 1 year ago

@laestrada, thank you for the info. How much time does the default yml case usually take to run in a non-container environment, just as a reference?

pradhyumna85 commented 1 year ago

@laestrada, did you try various slurm parameters, either in the slurm config or passed as sbatch parameters?

https://slurm.schedmd.com/slurm.conf.html (see DefMemPerCPU and MaxMemPerCPU)
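
For example, per-CPU memory defaults and limits can be set in slurm.conf like this (values in MB; these are illustrative and untested with the IMI container):

DefMemPerCPU=2000
MaxMemPerCPU=4000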

laestrada commented 1 year ago

@pradhyumna85 runtime depends on the number of cores and memory, but using a c5.9xlarge instance on aws we see:

IMI Runtime | Total Input Data | GEOS-Chem NA Met Fields | Boundary Conditions | HEMCO | TROPOMI data
65 Minutes  | 28GB             | 21GB                    | 113MB               | 3GB   | 4.3GB

Re slurm: I am updating the IMI config file to let users adjust the amount of simulation memory and CPUs, which should fix this. It's something we needed to do anyway to accommodate varying region-of-interest sizes.
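
For reference, the resulting config variables, as they appear in the updated config file quoted later in this thread:

SimulationCPUs: 32
SimulationMemory: 32000
JacobianCPUs: 1
JacobianMemory: 2000
RequestedTime: "0-6:00"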

pradhyumna85 commented 1 year ago

@laestrada, Thank you for the info and update. Looking forward to the fixes.🙂

pradhyumna85 commented 1 year ago

@laestrada, I ran 1 case (input yml is given below), but the imi_output.log is stuck at:

=== DONE CREATING PREVIEW RUN DIRECTORY ===

=== RUNNING IMI PREVIEW ===
Submitted batch job 3

I have been waiting for about 3 hours. Let me know what can be done to fix this.

squeue output:

             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
                 3     debug Test_Per     root PD       0:00      1 (Resources)
                 2     debug run_imi.     root  R    2:59:02      1 d3c3e9775010

System:

input yml:

## IMI configuration file
## Documentation @ https://imi.readthedocs.io/en/latest/getting-started/imi-config-file.html

## General
RunName: "Test_Permian_3days_inv"
isAWS: true
UseSlurm: true
SafeMode: true

## Period of interest
StartDate: 20180501
EndDate: 20180503
SpinupMonths: 1

## Region of interest
##   These lat/lon bounds are only used if CreateAutomaticRectilinearStateVectorFile: true
##   Otherwise lat/lon bounds are determined from StateVectorFile
LonMin: -105
LonMax: -103
LatMin: 31
LatMax: 33

## Use nested grid simulation?
##   Must be "true" for IMI regional inversions
NestedGrid: true

## Select nested grid region (for using pre-cropped meteorological fields)
##   Current options are listed below with ([lat],[lon]) bounds:
##     "AF" : Africa ([-37,40], [-20,53])
##     "AS" : Asia ([-11,55],[60,150]) 
##     "EU" : Europe ([33,61],[-30,70])
##     "ME" : Middle East ([12,44], [-20,70])
##     "NA" : North America ([10,70],[-140,-40])
##     "OC" : Oceania ([-50,5], [110,180])
##     "RU" : Russia ([41,83], [19,180])
##     "SA" : South America ([-59,16], [-88,-31])
##     ""   : Use for global met fields (global simulation/custom nested grids)
##   For example, if the region of interest is in Europe ([33,61],[-30,70]), select "EU".
NestedRegion: "NA"

## State vector
CreateAutomaticRectilinearStateVectorFile: true
nBufferClusters: 8
BufferDeg: 5
LandThreshold: 0.25

## Clustering Options
ReducedDimensionStateVector: false
ClusteringPairs:
  - [1, 15]
  - [2, 24]
ForcedNativeResolutionElements: 
  - [31.5, -104]

## Custom state vector
StateVectorFile: "/home/al2/integrated_methane_inversion/resources/statevectors/StateVector.nc"
ShapeFile: "/home/al2/integrated_methane_inversion/resources/shapefiles/PermianBasin_Extent_201712.shp"

## Inversion
PriorError: 0.5
ObsError: 15
Gamma: 1.0
PrecomputedJacobian: false

## Grid
##   Select "0.25x0.3125" and "geosfp", or "0.5x0.625" and "merra2"
Res: "0.25x0.3125"
Met: "geosfp"

## Setup modules
##   Turn on/off different steps in setting up the inversion 
SetupTemplateRundir: true
SetupSpinupRun: true
SetupJacobianRuns: true
SetupInversion: true
SetupPosteriorRun: true

## Run modules
##   Turn on/off different steps in performing the inversion
RunSetup: true
DoSpinup: true
DoJacobian: true
DoInversion: true
DoPosterior: true

## IMI preview
DoPreview: true
DOFSThreshold: 0

##====================================================================
##
## Advanced Settings (optional)
##
##====================================================================

## These settings are intended for advanced users who wish to:
##   a. modify additional GEOS-Chem options, or
##   b. run the IMI on a local cluster.
## They can be ignored for any standard cloud application of the IMI.

##--------------------------------------------------------------------
## Additional settings for GEOS-Chem simulations
##--------------------------------------------------------------------

## Jacobian settings
PerturbValue: 1.5

## Apply scale factors from a previous inversion?
UseEmisSF: false
UseOHSF: false

## Save out hourly diagnostics from GEOS-Chem?
## For use in satellite operators via post-processing -- required for TROPOMI
## inversions
HourlyCH4: true

## Turn on planeflight diagnostic in GEOS-Chem?
## For use in comparing GEOS-Chem against planeflight data. The path
## to those data must be specified in input.geos.
PLANEFLIGHT: false

## Turn on old observation operators in GEOS-Chem?
## These will save out text files comparing GEOS-Chem to observations, but have
## to be manually incorporated into the IMI
GOSAT: false
TCCON: false
AIRS: false

##--------------------------------------------------------------------
## Settings for running on a local cluster
##--------------------------------------------------------------------

## Path for IMI runs and output
OutputPath: "/home/al2/imi_output_dir"

## Path to GEOS-Chem input data
DataPath: "/home/al2/ExtData"

## Environment files
CondaFile: "/opt/conda/etc/profile.d/conda.sh"
CondaEnv: "imi_env"

## Download initial restart file from AWS S3?
RestartDownload: true

## Path to initial GEOS-Chem restart file + prefix
##   ("YYYYMMDD_0000z.nc4" will be appended)
RestartFilePrefix: "/home/al2/ExtData/BoundaryConditions/GEOSChem.BoundaryConditions."
RestartFilePreviewPrefix: "/home/al2/ExtData/BoundaryConditions/GEOSChem.BoundaryConditions."

## Path to GEOS-Chem boundary condition files (for nested grid simulations)
BCpath: "/home/al2/ExtData/BoundaryConditions"

## Options to download missing GEOS-Chem input data from AWS S3
##   NOTE: You will be charged if your ec2 instance is not in the
##         us-east-1 region.
PreviewDryRun: true
SpinupDryrun: true
ProductionDryRun: true
PosteriorDryRun: true
BCdryrun: true

laestrada commented 1 year ago

I suspect that because there are no slurm resources specified for run_imi.sh, it is taking all available resources, so the secondary sbatch job has no resources available to start. You can run scontrol show job <id-number> for more information on the job. I would try updating entrypoint.sh to launch with memory and cpus specified, e.g.:

sbatch -W --mem 2000 -c 1 run_imi.sh resources/containers/container_config.yml; wait;

Or alternatively just do: ./run_imi.sh resources/containers/container_config.yml

Please report back if this works... I have only tested using a c5.9xlarge instance (36 cores, 72GB memory), so I am curious whether it works with your current system.

pradhyumna85 commented 1 year ago

@laestrada,

bash-4.2# scontrol show job 3

JobId=3 JobName=Test_Permian_3days_inv_Preview.run
UserId=root(0) GroupId=root(0) MCS_label=N/A
Priority=4294901758 Nice=0 Account=(null) QOS=(null)
JobState=PENDING Reason=Resources Dependency=(null)
Requeue=1 Restarts=0 BatchFlag=1 Reboot=0 ExitCode=0:0
RunTime=00:00:00 TimeLimit=UNLIMITED TimeMin=N/A
SubmitTime=2023-05-04T16:11:22 EligibleTime=2023-05-04T16:11:22
AccrueTime=2023-05-04T16:11:22
StartTime=2024-05-03T15:56:17 EndTime=Unknown Deadline=N/A
SuspendTime=None SecsPreSuspend=0 LastSchedEval=2023-05-04T18:02:15 Scheduler=Main
Partition=debug AllocNode:Sid=d3c3e9775010:20704
ReqNodeList=(null) ExcNodeList=(null)
NodeList=
NumNodes=1-1 NumCPUs=15 NumTasks=1 CPUs/Task=15 ReqB:S:C:T=0:0:*:*
ReqTRES=cpu=15,mem=15902M,node=1,billing=15
AllocTRES=(null)
Socks/Node=* NtasksPerN:B:S:C=0:0:*:* CoreSpec=*
MinCPUsNode=15 MinMemoryNode=0 MinTmpDiskNode=0
Features=(null) DelayBoot=00:00:00
OverSubscribe=OK Contiguous=0 Licenses=(null) Network=(null)
Command=/home/al2/imi_output_dir/Test_Permian_3days_inv/preview_run/Test_Permian_3days_inv_Preview.run
WorkDir=/home/al2/imi_output_dir/Test_Permian_3days_inv/preview_run
StdErr=/home/al2/imi_output_dir/Test_Permian_3days_inv/preview_run/slurm-3.out
StdIn=/dev/null
StdOut=/home/al2/imi_output_dir/Test_Permian_3days_inv/preview_run/slurm-3.out
Power=
MailUser=root MailType=END

Let me know what you think of the above output.

I will also try the changes you suggested and report back.

laestrada commented 1 year ago

Yeah, JobState=PENDING Reason=Resources implies the resources requested for the job (15 cores and 15902MB) are not free. This won't be an issue once I finish my branch allowing specification of memory and CPU number from the config file. For now, changing entrypoint.sh to use my second suggestion should work.
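
For reference, a minimal entrypoint.sh along those lines (a sketch using the config path from earlier in this thread; the log filename is arbitrary):

#!/bin/bash
# run the IMI driver directly rather than via sbatch, so the preview job
# it submits can claim the container's resources
./run_imi.sh resources/containers/container_config.yml > imi_run.log 2>&1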

pradhyumna85 commented 1 year ago

@laestrada, I re-ran with the --mem 2000 -c 1 parameters in entrypoint.sh, but the second job still shows as pending.

squeue

JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
6     debug Test_Per     root PD       0:00      1 (Resources)
5     debug run_imi.     root  R       4:21      1 d3c3e9775010

scontrol show job 6

JobId=6 JobName=Test_Permian_3days_inv_Preview.run
UserId=root(0) GroupId=root(0) MCS_label=N/A
Priority=4294901755 Nice=0 Account=(null) QOS=(null)
JobState=PENDING Reason=Resources Dependency=(null)
Requeue=1 Restarts=0 BatchFlag=1 Reboot=0 ExitCode=0:0
RunTime=00:00:00 TimeLimit=UNLIMITED TimeMin=N/A
SubmitTime=2023-05-04T20:25:26 EligibleTime=2023-05-04T20:25:26
AccrueTime=2023-05-04T20:25:26
StartTime=2024-05-03T20:23:56 EndTime=Unknown Deadline=N/A
SuspendTime=None SecsPreSuspend=0 LastSchedEval=2023-05-04T20:26:53 Scheduler=Main
Partition=debug AllocNode:Sid=d3c3e9775010:384725
ReqNodeList=(null) ExcNodeList=(null)
NodeList=
NumNodes=1-1 NumCPUs=15 NumTasks=1 CPUs/Task=15 ReqB:S:C:T=0:0:*:*
ReqTRES=cpu=15,mem=15902M,node=1,billing=15
AllocTRES=(null)
Socks/Node=* NtasksPerN:B:S:C=0:0:*:* CoreSpec=*
MinCPUsNode=15 MinMemoryNode=0 MinTmpDiskNode=0
Features=(null) DelayBoot=00:00:00
OverSubscribe=OK Contiguous=0 Licenses=(null) Network=(null)
Command=/home/al2/imi_output_dir/Test_Permian_3days_inv/preview_run/Test_Permian_3days_inv_Preview.run
WorkDir=/home/al2/imi_output_dir/Test_Permian_3days_inv/preview_run
StdErr=/home/al2/imi_output_dir/Test_Permian_3days_inv/preview_run/slurm-6.out
StdIn=/dev/null
StdOut=/home/al2/imi_output_dir/Test_Permian_3days_inv/preview_run/slurm-6.out
Power=
MailUser=root MailType=END

Windows RAM usage from task manager: [screenshot omitted]

laestrada commented 1 year ago

Yes, you should replace that command with ./run_imi.sh resources/containers/container_config.yml. The issue is that the preview slurm job, by default, requests all resources in the container. This means it is waiting for job 5 to finish and free up resources, but job 5 is waiting for job 6 to finish... and so on. So until we have config variables set up to allocate resources to the various sbatch jobs, the IMI run command will need to be called directly.

pradhyumna85 commented 1 year ago

@laestrada, my run succeeded after the latest commits. Thanks a lot for the support. However, for one of the case runs (Permian with a 4-day window), I see some errors in the logs:

=== DONE CREATING TEMPLATE RUN DIRECTORY ===

=== CREATING IMI PREVIEW RUN DIRECTORY ===
mkdir: created directory ‘preview_run’

Executing dry-run for preview run...
Log with unique file paths written to: log.dryrun.unique
Downloading data from amazon

=== DONE CREATING PREVIEW RUN DIRECTORY ===

=== RUNNING IMI PREVIEW ===
Submitted batch job 1
Submitted batch job 2

=== DONE RUNNING IMI PREVIEW ===
src/components/preview_component/preview.sh: line 106: bc: command not found
src/components/preview_component/preview.sh: line 106: [: : integer expression expected

=== CREATING SPINUP RUN DIRECTORY ===
mkdir: created directory ‘spinup_run’

WARNING: Changing restart field entry in HEMCO_Config.rc to read the field from a boundary condition file. Please revert SpeciesBC_ back to SpeciesRst_ for subsequent runs.

Executing dry-run for spinup run...
Log with unique file paths written to: log.dryrun.unique
Downloading data from amazon

=== DONE CREATING SPINUP RUN DIRECTORY ===

=== CREATING POSTERIOR RUN DIRECTORY ===
mkdir: created directory ‘posterior_run’

Executing dry-run for posterior run...
fatal error: An error occurred (404) when calling the HeadObject operation: Key "GEOSCHEM_RESTARTS/v2020-02/initial_GEOSChem_rst.2x25_CH4.nc" does not exist
Log with unique file paths written to: log.dryrun.unique
Downloading data from amazon

=== DONE CREATING POSTERIOR RUN DIRECTORY ===

=== CREATING JACOBIAN RUN DIRECTORIES ===
mkdir: created directory ‘jacobian_runs’
mkdir: created directory ‘./jacobian_runs/Test_Permian_4days_0000’

Executing dry-run for production runs...
fatal error: An error occurred (404) when calling the HeadObject operation: Key "GEOSCHEM_RESTARTS/v2020-02/initial_GEOSChem_rst.2x25_CH4.nc" does not exist
Log with unique file paths written to: log.dryrun.unique
Downloading data from amazon
mkdir: created directory ‘./jacobian_runs/Test_Permian_4days_0001’
mkdir: created directory ‘./jacobian_runs/Test_Permian_4days_0002’
mkdir: created directory ‘./jacobian_runs/Test_Permian_4days_0003’
.
.
.

error lines:

=== DONE RUNNING IMI PREVIEW ===
src/components/preview_component/preview.sh: line 106: bc: command not found
src/components/preview_component/preview.sh: line 106: [: : integer expression expected

___

Executing dry-run for production runs...
fatal error: An error occurred (404) when calling the HeadObject operation: Key "GEOSCHEM_RESTARTS/v2020-02/initial_GEOSChem_rst.2x25_CH4.nc" does not exist
Log with unique file paths written to: log.dryrun.unique
Downloading data from amazon

input yaml:

## IMI configuration file
## Documentation @ https://imi.readthedocs.io/en/latest/getting-started/imi-config-file.html

## General
RunName: "Test_Permian_4days"
isAWS: true
UseSlurm: true
SafeMode: true

## Period of interest
StartDate: 20221225
EndDate: 20221230
SpinupMonths: 1

## Region of interest
##   These lat/lon bounds are only used if CreateAutomaticRectilinearStateVectorFile: true
##   Otherwise lat/lon bounds are determined from StateVectorFile
LonMin: -105
LonMax: -103
LatMin: 31
LatMax: 33

## Use nested grid simulation?
##   Must be "true" for IMI regional inversions
NestedGrid: true

## Select nested grid region (for using pre-cropped meteorological fields)
##   Current options are listed below with ([lat],[lon]) bounds:
##     "AF" : Africa ([-37,40], [-20,53])
##     "AS" : Asia ([-11,55],[60,150]) 
##     "EU" : Europe ([33,61],[-30,70])
##     "ME" : Middle East ([12,44], [-20,70])
##     "NA" : North America ([10,70],[-140,-40])
##     "OC" : Oceania ([-50,5], [110,180])
##     "RU" : Russia ([41,83], [19,180])
##     "SA" : South America ([-59,16], [-88,-31])
##     ""   : Use for global met fields (global simulation/custom nested grids)
##   For example, if the region of interest is in Europe ([33,61],[-30,70]), select "EU".
NestedRegion: "NA"

## State vector
CreateAutomaticRectilinearStateVectorFile: true
nBufferClusters: 8
BufferDeg: 5
LandThreshold: 0.25

## Clustering Options
ReducedDimensionStateVector: false
ClusteringPairs:
  - [1, 15]
  - [2, 24]
ForcedNativeResolutionElements: 
  - [31.5, -104]

## Custom state vector
StateVectorFile: "/home/al2/integrated_methane_inversion/resources/statevectors/StateVector.nc"
ShapeFile: "/home/al2/integrated_methane_inversion/resources/shapefiles/PermianBasin_Extent_201712.shp"

## Inversion
PriorError: 0.5
ObsError: 15
Gamma: 1.0
PrecomputedJacobian: false

## Grid
##   Select "0.25x0.3125" and "geosfp", or "0.5x0.625" and "merra2"
Res: "0.25x0.3125"
Met: "geosfp"

## Setup modules
##   Turn on/off different steps in setting up the inversion 
SetupTemplateRundir: true
SetupSpinupRun: true
SetupJacobianRuns: true
SetupInversion: true
SetupPosteriorRun: true

## Run modules
##   Turn on/off different steps in performing the inversion
RunSetup: true
DoSpinup: true
DoJacobian: true
DoInversion: true
DoPosterior: true

## IMI preview
DoPreview: true
DOFSThreshold: 0

##====================================================================
##
## Advanced Settings (optional)
##
##====================================================================

## These settings are intended for advanced users who wish to:
##   a. modify additional GEOS-Chem options, or
##   b. run the IMI on a local cluster.
## They can be ignored for any standard cloud application of the IMI.

##--------------------------------------------------------------------
## Additional settings for GEOS-Chem simulations
##--------------------------------------------------------------------

## Jacobian settings
PerturbValue: 1.5

## Apply scale factors from a previous inversion?
UseEmisSF: false
UseOHSF: false

## Save out hourly diagnostics from GEOS-Chem?
## For use in satellite operators via post-processing -- required for TROPOMI
## inversions
HourlyCH4: true

## Turn on planeflight diagnostic in GEOS-Chem?
## For use in comparing GEOS-Chem against planeflight data. The path
## to those data must be specified in input.geos.
PLANEFLIGHT: false

## Turn on old observation operators in GEOS-Chem?
## These will save out text files comparing GEOS-Chem to observations, but have
## to be manually incorporated into the IMI
GOSAT: false
TCCON: false
AIRS: false

## resources to allocate to slurm jobs
SimulationCPUs: 32
SimulationMemory: 32000
JacobianCPUs: 1
JacobianMemory: 2000
RequestedTime: "0-6:00"

##--------------------------------------------------------------------
## Settings for running on a local cluster
##--------------------------------------------------------------------

## Path for IMI runs and output
OutputPath: "/home/al2/imi_output_dir"

## Path to GEOS-Chem input data
DataPath: "/home/al2/ExtData"

## Environment files
CondaFile: "/opt/conda/etc/profile.d/conda.sh"
CondaEnv: "imi_env"

## Download initial restart file from AWS S3?
RestartDownload: true

## Path to initial GEOS-Chem restart file + prefix
##   ("YYYYMMDD_0000z.nc4" will be appended)
RestartFilePrefix: "/home/al2/ExtData/BoundaryConditions/GEOSChem.BoundaryConditions."
RestartFilePreviewPrefix: "/home/al2/ExtData/BoundaryConditions/GEOSChem.BoundaryConditions."

## Path to GEOS-Chem boundary condition files (for nested grid simulations)
BCpath: "/home/al2/ExtData/BoundaryConditions"

## Options to download missing GEOS-Chem input data from AWS S3
##   NOTE: You will be charged if your ec2 instance is not in the
##         us-east-1 region.
PreviewDryRun: true
SpinupDryrun: true
ProductionDryRun: true
PosteriorDryRun: true
BCdryrun: true

Let me know your thoughts.

Thank you.

laestrada commented 1 year ago

Hi @pradhyumna85,

=== DONE RUNNING IMI PREVIEW ===
src/components/preview_component/preview.sh: line 106: bc: command not found
src/components/preview_component/preview.sh: line 106: [: : integer expression expected

This suggests that the check on the expected DOFS from the IMI preview is failing; based on your output, the error check is not performing as intended.

Is your run producing a preview_diagnostics.txt file in the preview_run directory? If not, this could indicate a failure in the GEOS-Chem simulation or in the subsequent processing by the imi_preview.py script.

As for this error:


Executing dry-run for production runs...
fatal error: An error occurred (404) when calling the HeadObject operation: Key "GEOSCHEM_RESTARTS/v2020-02/initial_GEOSChem_rst.2x25_CH4.nc" does not exist
Log with unique file paths written to: log.dryrun.unique
Downloading data from amazon

This is a known issue with the data download and does not affect successful execution of the IMI.

pradhyumna85 commented 1 year ago

@laestrada, I did find the preview_diagnostics.txt file in the simulation's preview_run folder. Have a look at the contents:

~$ cat /home/al2/imi_output_dir/Test_Permian_4days/preview_run/preview_diagnostics.txt
##Found 835 observations in the region of interest
##approximate cost = $0.54 for on-demand instance
##                 = $0.18 for spot instance
##Total prior emissions in region of interest = 0.276890150265462 Tg/y

##k = [1.25903 1.25903 1.25903 1.25903 1.25903 1.25903 1.25903 1.25903 1.25903
 1.25903 1.25903 1.25903 1.25903 1.25903 1.25903 1.25903 1.25903 1.25903
 1.25903 1.25903 1.25903 1.25903 1.25903 1.25903 1.25903 1.25903 1.25903
 1.25903 1.25903 1.25903 1.25903 1.25903 1.25903 1.25903 1.25903 1.25903
 1.25903 1.25903 1.25903 1.25903 1.25903 1.25903 1.25903 1.25903 1.25903
 1.25903 1.25903 1.25903 1.25903 1.25903 1.25903 1.25903 1.25903 1.25903
 1.25903 1.25903 1.25903 1.25903 1.25903 1.25903 1.25903 1.25903 1.25903] kg-1 m2 s
##a = [7.100e-04 1.890e-03 5.630e-03 5.250e-03 3.100e-03 1.200e-04 0.000e+00
 3.510e-03 2.590e-03 1.642e-02 7.960e-03 1.257e-02 5.500e-04 0.000e+00
 6.230e-03 9.900e-04 6.200e-04 1.900e-04 7.700e-04 7.200e-04 0.000e+00
 9.620e-03 1.000e-04 9.600e-04 2.200e-04 5.300e-04 1.000e-05 0.000e+00
 1.379e-02 3.300e-04 1.420e-03 7.100e-04 4.100e-04 0.000e+00 0.000e+00
 1.560e-03 4.820e-03 2.200e-04 5.000e-05 0.000e+00 0.000e+00 1.000e-05
 3.670e-03 7.210e-03 1.400e-04 1.200e-04 0.000e+00 0.000e+00 0.000e+00
 4.740e-03 2.390e-03 1.600e-04 0.000e+00 0.000e+00 0.000e+00 0.000e+00
 5.100e-04 7.000e-05 6.000e-05 0.000e+00 0.000e+00 1.000e-05 0.000e+00]

expectedDOFS: 0.12365

Let me know your thoughts.

laestrada commented 1 year ago

Hmm, I think this is a bug. Your diagnostics file looks fine, so it shouldn't be tripping that if statement and proceeding to bc (also a bug). I will do some further investigation and push a fix for this.

laestrada commented 1 year ago

@pradhyumna85 I looked into the bug and bc is a dependency I did not add to the base container. It seems to be a fairly needless dependency, so I moved the DOFS threshold check to within the imi_preview.py python script to eliminate the need for bc. The change has been pushed to the dockerfile branch.
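
Roughly, the check now happens in Python rather than shelling out to bc. A rough sketch (function and variable names are illustrative, not the actual imi_preview.py code):

import sys

def check_dofs_threshold(expected_dofs: float, dofs_threshold: float) -> None:
    # compare the preview's expected DOFS against the configured DOFSThreshold
    # and stop the workflow early if the expected information content is too low
    if expected_dofs < dofs_threshold:
        print(f"Expected DOFS of {expected_dofs} is below the DOFSThreshold of {dofs_threshold}; exiting.")
        sys.exit(1)

# e.g., with the values from the diagnostics file above and the config used here:
check_dofs_threshold(0.12365, 0)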

Additionally, I have updated the branch to use a compose.yml file to more easily configure and update the imi config.yml file without having to manually rebuild the container every time. I have added documentation to the readme on how this works.
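
The compose setup is roughly along these lines (service name and mount paths are illustrative; see the readme for the actual file):

services:
  imi:
    image: imi-docker-image        # illustrative image name
    shm_size: "2gb"
    volumes:
      # mount the IMI config from the host so it can be edited without rebuilding
      - ./config.yml:/home/al2/integrated_methane_inversion/config.yml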

github-actions[bot] commented 5 months ago

This issue has been automatically marked as stale because it has not had recent activity. If there are no updates within 7 days it will be closed. You can add the "never stale" tag to prevent this issue from closing.

github-actions[bot] commented 5 months ago

Closing due to inactivity