Open dlebauer opened 4 years ago
(NB updated original post w/ this info)
We should look at https://github.com/PecanProject/pecan/blob/ec65c3c47a080729bd31832ca258c2e1fbe96dfb/models/ed/inst/template.job#L27
"@BINARY@" "@BINARY_ARGS@"
@BINARY@ comes from the database and is in the dbfiles table for that specific machine. @BINARY_ARGS@ comes from config.php and is passed along in the pecan.xml.
I'd propose a "binary" for ed2 (rgit) that is the shell script to actually execute the singularity image. The trick will be to make sure we have the right folder included. The main folder that will need to be mounted is the current working directory which contains the ED2IN and config.xml files.
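To make the proposal a bit more concrete, here is a minimal sketch of how the placeholder line from template.job would expand if @BINARY@ pointed at a wrapper script. The fill_template helper and the wrapper path are hypothetical and only for illustration; this is not PEcAn's own substitution code.

```shell
#!/bin/bash
# Hypothetical helper: replace the two placeholders in template text read from stdin.
# Uses '|' as the sed delimiter so that '/' in paths needs no escaping.
# Assumes the values contain no '|', '&', or backslash characters.
fill_template() {
  local binary="$1" binary_args="$2"
  sed -e "s|@BINARY@|${binary}|g" -e "s|@BINARY_ARGS@|${binary_args}|g"
}

# Example: the placeholder line from template.job, filled with a made-up wrapper path.
echo '"@BINARY@" "@BINARY_ARGS@"' | fill_template "/path/to/ed2_singularity.sh" "-s"
```

With a wrapper as the "binary", the rest of job.sh would not need to know that the model runs inside a container.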
@robkooper should we use singularity to call model2netcdf.ed at the end of job.sh? Or can we stick with the existing approach (which presumably requires installing PEcAn.ED on the remote)?
From an install perspective, I think it'd be much simpler to have model2netcdf within the model container, not outside of it. Plus then the container only needs to return the pecan standard output
This issue is stale because it has been open 365 days with no activity.
@KristinaRiemer and @julianpistorius what is the status of this issue?
We have the ed2_2.2.0_singularity.sh; I'm not sure if that's sufficient for this?
Here is what is in the file Kristina linked to: https://github.com/az-digitalag/model-vignettes/blob/34a0d48f7fcb2dfe86e5b3ae828014d17e4fa371/ED2/ed2_2.2.0_singularity.sh
#!/bin/bash
module load singularity
pwd
singularity run -B ~/pecan/sites:/data/sites \
-B ~/pecan/inputs/ed_inputs:/data/ed_inputs \
-B ~/pecan/inputs/faoOLD:/data/faoOLD \
-B ~/pecan/inputs/oge2OLD:/data/oge2OLD \
-B ~/pecan/tests/ed2:/data/tests/ed2 \
~/pecan/pecan-model-ed2.sif /usr/local/bin/ed.2.2.0 -s
But the job.sh also needs the line that calls model2netcdf.ED2. Is there a separate issue for that? In any case, we should get our work checked in as soon as it is viable.
The job.sh file is missing module load R. I think that might be it. Here's the relevant section of a job.sh currently:
# convert to MsTMIP
Rscript \
-e "library(PEcAn.ED2)" \
-e "model2netcdf.ED2('/groups/dlebauer/ed2_results/pecan_remote/2022-09-19-18-42-41/out/ENS-00042-1000000009', 35.8031, -76.6679, '2009-01-01', '2015-12-31', c('SetariaWT','temperate.Southern_Pine'))"
STATUS=$?
And the resulting error in the logfile.txt:
./job.sh: line 52: Rscript: command not found
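One way to make this kind of failure easier to diagnose is to check for Rscript before the conversion step runs. The sketch below is only an assumption about how such a guard could look; it is not part of the current job.sh, and require_cmd is a hypothetical helper name.

```shell
#!/bin/bash
# Hypothetical guard: print a clear message if a required command is not on PATH.
require_cmd() {
  local cmd="$1"
  if ! command -v "$cmd" >/dev/null 2>&1; then
    echo "ERROR: '$cmd' not found on PATH; is the right module loaded?" >&2
    return 1
  fi
}

# Example: check for Rscript before the model2netcdf step would run.
if require_cmd Rscript 2>/dev/null; then
  echo "Rscript available"
else
  echo "Rscript missing"
fi
```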
If you look at the config.php, you can see how that works from the UI:
$hostlist=array($fqdn => array(),
"geo.bu.edu" =>
array("displayname" => "geo",
"qsub" => "qsub -V -N @NAME@ -o @STDOUT@ -e @STDERR@ -S /bin/bash",
"jobid" => "Your job ([0-9]+) .*",
"qstat" => "qstat -j @JOBID@ || echo DONE",
"prerun" => "module load udunits R/R-3.0.0_gnu-4.4.6",
"postrun" => "sleep 60",
"models" =>
array("ED2" =>
array("prerun" => "module load hdf5"),
"ED2 (r82)" =>
array("prerun" => "module load hdf5")
)
)
);
This is then inserted in the pecan.xml at (for the model specific prerun):
<model>
<prerun>.....</prerun>
</model>
and for the host prerun:
<host>
<prerun>....</prerun>
</host>
and will then get inserted in the job.sh
@Aariq maybe need to uncomment these lines so that the PEcAn.ED2 package is installed? https://github.com/PecanProject/pecan/blob/develop/models/ed/Dockerfile#L52
Then in postrun tag you can call something like a postrun.sh that looks something like:
#!/bin/bash
module load singularity
singularity run ~/pecan/pecan-model-ed2.sif /usr/local/bin/Rscript -e \
"PEcAn.ED2::model2netcdf.ED2('/groups/dlebauer/ed2_results/pecan_remote/2021-09-13-18-49-11/out/ENS-00008-76', 40.0637, -88.202, '2004/04/01', '2004/08/31', c('SetariaWT','ebifarm.c3grass'))"
Not sure the easiest way to generate this per-run (could add it to the job.sh? read parameters from config.xml in run directory?)
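As one possible answer to the per-run question, a small generator could print the postrun script from values the caller already has. This is a sketch under assumptions: make_postrun is a hypothetical function, the .sif path is a placeholder, and nothing here reads config.xml.

```shell
#!/bin/bash
# Hypothetical generator: print a postrun script for one run.
# All values are supplied by the caller; nothing is read from config.xml here.
make_postrun() {
  local sif="$1" outdir="$2" lat="$3" lon="$4" start="$5" end="$6" pfts="$7"
  # Unquoted heredoc: variables expand, and '\\' becomes a literal backslash.
  cat <<EOF
#!/bin/bash
module load singularity
singularity run ${sif} /usr/local/bin/Rscript \\
  -e "PEcAn.ED2::model2netcdf.ED2('${outdir}', ${lat}, ${lon}, '${start}', '${end}', ${pfts})"
EOF
}

# Example with the run values from the post above; the .sif path is a placeholder.
make_postrun "/path/to/pecan-model-ed2.sif" \
  "/groups/dlebauer/ed2_results/pecan_remote/2021-09-13-18-49-11/out/ENS-00008-76" \
  40.0637 -88.202 "2004/04/01" "2004/08/31" "c('SetariaWT','ebifarm.c3grass')"
```

Redirecting the output to a file in the run directory would give each run its own postrun script.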
Since it starts from pecan/models, this includes all model R files.
Currently it looks like we're using module load openmpi3, so I think that can just be changed to module load openmpi3 R. That would hopefully at least change the error from "command Rscript not found" to something like "PEcAn.ED2 isn't installed". Then I can try uncommenting those lines in the Dockerfile.
Following is the dependency graph:
rocker/tidyverse
pecan/depends
pecan/base
pecan/models
pecan/model-ed2-2.2.0
So I think PEcAn.ED2 should be installed.
With <job.sh>module load openmpi3 R</job.sh>
I now get this in the logfile:
Error in library(PEcAn.ED2) : there is no package called ‘PEcAn.ED2’
Execution halted
ERROR IN model2netcdf.ED2
So I think either PEcAn.ED2 isn't installed in the container or the code in job.sh isn't running inside the container. Looking through the discussion on this issue, it looks like the plan was to have model2netcdf.ED2() run inside the singularity container, but it's not clear to me if it was implemented.
can you make sure that the R path is correct?
Tried this manual edit to job.sh like @dlebauer suggested:
# convert to MsTMIP
module load singularity
singularity run ~/pecan/pecan-model-ed2.sif /usr/local/bin/Rscript \
-e "library(PEcAn.ED2)" \
-e "model2netcdf.ED2('/groups/dlebauer/ed2_results/pecan_remote/2022-09-21-22-51-12/out/ENS-00001-76', 40.0637, -88.202, '2004-07-01', '2004-08-01', c('SetariaWT','ebifarm.c3grass'))"
Still results in a "there is no package called PEcAn.ED2" error in the logfile.
Whoops, I fixed the path to the .sif file to be absolute and it does now run model2netcdf.ED2(). So this theoretically works, but we need to think about design stuff to make this work without manually editing job.sh.
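Since the fix came down to giving singularity an absolute path to the .sif file, resolving the path once up front may help avoid the same problem later. A minimal sketch, assuming a hypothetical abs_path helper and a placeholder image name:

```shell
#!/bin/bash
# Hypothetical helper: print the absolute path of an existing file, or fail.
# Resolving once up front means later steps do not depend on '~' expansion
# or on the working directory.
abs_path() {
  local f="$1"
  if [ ! -f "$f" ]; then
    echo "ERROR: file not found: $f" >&2
    return 1
  fi
  # cd into the file's directory in a subshell and ask for the physical path
  (cd "$(dirname "$f")" && printf '%s/%s\n' "$(pwd -P)" "$(basename "$f")")
}

# Example: resolve an image path before it is handed to singularity.
# "pecan-model-ed2.sif" is only a placeholder file name here.
if sif=$(abs_path "pecan-model-ed2.sif" 2>/dev/null); then
  echo "would run: singularity run $sif ..."
else
  echo "image not found; nothing to run"
fi
```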
Talking to @Aariq:
The thinking is to add a new parameter to modellauncher called job, modify the code at https://github.com/PecanProject/pecan/blob/develop/base/workflow/R/start_model_runs.R#L117 to pass that parameter to setup, and write it at https://github.com/PecanProject/pecan/blob/develop/base/workflow/R/start_model_runs.R#L117 to the model launcher script.
No longer change the binary in the job.sh, leave that as is.
create shell script to run singularity:
#!/bin/bash
module load singularity
pwd
singularity run --pwd=$(pwd) --contain \
-B /groups/dlebauer/ed2_results/pecan_remote:/groups/dlebauer/ed2_results/pecan_remote \
-B /groups/dlebauer/ed2_results/inputs/julianp/sites:/data/sites \
-B /groups/dlebauer/ed2_results/inputs/julianp/ed_inputs:/data/ed_inputs \
-B /groups/dlebauer/ed2_results/inputs/julianp/faoOLD:/data/faoOLD \
-B /groups/dlebauer/ed2_results/inputs/julianp/oge2OLD:/data/oge2OLD \
-B /groups/dlebauer/ed2_results/inputs/julianp/tests/ed2:/data/tests/ed2 \
/groups/dlebauer/ed2_results/global_inputs/pecan-dev_ed2-dev.sif \
./job.sh
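The list of mounts in a script like this could also be kept as data, so that adding an input directory does not mean editing the command itself. This is only a sketch: build_bind_args and the image path are hypothetical, and the command is printed rather than executed.

```shell
#!/bin/bash
# Hypothetical helper: turn "host:container" pairs into repeated -B options.
# The result is stored in the global array BIND_ARGS.
build_bind_args() {
  BIND_ARGS=()
  local pair
  for pair in "$@"; do
    BIND_ARGS+=(-B "$pair")
  done
}

# Example using two of the mounts from the script above (printed, not executed).
build_bind_args \
  "/groups/dlebauer/ed2_results/inputs/julianp/sites:/data/sites" \
  "/groups/dlebauer/ed2_results/inputs/julianp/ed_inputs:/data/ed_inputs"
echo singularity run --contain "${BIND_ARGS[@]}" /path/to/image.sif ./job.sh
```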
we don't change the binary for the model in pecan.xml, that should be /usr/local/bin/ed2.git
Description
Outcomes:
ed_singularity.sh that assumes everything is available in the PWD
Proposed Solution
Need to have a shell script that we can call with the following approach that is in https://github.com/PecanProject/pecan/blob/ec65c3c47a080729bd31832ca258c2e1fbe96dfb/models/ed/inst/template.job#L27:
"@BINARY@" "@BINARY_ARGS@"
@BINARY@ comes from the database and is in the dbfiles table for that specific machine. @BINARY_ARGS@ comes from config.php and is passed along in the pecan.xml.
(from Rob) I'd propose a "binary" for ed2 (rgit) that is the shell script to actually execute the singularity image. The trick will be to make sure we have the right folder included. The main folder that will need to be mounted is the current working directory which contains the ED2IN and config.xml files.
And it is possible to mount the current working directory: -B ${PWD}:/work mounts the folder, where /work is where ED2IN and config.xml are.
Note that the ED2IN can be parsed to figure out the arguments to tell the singularity container where the inputs / data are.
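As a rough illustration of parsing the ED2IN for input locations, the sketch below lists the single-quoted absolute paths in a namelist-style file. It assumes settings look like NL%NAME = '/some/path' and that '!' starts a comment; list_ed2in_paths is a hypothetical helper and the sample file is made up, not a real ED2IN.

```shell
#!/bin/bash
# Hypothetical helper: list single-quoted absolute paths found in an ED2IN-style file.
# Assumes settings look like  NL%NAME = '/some/path'  and that '!' starts a comment
# (so a path containing '!' would be cut short).
list_ed2in_paths() {
  sed -e 's/!.*$//' "$1" | grep -o "'/[^']*'" | tr -d "'" | sort -u
}

# Example on a small made-up file (not a real ED2IN).
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
NL%VEG_DATABASE = '/data/oge2OLD/OGE2_'
NL%SOIL_DATABASE = '/data/faoOLD/FAO_'
! NL%UNUSED = '/not/used'
NL%RUNTYPE = 'INITIAL'
EOF
list_ed2in_paths "$tmp"
rm -f "$tmp"
```

The directories of the listed paths are candidates for -B mounts; whether they should map to the same location inside the container is a separate decision.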
Will start with ED2.r82 https://github.com/PecanProject/pecan/blob/develop/models/ed/inst/ED2IN.r82
Notes
Need to ask @ashiklom where this fits in the context of his work to date, e.g. linking w/ https://github.com/PecanProject/pecan/blob/develop/models/ed/R/run_ed_singularity.R
Some grepping for specific components of the file path like:
find all paths