PecanProject / pecan

The Predictive Ecosystem Analyzer (PEcAn) is an integrated ecological bioinformatics toolbox.
www.pecanproject.org

Write shell script to run ed2 singularity using existing jobsh templates #2540

Open dlebauer opened 4 years ago

dlebauer commented 4 years ago

Description

Outcomes:

Proposed Solution

We need a shell script that can be called with the approach used in https://github.com/PecanProject/pecan/blob/ec65c3c47a080729bd31832ca258c2e1fbe96dfb/models/ed/inst/template.job#L27:

"@BINARY@" "@BINARY_ARGS@"

@BINARY@ comes from the database; it is in the dbfiles table for that specific machine.
@BINARY_ARGS@ comes from config.php and is passed along in the pecan.xml.

(from Rob) I'd propose a "binary" for ed2 (rgit) that is the shell script to actually execute the singularity image. The trick will be to make sure we have the right folder included. The main folder that will need to be mounted is the current working directory which contains the ED2IN and config.xml files.

ed_singularity.sh ARGS

And it is possible to mount the current working directory using

-B ${PWD}:/work --pwd /work

where /work is the directory inside the container that holds ED2IN and config.xml, and -B ${PWD}:/work mounts the current working directory there.

note that the ED2IN can be parsed to figure out the arguments to tell the singularity container where the inputs / data are

#!/bin/bash
# placeholder, not runnable as written: host directory of the ED2 inputs, to be derived from ED2IN
data={some magic to get from ED2IN}
singularity exec -B ${data}/ed_inputs:/data/ed_inputs -B ${data}/faoOLD:/data/faoOLD -B ${data}/oge2OLD:/data/oge2OLD -B ${data}/sites:/data/sites -B ${PWD}:/work --pwd /work ./model-ed2-git.simg ed2.git -s

Will start with ED2.r82 https://github.com/PecanProject/pecan/blob/develop/models/ed/inst/ED2IN.r82

Notes

Need to ask @ashiklom where this fits in the context of his work to date, e.g. linking w/ https://github.com/PecanProject/pecan/blob/develop/models/ed/R/run_ed_singularity.R

Some grepping for specific components of the file path like:

grep "'/" ED2IN

NL%FFILOUT = '/data/testrun.s83/analy/ts83'
NL%SFILOUT = '/data/testrun.s83/histo/ts83'
! NL%SFILIN = '/mypath/P'
! NL%SFILIN = '/data/sites/Santarem_Km83/s83_default.'
NL%VEGDATABASE = '/data/oge2OLD/OGE2'
NL%SOILDATABASE = '/data/faoOLD/FAO'
NL%LU_DATABASE = '/data/ed_inputs/glu/'
NL%THSUMS_DATABASE = '/data/ed_inputs/'
NL%ED_MET_DRIVER_DB = '/data/sites/Santarem_Km83/ED_MET_DRIVER_HEADER'
NL%PHENPATH = '/n/moorcroft_data/data/ed2_data/phenology/phenology'

find all paths

grep "'/" ED2IN | sed 's/!.*//' | sed "s/^.*'\(.*\)'$/\1/" | grep -v '^ ' 
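One possible shape for the "some magic" step, as a rough sketch rather than a tested implementation: reuse the grep/sed idea above to list the uncommented absolute paths in ED2IN, then turn each distinct /data/<name> directory into a -B argument. The helper names ed2in_paths and ed2in_bind_args and the host-side data_root layout are assumptions for illustration. The sed expression is a slightly stricter variant of the one above, so that commented-out lines produce no output at all.

```shell
#!/bin/bash
# Sketch: derive singularity -B arguments from the absolute paths in an ED2IN.
# Assumption: inputs live under /data/<name>/ inside the container and under
# <data_root>/<name>/ on the host, as in the example script in this issue.

# Print each uncommented, single-quoted absolute path found in an ED2IN file.
ed2in_paths() {
  grep "'/" "$1" | sed 's/!.*//' | sed -n "s/^.*'\(\/[^']*\)'.*$/\1/p"
}

# Print one "-B host:container" pair per distinct /data/<name> directory.
ed2in_bind_args() {
  local ed2in="$1"
  local data_root="$2"
  ed2in_paths "$ed2in" \
    | sed -n 's|^/data/\([^/]*\)/.*$|\1|p' \
    | sort -u \
    | while read -r name; do
        printf -- '-B %s/%s:/data/%s\n' "$data_root" "$name" "$name"
      done
}
```

Called as ed2in_bind_args ED2IN "$data", this prints one -B pair per line, which a wrapper could splice into the singularity exec call. Output directories under /data (such as testrun.s83 in the listing above) are picked up as well, and paths outside /data (such as the PHENPATH entry) are skipped, so both cases would still need a decision.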

robkooper commented 4 years ago

(NB updated original post w/ this info)

We should look at https://github.com/PecanProject/pecan/blob/ec65c3c47a080729bd31832ca258c2e1fbe96dfb/models/ed/inst/template.job#L27

"@BINARY@" "@BINARY_ARGS@"

@BINARY@ comes from the database; it is in the dbfiles table for that specific machine.
@BINARY_ARGS@ comes from config.php and is passed along in the pecan.xml.

I'd propose a "binary" for ed2 (rgit) that is the shell script to actually execute the singularity image. The trick will be to make sure we have the right folder included. The main folder that will need to be mounted is the current working directory which contains the ED2IN and config.xml files.
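To make the substitution concrete, here is a minimal sketch of how the "@BINARY@" "@BINARY_ARGS@" line could be filled in. It uses sed purely for illustration and is not meant to reflect how PEcAn itself writes job.sh; fill_template and the example wrapper path are hypothetical.

```shell
#!/bin/bash
# Sketch only: fill the "@BINARY@" "@BINARY_ARGS@" line of a job template.
# fill_template is a hypothetical helper, not part of PEcAn.

# Reads template text on stdin and writes the filled-in text to stdout.
fill_template() {
  local binary="$1"
  local binary_args="$2"
  # '|' is the sed delimiter so that '/' in paths needs no escaping.
  sed -e "s|@BINARY@|${binary}|g" -e "s|@BINARY_ARGS@|${binary_args}|g"
}
```

For example, piping the template line through fill_template /usr/local/bin/ed_singularity.sh -s yields "/usr/local/bin/ed_singularity.sh" "-s", so the wrapper script would be executed in place of the ED2 binary. Values containing | or & would need escaping before being passed to sed.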

dlebauer commented 4 years ago

@robkooper should we use singularity to call model2netcdf.ed at the end of job.sh? Or can we stick with the existing approach (which presumably requires installing PEcAn.ED on the remote)?

https://github.com/PecanProject/pecan/blob/ec65c3c47a080729bd31832ca258c2e1fbe96dfb/models/ed/inst/template.job#L51

mdietze commented 4 years ago

From an install perspective, I think it'd be much simpler to have model2netcdf within the model container, not outside of it. Plus then the container only needs to return the pecan standard output

github-actions[bot] commented 3 years ago

This issue is stale because it has been open 365 days with no activity.

dlebauer commented 3 years ago

@KristinaRiemer and @julianpistorius what is the status of this issue?

KristinaRiemer commented 3 years ago

We have the ed2_2.2.0_singularity.sh, but I'm not sure if that's sufficient for this.

dlebauer commented 3 years ago

Here is what is in the file Kristina linked to: https://github.com/az-digitalag/model-vignettes/blob/34a0d48f7fcb2dfe86e5b3ae828014d17e4fa371/ED2/ed2_2.2.0_singularity.sh


#!/bin/bash

module load singularity
pwd
singularity run -B ~/pecan/sites:/data/sites \
-B ~/pecan/inputs/ed_inputs:/data/ed_inputs \
-B ~/pecan/inputs/faoOLD:/data/faoOLD \
-B ~/pecan/inputs/oge2OLD:/data/oge2OLD \
-B ~/pecan/tests/ed2:/data/tests/ed2 \
~/pecan/pecan-model-ed2.sif /usr/local/bin/ed.2.2.0 -s

but the job.sh also needs the line that calls model2netcdf.ED2 - is there a separate issue for that? In any case, we should get our work checked in as soon as it is viable.

Aariq commented 1 year ago

job.sh file is missing module load R. I think that might be it. Here's the relevant section of a job.sh currently:

  # convert to MsTMIP
  Rscript \
    -e "library(PEcAn.ED2)" \
    -e "model2netcdf.ED2('/groups/dlebauer/ed2_results/pecan_remote/2022-09-19-18-42-41/out/ENS-00042-1000000009', 35.8031, -76.6679, '2009-01-01', '2015-12-31', c('SetariaWT','temperate.Southern_Pine'))"
  STATUS=$?

And the resulting error in the logfile.txt:

./job.sh: line 52: Rscript: command not found

robkooper commented 1 year ago

If you look at the config.php, you can see how that works from the UI:

$hostlist=array($fqdn => array(),
                "geo.bu.edu" =>
                    array("displayname" => "geo",
                          "qsub"        => "qsub -V -N @NAME@ -o @STDOUT@ -e @STDERR@ -S /bin/bash",
                          "jobid"       => "Your job ([0-9]+) .*",
                          "qstat"       => "qstat -j @JOBID@ || echo DONE",
                          "prerun"      => "module load udunits R/R-3.0.0_gnu-4.4.6",
                          "postrun"     => "sleep 60",
                          "models"      =>
                              array("ED2" =>
                                        array("prerun"  => "module load hdf5"),
                                    "ED2 (r82)" =>
                                        array("prerun"  => "module load hdf5")
                              )
                          )
                );

This is then inserted into the pecan.xml. For the model-specific prerun:

<model>
  <prerun>.....</prerun>
</model>

and for the host prerun:

<host>
  <prerun>....</prerun>
</host>

and will then get inserted in the job.sh

dlebauer commented 1 year ago

@Aariq maybe need to uncomment these lines so that the PEcAn.ED2 package is installed? https://github.com/PecanProject/pecan/blob/develop/models/ed/Dockerfile#L52

Then in the postrun tag you can call something like a postrun.sh, along these lines:

#!/bin/bash

module load singularity
# the trailing backslash keeps the R expression attached to -e
singularity run ~/pecan/pecan-model-ed2.sif /usr/local/bin/Rscript -e \
       "PEcAn.ED2::model2netcdf.ED2('/groups/dlebauer/ed2_results/pecan_remote/2021-09-13-18-49-11/out/ENS-00008-76',
                               40.0637, -88.202, '2004/04/01', '2004/08/31', c('SetariaWT','ebifarm.c3grass'))"

Not sure the easiest way to generate this per-run (could add it to the job.sh? read parameters from config.xml in run directory?)
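One way to avoid hard-coding the call, sketched under the assumption that the run's output directory, coordinates, dates and PFT names are available to the script as arguments (where they would come from is still the open question above). build_m2n_call is a hypothetical helper that only assembles the R expression; it does not run anything.

```shell
#!/bin/bash
# Sketch: build the model2netcdf.ED2() call from per-run values instead of
# hard-coding it. build_m2n_call is a hypothetical helper.

# Usage: build_m2n_call OUTDIR LAT LON START END PFT [PFT ...]
build_m2n_call() {
  local outdir="$1" lat="$2" lon="$3" start="$4" end="$5"
  shift 5
  # Remaining arguments are PFT names; join them as 'a','b',...
  local pfts=""
  local pft
  for pft in "$@"; do
    pfts="${pfts}${pfts:+,}'${pft}'"
  done
  printf "PEcAn.ED2::model2netcdf.ED2('%s', %s, %s, '%s', '%s', c(%s))\n" \
    "$outdir" "$lat" "$lon" "$start" "$end" "$pfts"
}
```

The resulting string could then be handed to Rscript -e inside the container, for example as -e "$(build_m2n_call "$outdir" "$lat" "$lon" "$start" "$end" SetariaWT ebifarm.c3grass)", with the variable names here being placeholders.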

robkooper commented 1 year ago

Since it starts from pecan/models, this includes all model R files.

Aariq commented 1 year ago

Currently it looks like we're using a tag to add module load openmpi3, so I think that can just be changed to module load openmpi3 R. That would hopefully at least change the error from "command Rscript not found" to something like "PEcAn.ED2 isn't installed". Then I can try uncommenting those lines in the Dockerfile.

robkooper commented 1 year ago

Following is the dependency graph, from base image to model image:

rocker/tidyverse
pecan/depends
pecan/base
pecan/models
pecan/model-ed2-2.2.0

So I think PEcAn.ED2 should be installed.

Aariq commented 1 year ago

With <job.sh>module load openmpi3 R</job.sh> I now get this in the logfile:

Error in library(PEcAn.ED2) : there is no package called ‘PEcAn.ED2’
Execution halted
ERROR IN model2netcdf.ED2

So I think either PEcAn.ED2 isn't installed in the container or the code in job.sh isn't running inside the container. Looking through the discussion on this issue it looks like the plan was to have model2netcdf.ED2() run inside the singularity container, but not clear to me if it was implemented.

robkooper commented 1 year ago

can you make sure that the R path is correct?

Aariq commented 1 year ago

Tried this manual edit to job.sh like @dlebauer suggested:

# convert to MsTMIP
  module load singularity
  singularity run ~/pecan/pecan-model-ed2.sif /usr/local/bin/Rscript \
    -e "library(PEcAn.ED2)" \
    -e "model2netcdf.ED2('/groups/dlebauer/ed2_results/pecan_remote/2022-09-21-22-51-12/out/ENS-00001-76', 40.0637, -88.202, '2004-07-01', '2004-08-01', c('SetariaWT','ebifarm.c3grass'))"

Still results in a "there is no package called PEcAn.ED2" error in the logfile.

Aariq commented 1 year ago

Whoops, I fixed the path to the .sif file to be absolute and it does now run model2netcdf.ED2(). So this theoretically works, but need to think about design stuff to make this work without manually editing job.sh
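A small guard along these lines might catch this kind of path problem earlier. It is only a sketch: resolve_sif is a hypothetical helper, and the thread does not say why the ~/ path failed, so the check is limited to making the path absolute and confirming the file exists.

```shell
#!/bin/bash
# Sketch: resolve the image path to an absolute one and fail early if missing.
# resolve_sif is a hypothetical helper.

resolve_sif() {
  local sif="$1"
  # Expand a leading "~/" by hand; a quoted tilde is not expanded by the shell.
  case $sif in
    "~/"*) sif=${HOME}/${sif#"~/"} ;;
  esac
  # Make relative paths absolute with respect to the current directory.
  case $sif in
    /*) ;;
    *) sif=${PWD}/${sif} ;;
  esac
  if [ ! -f "$sif" ]; then
    echo "image not found: $sif" >&2
    return 1
  fi
  printf '%s\n' "$sif"
}
```

In job.sh this could be used as sif=$(resolve_sif "$image") || exit 1 before the singularity run line, where image is a placeholder variable for whatever path the run is configured with.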

robkooper commented 1 year ago

talking to @Aariq

The thinking is to add a new parameter called job to modellauncher, modify the code at https://github.com/PecanProject/pecan/blob/develop/base/workflow/R/start_model_runs.R#L117 to pass that parameter to setup, and write it at https://github.com/PecanProject/pecan/blob/develop/base/workflow/R/start_model_runs.R#L117 to the model launcher script.

No longer change the binary in the job.sh, leave that as is.

create shell script to run singularity:

#!/bin/bash

module load singularity
pwd
singularity run --pwd=$(pwd) --contain \
    -B /groups/dlebauer/ed2_results/pecan_remote:/groups/dlebauer/ed2_results/pecan_remote \
    -B /groups/dlebauer/ed2_results/inputs/julianp/sites:/data/sites \
    -B /groups/dlebauer/ed2_results/inputs/julianp/ed_inputs:/data/ed_inputs \
    -B /groups/dlebauer/ed2_results/inputs/julianp/faoOLD:/data/faoOLD \
    -B /groups/dlebauer/ed2_results/inputs/julianp/oge2OLD:/data/oge2OLD \
    -B /groups/dlebauer/ed2_results/inputs/julianp/tests/ed2:/data/tests/ed2 \
    /groups/dlebauer/ed2_results/global_inputs/pecan-dev_ed2-dev.sif \
      ./job.sh

We don't change the binary for the model in pecan.xml; that should stay /usr/local/bin/ed2.git.