Open wildintellect opened 4 weeks ago
@wildintellect, are you sure the code snippet is correct? I think there are at least 2 problems: (1) mkdir
created an output dir before the outdir
var is defined, and (2) the last line uses outdir
which simply translates as the literal string "outdir"
, so it should change to $outdir
(or ${outdir}
).
I think this is what you want:
#!/bin/bash
basedir=$( cd "$(dirname "$0")" ; pwd -P )
outdir=${PWD}/output
mkdir -p "${outdir}"
country=${1}
#Rscript global_process_part1_2024_MAAP_step123.R $country
cd ${basedir}
conda run --no-capture-output --name gedi_pa_env Rscript global_process_part3_2024_MAAP_step5.r $country "${outdir}"
I've also surrounded ${outdir}
in double quotes in both spots, just to avoid potential problems with space in the path.
(1) works either way, you're right that setting the var 1st is probably best practices (2) I don't know how that typo crept in , I didn't give it to the user that way
So yes @chuckwondo 's example is technically cleaner
(1) works either way, you're right that setting the var 1st is probably best practices
It doesn't work when PWD is not the directory in which the script is sitting.
In other words, these 2 lines are not equivalent when PWD is not the directory containing the script containing these lines:
mkdir -p output
mkdir -p "${PWD}/output"
In the context of DPS jobs, we should always use "${PWD}/output"
and "${PWD}/input"
as the locations of the output and input directories that the DPS system expects us to use.
There is also a typo in the original code in the first comment of this issue (ouput
should be output
), so here's what should be a fully corrected example, with the addition of an input dir variable, for reference:
#!/bin/bash
basedir=$( cd "$(dirname "$0")" ; pwd -P )
# If your `algorithm_config.yaml` file has any inputs configured as type `file` (rather
# than `positional`), the DPS will automatically download such inputs (which must be
# URLs) and place the downloaded files into the input directory.
indir=${PWD}/input
# All outputs must be written to the output directory. We must create the full path to
# the output directory (including parent directories using the `-p` flag).
outdir=${PWD}/output
mkdir -p "${outdir}"
# Collect positional inputs
country=${1}
#Rscript global_process_part1_2024_MAAP_step123.R ${country}
cd "${basedir}"
conda run --no-capture-output --name gedi_pa_env Rscript global_process_part3_2024_MAAP_step5.r ${country} "${outdir}"
Turns out passing the
output
directory location to scripts is trickier than outlined in https://docs.maap-project.org/en/latest/technical_tutorials/dps_tutorial/dps_tutorial_demo.html#Output-folder Q: why do users need to make the output folder, if it's always needed as the correct location for outputs, should it exist already? @sujen1412Specifically, this does not show how to pass the correct path to scripts, e.g. python and R. Also is the case of R, R is very sensitive to working directories when importing other R files, so the script can't always be run from the default path DPS starts at, which means the output folder is actually up one level from where the script executes.
Here's an R example (We're going to need R examples on par with the python dps_tutorial)