Open landmanbester opened 2 months ago
Yep, and you'll also need to add it to the bind paths in the singularity backend options. Just as a workaround for now.
My proposed sustainable solution:
add it as an optional Directory-type input (writable: true) of the cab
add a new policy to stimela, i.e. policy: pass_as_envvar: NAME_OF_ENVVAR.
Then these sorts of things could be available across all backends uniformly.
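A rough sketch of how such a policy might be resolved, purely to illustrate the proposal. Only the policy name pass_as_envvar comes from the suggestion above; the function collect_envvars, the dict-based schema layout and the example ms parameter are hypothetical and are not stimela API.

```python
def collect_envvars(schema, params):
    """Map parameter values to environment variables.

    Hypothetical helper: looks for a `pass_as_envvar` policy on each
    input in `schema` and, if the parameter has a value, exports it
    under the requested variable name.
    """
    env = {}
    for name, spec in schema.items():
        envvar = spec.get("policies", {}).get("pass_as_envvar")
        if envvar and params.get(name) is not None:
            env[envvar] = str(params[name])
    return env


# Illustrative cab schema: an optional, writable Directory input
# carrying the proposed policy, plus an ordinary input without it.
schema = {
    "numba_cache_dir": {
        "dtype": "Directory",
        "writable": True,
        "policies": {"pass_as_envvar": "NUMBA_CACHE_DIR"},
    },
    "ms": {"dtype": "MS"},
}
params = {
    "numba_cache_dir": "/home/bester/projects/ESO137/numba_cache",
    "ms": "obs.ms",
}

print(collect_envvars(schema, params))
```

Because the directory is declared as an input, a backend that needs to mount it would learn about it from the schema rather than from a separate bind path setting.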
I tried the above but it doesn't seem like the environment variable is getting passed through correctly. If I look at the dumped config file I see the following for the backend settings:
opts:
  backend:
    default_registry: quay.io/stimela2
    override_registries: {}
    select: singularity
    singularity:
      enable: true
      image_dir: /home/bester/.singularity
      auto_build: true
      rebuild: true
      executable: null
      remote_only: false
      bind_dirs:
        /home/bester/projects/ESO137: rw
If I look at the cab that gets invoked I also see the following under the management section:
management:
  environment:
    NUMBA_CACHE_DIR: /home/bester/projects/ESO137/numba_cache
  cleanup: {}
  wranglers: {}
But printing the numba cache dir from inside a worker produces
# Numba cache =
which means it hasn't been set. I guess I could dive into the singularity backend but wouldn't even be sure what to look for. Any ideas?
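One way to narrow this down is to check, from inside the container, whether the variable is set at all and whether the directory it points to is visible and writable. The following is a minimal diagnostic sketch; check_cache_dir is a hypothetical helper, not part of stimela or numba.

```python
import os
import tempfile


def check_cache_dir(envvar="NUMBA_CACHE_DIR", environ=None):
    """Report whether `envvar` is set and points to a writable directory."""
    environ = os.environ if environ is None else environ
    path = environ.get(envvar)
    if not path:
        return f"{envvar} is not set"
    if not os.path.isdir(path):
        return f"{envvar}={path} is set but the directory is not visible here"
    try:
        # Creating a temporary file is a direct test of write access.
        with tempfile.TemporaryFile(dir=path):
            pass
    except OSError as exc:
        return f"{envvar}={path} is visible but not writable: {exc}"
    return f"{envvar}={path} is set and writable"


print(check_cache_dir())
```

"Not set" points at the environment not being forwarded, whereas "not visible" or "not writable" points at the bind path.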
Hilariously, I ran into this during a pipeline run yesterday. I have implemented a very basic fix in the isse334-basic-fix branch. This adds env to the SingularityBackendOptions, and it functions in the same way as the kube backend. This is what it looks like in my recipe:
selfcal-1:
  info: |
    Use quartical to perform basic selfcal. Solves for a delay and phase
    term per scan. Note that the selfcal step may require tuning based
    on the field and instrument in question.
  _use: lib.steps.quartical.k
  backend:
    singularity:
      bind_dirs:
        /home/kenyon/numba_cache_dir: rw
      env:
        NUMBA_CACHE_DIR: /home/kenyon/numba_cache_dir
  params:
    K.time_interval: 4
I appreciate that this may not be the best solution, but it is a simple one which can be used to check that this is the root cause of the original error.
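For illustration, here is how bind_dirs and env settings like those above could translate into a singularity command line. This is a sketch under assumptions, not the code in the branch: singularity_exec_args and the image name are hypothetical. The --bind src:dest:opts and --env NAME=value options are standard Singularity flags (the latter requires Singularity 3.6 or newer).

```python
import shlex


def singularity_exec_args(image, command, bind_dirs=None, env=None):
    """Build a `singularity exec` argument list (hypothetical helper)."""
    args = ["singularity", "exec"]
    # Bind each host path at the same location inside the container.
    for path, mode in (bind_dirs or {}).items():
        args += ["--bind", f"{path}:{path}:{mode}"]
    # Forward each environment variable explicitly.
    for name, value in (env or {}).items():
        args += ["--env", f"{name}={value}"]
    return args + [image] + list(command)


args = singularity_exec_args(
    "quartical.simg",  # hypothetical image name
    ["printenv", "NUMBA_CACHE_DIR"],
    bind_dirs={"/home/kenyon/numba_cache_dir": "rw"},
    env={"NUMBA_CACHE_DIR": "/home/kenyon/numba_cache_dir"},
)
print(shlex.join(args))
```

Running printenv as the command is a quick way to confirm that the variable actually reaches the container.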
In principle, all backends should likely support an env parameter. The environment field in the cab management section doesn't seem to actually be used anywhere inside stimela at present. I would argue that it feels more natural to set these as part of the backend settings than by modifying the cab.
Edit: I am still rerunning the recipe to see if this has solved the problem.
The above also has the advantage of being easily configurable for both the entire recipe and per step.
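To make the recipe-wide versus per-step point concrete, here is a minimal sketch of layering step-level backend settings over recipe-level ones. The merge_settings function is hypothetical and only approximates the idea with plain dicts; it is not how stimela merges its configuration. The /scratch paths are made up for the example.

```python
def merge_settings(base, override):
    """Recursively merge `override` into a copy of `base`.

    Nested dicts are merged key by key; any other value in `override`
    replaces the one in `base`.
    """
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_settings(merged[key], value)
        else:
            merged[key] = value
    return merged


# Recipe-wide default (illustrative values).
recipe_backend = {
    "select": "singularity",
    "singularity": {"env": {"NUMBA_CACHE_DIR": "/scratch/numba_cache"}},
}
# A single step overriding the cache location.
step_backend = {
    "singularity": {"env": {"NUMBA_CACHE_DIR": "/scratch/selfcal_cache"}},
}

effective = merge_settings(recipe_backend, step_backend)
print(effective["singularity"]["env"]["NUMBA_CACHE_DIR"])
```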
This does seem to have fixed the issue for me (assuming it isn't intermittent).
Awesome, this seems to have fixed the problem for me. Thanks @JSKenyon
The environment field in the cab management section doesn't seem to actually be used anywhere inside stimela at present. I would argue that it feels more natural to set these as part of the backend settings than by modifying the cab.
Yeah the management: environment field was inherited from old Stimela but not yet implemented. Off the top of my head, I do see three categories of environment variables:
1. Things that are actually task inputs masquerading as environment variables. The numba cache dir is arguably one of them. I think these should be handled by defining them as an input/output, with a special policy. The benefit of this is that the various backends needing to mount directories (singularity, kube) then know about this explicitly.
2. Environment settings needed to massage the tool into running properly. CASA used to require these. management: environment was actually put in for this purpose.
3. Backend-specific environment settings.
We have this option for the kube backend but not for singularity. It is useful for things like caches. In my case I ran into
which I am assuming happens because numba is caching directly to the singularity image and this must have some sort of limit set? The workaround I am testing is to merge in a separate config which sets
Is this the correct thing to do?