Closed khurtado closed 4 years ago
Hi @khurtado, can you just try replacing /home/extract with /tmp/extract and see if that works?
@lukasheinrich: Changing /home/extract to /tmp/extract in the workflow steps didn't work. The problem is that the directories are not only defined in the YAML files but are also hardcoded in the Python scripts that ship inside the image. The steps also assume the starting directory is /home, so calls that use relative paths fail.
See some examples below:
https://github.com/scailfin/workflow-madminer/blob/master/docker/docker-madminer-physics/code/delphes.py#L82
https://github.com/scailfin/workflow-madminer/blob/master/docker/docker-madminer-physics/code/configurate.py#L93
https://github.com/scailfin/workflow-madminer/blob/master/docker/docker-madminer-physics/code/generate.py#L63-L79
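One way to stop depending on the starting directory is to resolve paths against the script's own location rather than the current directory. A minimal sketch of that idea; the helper name script_relative and the file name in the comment are hypothetical and not taken from the repository:

```python
import os


def script_relative(script_file, *parts):
    """Build an absolute path anchored at the directory containing script_file."""
    base_dir = os.path.dirname(os.path.abspath(script_file))
    return os.path.join(base_dir, *parts)


# Inside a script one would typically pass __file__ (hypothetical file name):
# card_path = script_relative(__file__, "cards", "run_card.dat")
```

With this, a script keeps working regardless of which directory the container runtime starts it in.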
Hi, this will require a rebuild of the image. I'm not sure how pervasive it is, but usually it helps to be able to change this using an environment variable, i.e. replacing the hardcoded path with os.environ['MADMINER_DATA'] or similar.
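A minimal sketch of the environment-variable approach, assuming the variable is called MADMINER_DATA as suggested above. The default directory and the helper name data_path are hypothetical and only serve as an illustration:

```python
import os

# Hypothetical fallback used when MADMINER_DATA is not set.
DEFAULT_DATA_DIR = "/home/data"


def data_path(filename, environ=None):
    """Return the path of filename inside the configurable data directory."""
    env = os.environ if environ is None else environ
    base = env.get("MADMINER_DATA", DEFAULT_DATA_DIR)
    return os.path.join(base, filename)
```

The workflow step would then export MADMINER_DATA to a writable location before calling the script, so the image itself no longer decides where data goes.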
@lukasheinrich I agree that using environment variables in the scripts and setting those in the workflows makes sense. Is there anybody in the NYU scailfin group maintaining this example who would be willing to make the changes and rebuild the image for that purpose?
Normally this would be @irinaespejo, but I'm not sure if she is available right now. Thank you for testing and finding the issues. I'll see if I can find someone to help. Possibly @heikomuller @alexanderheld ?
It's lines like this that are the problem, right? https://github.com/scailfin/workflow-madminer/blob/master/docker/docker-madminer-ml/code/configurate_ml.py#L65
@cranmer Correct!
Hi @khurtado, thank you for your detailed report and apologies for the late reply, I've been away for a while. You're right about the problem with the paths when using Singularity. Lukas' suggestion is helpful and I'll go that way. I'm working on it right now so that we don't have problems like this in the future. I'll get back to you when it's done, thanks for your patience!
@irinaespejo Thank you!
Hi @khurtado, I've made changes and tried them myself using Singularity, and it works. I decided the easiest option was to put everything in a separate folder called /madminer to avoid permission problems on /home. Let me know if you have any problems and what you think about the solution. Thanks for your interest.
@irinaespejo Awesome! Thank you, I will test this week and let you know how things go for me.
Hi @irinaespejo
I tested today, but got errors due to /madminer/data being in read-only mode.
Traceback (most recent call last):
  File "/madminer/code/configurate.py", line 93, in <module>
    miner.save('/madminer/data/madminer_example.h5')
  File "/usr/local/lib/python2.7/dist-packages/madminer/core.py", line 527, in save
    create_missing_folders([os.path.dirname(filename)])
  File "/usr/local/lib/python2.7/dist-packages/madminer/utils/various.py", line 55, in create_missing_folders
    os.makedirs(folder)
  File "/usr/lib/python2.7/os.py", line 157, in makedirs
    mkdir(name, mode)
OSError: [Errno 30] Read-only file system: '/madminer/data'
cp: cannot stat '/madminer/data/*.h5': No such file or directory
[Error] Execution failed with error code: 1
Hi @khurtado, sad to hear that. I can't reproduce your error, could you post the commands you're using please? Also, you were using REANA + HTCondor + Singularity, does that still apply? Thanks!
@irinaespejo I'm running it via VC3 with REANA + HTCondor + Singularity, but I get similar results executing yadage-run alone in the following way:
export PACKTIVITY_CONTAINER_RUNTIME=singularity
export SINGULARITY_CACHEDIR="/tmp/$(whoami)/singularity"
export LC_ALL=en_US.utf-8
export LANG=en_US.utf-8
mkdir demo; cd demo
git clone https://github.com/scailfin/workflow-madminer
cd workflow-madminer/example-full
yadage-run workdir workflow.yml -p inputfile='"inputs/input.yml"' -p njobs="6" -p ntrainsamples="2" -d initdir=$PWD --visualize
Once the above fails, the file workdir/configurate/_packtivity/configurate.run.log contains the following:
2019-09-09 16:58:09,580 | pack.configurate.run | INFO | starting file logging for topic: run
2019-09-09 16:58:16,705 | pack.configurate.run | INFO | inputfile: /home/khurtado/demos/workflow-madminer/example-full/inputs/input.yml
2019-09-09 16:58:16,710 | pack.configurate.run | INFO | Traceback (most recent call last):
2019-09-09 16:58:16,710 | pack.configurate.run | INFO | File "/madminer/code/configurate.py", line 93, in <module>
2019-09-09 16:58:16,711 | pack.configurate.run | INFO | miner.save('/madminer/data/madminer_example.h5')
2019-09-09 16:58:16,711 | pack.configurate.run | INFO | File "/usr/local/lib/python2.7/dist-packages/madminer/core.py", line 527, in save
2019-09-09 16:58:16,711 | pack.configurate.run | INFO | create_missing_folders([os.path.dirname(filename)])
2019-09-09 16:58:16,711 | pack.configurate.run | INFO | File "/usr/local/lib/python2.7/dist-packages/madminer/utils/various.py", line 55, in create_missing_folders
2019-09-09 16:58:16,712 | pack.configurate.run | INFO | os.makedirs(folder)
2019-09-09 16:58:16,712 | pack.configurate.run | INFO | File "/usr/lib/python2.7/os.py", line 157, in makedirs
2019-09-09 16:58:16,712 | pack.configurate.run | INFO | mkdir(name, mode)
2019-09-09 16:58:16,712 | pack.configurate.run | INFO | OSError: [Errno 30] Read-only file system: '/madminer/data'
2019-09-09 16:58:16,857 | pack.configurate.run | INFO | cp: cannot stat '/madminer/data/*.h5': No such file or directory
A single singularity command with the error would be:
$ singularity exec -C -B /home:/home --pwd /tmp/_sing_home_X02HH2/f6BoUL -H /tmp/_sing_home_X02HH2 docker://madminertool/docker-madminer-physics:latest sh -c 'mkdir /madminer/data'
mkdir: cannot create directory '/madminer/data': Read-only file system
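The failure above could be caught early by probing whether a directory is really writable before using it. A minimal sketch, not taken from the repository; the helper name first_writable_dir is hypothetical, and which candidate directories to try is left to the caller:

```python
import os
import tempfile


def first_writable_dir(candidates):
    """Return the first candidate that can be created and written to, else None."""
    for path in candidates:
        try:
            if not os.path.isdir(path):
                os.makedirs(path)
            # Creating a temporary file is the real test; it fails on read-only mounts.
            with tempfile.TemporaryFile(dir=path):
                pass
            return path
        except OSError:
            continue
    return None
```

A script could, for example, try /madminer/data first and fall back to a directory under the current working directory when the image file system is mounted read-only.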
Hi @irinaespejo, did you get a chance to look into this? Let me know if there is anything else you need to reproduce the problem. Thanks!
Hi @khurtado, sorry I've been a bit busy. I was able to reproduce your error. I have an idea of what might work, I'll let you know how that turns out. Thank you!
Hi @khurtado, could you try the commands you posted again and see if you still get the error? Thanks!
Hi @irinaespejo. Still the same error. I made sure to clean the Singularity cache. Has anything changed in the code, though? I haven't noticed any new commits in this repo since Sep 17/18.
2019-09-24 14:12:29,065 | pack.configurate.run | INFO | File "/usr/local/lib/python2.7/dist-packages/madminer/utils/various.py", line 55, in create_missing_folders
2019-09-24 14:12:29,065 | pack.configurate.run | INFO | os.makedirs(folder)
2019-09-24 14:12:29,065 | pack.configurate.run | INFO | File "/usr/lib/python2.7/os.py", line 157, in makedirs
2019-09-24 14:12:29,065 | pack.configurate.run | INFO | mkdir(name, mode)
2019-09-24 14:12:29,066 | pack.configurate.run | INFO | OSError: [Errno 30] Read-only file system: '/madminer/data'
2019-09-24 14:12:29,167 | pack.configurate.run | INFO | cp: cannot stat '/madminer/data/*.h5': No such file or directory
Hi @irinaespejo. Have you had a chance to look into this? Let me know if there is anything I can help with.
@irinaespejo Just pinging about this to keep the thread alive :)
Hi @irinaespejo
I'm trying to execute interactively. I get things running up to the combine step, but then sampling gives me the error below. Have you seen this? I had to revert madminer to 0.5.0, because otherwise delphes would complain about systematics.
EDIT: Oh, it seems the combine script only ended up copying the first delphes file in the list. Why wasn't combine_and_shuffle used to combine all delphes files?
$ echo $data_file
/reana/users/00000000-0000-0000-0000-000000000000/workflows/test/combine/combined_delphes.h5
$ echo $input_file
/reana/users/00000000-0000-0000-0000-000000000000/workflows/test/inputs/input.yml
$ python configurate_ml.py 1 $data_file $input_file
['sally', 'alices', 'alice']
12:17 madminer.analysis INFO Loading data from /reana/users/00000000-0000-0000-0000-000000000000/workflows/test/combine/combined_delphes.h5
12:17 madminer.analysis WARNING Inconsistent event numbers in HDF5 file! Please recalculate them by calling combine_and_shuffle(recalculate_header=True).
12:17 madminer.analysis INFO Found 2 parameters
12:17 madminer.analysis INFO Did not find nuisance parameters
12:17 madminer.analysis INFO Found 6 benchmarks, of which 6 physical
12:17 madminer.analysis INFO Found 2 observables
12:17 madminer.analysis INFO Found 5823 events
12:17 madminer.analysis INFO 982 signal events sampled from benchmark morphing_basis_vector_4
12:17 madminer.analysis INFO Found morphing setup with 6 components
12:17 madminer.analysis INFO Did not find nuisance morphing setup
sampling from method sally
12:17 madminer.sampling INFO Extracting training sample for local score regression. Sampling and score evaluation according to sm
12:17 madminer.sampling INFO Starting sampling serially
12:17 madminer.sampling INFO Sampling from parameter point 1 / 1
Traceback (most recent call last):
  File "configurate_ml.py", line 189, in <module>
    filename=method+'_train'
  File "/usr/local/lib/python2.7/dist-packages/madminer/sampling.py", line 323, in sample_train_local
    double_precision=double_precision,
  File "/usr/local/lib/python2.7/dist-packages/madminer/sampling.py", line 1400, in _sample
    double_precision=double_precision,
  File "/usr/local/lib/python2.7/dist-packages/madminer/sampling.py", line 1509, in _sample_set
    generated_close_to=None if not sample_only_from_closest_benchmark else theta_value_sampling,
  File "/usr/local/lib/python2.7/dist-packages/madminer/analysis.py", line 331, in xsecs
    generated_close_to=generated_close_to,
  File "/usr/local/lib/python2.7/dist-packages/madminer/analysis.py", line 151, in event_loader
    return_sampling_ids=return_sampling_ids,
  File "/usr/local/lib/python2.7/dist-packages/madminer/utils/interfaces/madminer_hdf5.py", line 243, in madminer_event_loader
    this_observations = this_observations[cut]
IndexError: boolean index did not match indexed array along dimension 0; dimension is 3494 but corresponding boolean dimension is 982
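For the combine question above, a rough sketch of collecting every delphes output and passing the whole list to combine_and_shuffle, the madminer function named in the warning. The helper names, the *.h5 pattern, and the positional arguments of combine_and_shuffle are assumptions made for illustration, not what the repository's combine script does:

```python
import glob
import os


def collect_delphes_files(directory, pattern="*.h5"):
    """Return every file in directory matching pattern, sorted for reproducibility."""
    return sorted(glob.glob(os.path.join(directory, pattern)))


def combine_all(directory, output_file):
    """Combine all matching files instead of copying only the first one."""
    # Imported lazily so the collection helper works without madminer installed.
    from madminer.sampling import combine_and_shuffle

    files = collect_delphes_files(directory)
    if not files:
        raise ValueError("no input files found in %s" % directory)
    # Assumption: positional arguments are (input_filenames, output_filename).
    combine_and_shuffle(files, output_file)
```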
@irinaespejo Just so you know, the following changes work for me with singularity. I still need to check with shifter.
https://github.com/khurtado/workflow-madminer/commit/c4b3ac66820a3fe676c2d479f958346897640c20
Closing issue, after confirmation with @khurtado over Slack.
Hello,
I'm trying to test this example in REANA + HTCondor + Singularity and noticed some failures that I just wanted to report.
The steps assume the starting working directory will be /home when calling code/ (example). So each script either needs a cd /home at the beginning or needs to use full paths.
Steps that try to create directories in /home (like here) won't succeed unless the container is entered as root (the Docker case, but not Singularity), which results in Read-only file system errors like the one below. It would be great if the workflow could be adapted so that directories are created relative to the working directory in the container (so it is up to the container technology invocation to make sure that relative path has write access), like in this example.
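The request to create directories relative to the working directory could look roughly like this. A minimal sketch; the helper name ensure_output_dir and the directory name in the comment are hypothetical:

```python
import os


def ensure_output_dir(name, workdir=None):
    """Create (if needed) and return an output directory under the working directory."""
    base = os.getcwd() if workdir is None else workdir
    path = os.path.join(base, name)
    if not os.path.isdir(path):
        os.makedirs(path)
    return path


# Hypothetical usage inside a workflow step:
# data_dir = ensure_output_dir("data")
```

Because the base is the current working directory, write access becomes the responsibility of whoever invokes the container (Docker, Singularity, or another runtime) rather than of the image layout.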