DataBiosphere / toil

A scalable, efficient, cross-platform (Linux/macOS) and easy-to-use workflow engine in pure Python.
http://toil.ucsc-cgl.org/.
Apache License 2.0

CWL: Too many bind mounts in Singularity. #3358

Open · DailyDreaming opened this issue 3 years ago

DailyDreaming commented 3 years ago

Discussion and the original issue are here: https://cwl.discourse.group/t/too-many-arguments-on-the-command-line/248/2

Short summary: If I pass a directory array (Directory[]) as input to a step and run the workflow with Singularity, then whenever the file list is too long I get a "Too many arguments on the command line" error. I am currently running the workflow with Toil.

[2020-12-02T11:21:48+0100] [MainThread] [W] [toil.leader] The job seems to have left a log file, indicating failure: 'file:///project/astroneosc/Software/prefactor3-cwl/lofar-cwl/steps/check_ateam_separation.cwl#check_ateam_separation' python3 /usr/local/bin/check_Ateam_separation.py kind-file_project_astroneosc_Software_prefactor3-cwl_lofar-cwl_steps_check_ateam_separation.cwl_check_ateam_separation/instance-r0brc7sq
[2020-12-02T11:21:48+0100] [MainThread] [W] [toil.leader] Log from job kind-file_project_astroneosc_Software_prefactor3-cwl_lofar-cwl_steps_check_ateam_separation.cwl_check_ateam_separation/instance-r0brc7sq follows:
=========>
        /table.dat:ro \
            --bind \
            /project/astroneosc/Data/tmp/node-70e26f65-197b-49f9-90aa-52b42e8d7822-4b184c8e-e9fd-4784-92c4-5ace3fd7ef2c/tmp074diqpq/31cdf995-536c-4d07-9b48-c72e6df42315/tmpsvnitjnw.tmp:/var/lib/cwl/stga6073e5c-0ba5-472f-9790-6480440e0258/L755125_SB222_uv.MS/table.f4_TSM0:ro \
            --bind \
            /project/astroneosc/Data/tmp/node-70e26f65-197b-49f9-90aa-52b42e8d7822-4b184c8e-e9fd-4784-92c4-5ace3fd7ef2c/tmp074diqpq/31cdf995-536c-4d07-9b48-c72e6df42315/tmpfw7whow7.tmp:/var/lib/cwl/stga6073e5c-0ba5-472f-9790-6480440e0258/L755125_SB222_uv.MS/DATA_DESCRIPTION/table.info:ro \
            --bind \
            /project/astroneosc/Data/tmp/node-70e26f65-197b-49f9-90aa-52b42e8d7822-4b184c8e-e9fd-4784-92c4-5ace3fd7ef2c/tmp074diqpq/31cdf995-536c-4d07-9b48-c72e6df42315/tmp1o7zp770.tmp:/var/lib/cwl/stga6073e5c-0ba5-472f-9790-6480440e0258/L755125_SB222_uv.MS/DATA_DESCRIPTION/table.f0:ro \
            --bind \
            /project/astroneosc/Data/tmp/node-70e26f65-197b-49f9-90aa-52b42e8d7822-4b184c8e-e9fd-4784-92c4-5ace3fd7ef2c/tmp074diqpq/31cdf995-536c-4d07-9b48-c72e6df42315/tmp8clmv0ww.tmp:/var/lib/cwl/stga6073e5c-0ba5-472f-9790-6480440e0258/L755125_SB222_uv.MS/DATA_DESCRIPTION/table.dat:ro \
            --bind \
            /project/astroneosc/Data/tmp/node-70e26f65-197b-49f9-90aa-52b42e8d7822-4b184c8e-e9fd-4784-92c4-5ace3fd7ef2c/tmp074diqpq/31cdf995-536c-4d07-9b48-c72e6df42315/tmpj09nien5.tmp:/var/lib/cwl/stga6073e5c-0ba5-472f-9790-6480440e0258/L755125_SB222_uv.MS/table.f0:ro \
            --bind \
            /project/astroneosc/Data/tmp/node-70e26f65-197b-49f9-90aa-52b42e8d7822-4b184c8e-e9fd-4784-92c4-5ace3fd7ef2c/tmp074diqpq/31cdf995-536c-4d07-9b48-c72e6df42315/tmpnmj0wmwj.tmp:/var/lib/cwl/stga6073e5c-0ba5-472f-9790-6480440e0258/L755125_SB222_uv.MS/QUALITY_FREQUENCY_STATISTIC/table.info:ro \
            --bind \
            /project/astroneosc/Data/tmp/node-70e26f65-197b-49f9-90aa-52b42e8d7822-4b184c8e-e9fd-4784-92c4-5ace3fd7ef2c/tmp074diqpq/31cdf995-536c-4d07-9b48-c72e6df42315/tmp3xtw2kh7.tmp
[...]
            --pwd \
            /vWWYEQ \
            /project/astroneosc/Software/prefactor3.simg \
            python3 \
            /usr/local/bin/check_Ateam_separation.py \
            /var/lib/cwl/stg7dbf09a8-5fa9-48ed-b4c2-fe2eb29f266a/L755125_SB000_uv.MS \
            /var/lib/cwl/stgf721eb4f-99dd-4b87-8fe6-9e250a7317ce/L755125_SB002_uv.MS \
            /var/lib/cwl/stg5b01d833-66e4-4dcd-9877-d33c7b7cd5b9/L755125_SB005_uv.MS \
            /var/lib/cwl/stg55dc76f5-ce12-462c-bffd-1f6d2e4d66bb/L755125_SB004_uv.MS \
            /var/lib/cwl/stgf8d851aa-9737-4120-af95-241e53ef984b/L755125_SB006_uv.MS \
            /var/lib/cwl/stg2d1ebe28-a26f-4442-a4f7-e9fb177653fe/L755125_SB007_uv.MS \
            /var/lib/cwl/stgf4f9f1b2-855c-433f-87fc-f9b57e69f060/L755125_SB013_uv.MS \
            /var/lib/cwl/stgfbef4e7b-b90e-49eb-bb24-bc9074dec3ef/L755125_SB010_uv.MS \
[...]
 --min_separation \
            30 \
            --outputimage \
            Ateam_separation.png > /project/astroneosc/Data/tmp/node-70e26f65-197b-49f9-90aa-52b42e8d7822-4b184c8e-e9fd-4784-92c4-5ace3fd7ef2c/tmp074diqpq/31cdf995-536c-4d07-9b48-c72e6df42315/tu3w97mbq/tmp-outcd3vx2cf/Ateam_separation.log
        [2020-12-02T11:21:43+0100] [MainThread] [E] [cwltool] Exception while running job
        Traceback (most recent call last):
          File "/home/astroneosc-mmancini/.local/lib/python3.6/site-packages/cwltool/job.py", line 394, in _execute
            default_stderr=runtimeContext.default_stderr,
          File "/home/astroneosc-mmancini/.local/lib/python3.6/site-packages/cwltool/job.py", line 955, in _job_popen
            universal_newlines=True,
          File "/usr/lib64/python3.6/subprocess.py", line 729, in __init__
            restore_signals, start_new_session)
          File "/usr/lib64/python3.6/subprocess.py", line 1364, in _execute_child
            raise child_exception_type(errno_num, err_msg, err_filename)
        OSError: [Errno 7] Argument list too long: 'singularity'
        [2020-12-02T11:21:43+0100] [MainThread] [W] [cwltool] [job check_ateam_separation] completed permanentFail
        [2020-12-02T11:21:45+0100] [MainThread] [W] [toil.fileStores.abstractFileStore] LOG-TO-MASTER: Job used more disk than requested. Consider modifying the user script to avoid the chance of failure due to incorrectly requested resources. Job files/for-job/kind-CWLWorkflow/instance-rcfqyxlv/cleanup/file-5wk8511s/stream used 2725.25% (81.8 GB [87786401792B] used, 3.0 GB [3221225472B] requested) at the end of its run.
        Traceback (most recent call last):
          File "/home/astroneosc-mmancini/.local/lib/python3.6/site-packages/toil/worker.py", line 368, in workerScript
            job._runner(jobGraph=jobGraph, jobStore=jobStore, fileStore=fileStore, defer=defer)
          File "/home/astroneosc-mmancini/.local/lib/python3.6/site-packages/toil/job.py", line 1424, in _runner
            returnValues = self._run(jobGraph, fileStore)
          File "/home/astroneosc-mmancini/.local/lib/python3.6/site-packages/toil/job.py", line 1361, in _run
            return self.run(fileStore)
          File "/home/astroneosc-mmancini/.local/lib/python3.6/site-packages/toil/cwl/cwltoil.py", line 988, in run
            raise cwltool.errors.WorkflowException(status)
        cwltool.errors.WorkflowException: permanentFail
        [2020-12-02T11:21:45+0100] [MainThread] [E] [toil.worker] Exiting the worker because of a failed job on host wn-db-02.novalocal

Issue is synchronized with this Jira Story. Issue Number: TOIL-738

mr-c commented 3 years ago

Some way of invoking singularity with an arbitrary number of binds

I think the SINGULARITY_BIND option recommended by @matmanc should work. Here's a first attempt at that: https://github.com/common-workflow-language/cwltool/pull/1386
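[Editor's note: for reference, a minimal sketch of that idea, assuming a hypothetical list of (host, container, mode) triples. SINGULARITY_BIND accepts a comma-separated bind specification, so the mounts move off the command line and into the environment:

import os
import subprocess

# Hypothetical bind list; in cwltool these would come from the staged files.
binds = [
    ('/data/a.txt', '/var/lib/cwl/stg1/a.txt', 'ro'),
    ('/data/b.txt', '/var/lib/cwl/stg2/b.txt', 'ro'),
]

env = os.environ.copy()
# Singularity reads bind mounts from this variable, comma-separated,
# instead of requiring one --bind argument per mount.
env['SINGULARITY_BIND'] = ','.join(f'{s}:{d}:{m}' for s, d, m in binds)

subprocess.check_call(['singularity', 'exec', 'image.simg', 'true'], env=env)
]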

mr-c commented 3 years ago

@DailyDreaming it was pointed out by @tetron that my approach won't work; it will run into the same E2BIG: https://github.com/common-workflow-language/cwltool/pull/1386#issuecomment-739333597

A better solution overall would be to create a hardlink tree so that only the base directory needs to be mounted into the container.
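[Editor's note: roughly, that could look like the following sketch; the helper and the locations list are illustrative, not cwltool's actual API:

import os
import tempfile

def build_hardlink_tree(locations):
    """Mirror each (host_path, container_path) file pair under one staging
    directory so a single --bind of the tree's base can replace one bind
    mount per file. `locations` is a hypothetical list of pairs."""
    staging = tempfile.mkdtemp()
    for host_path, container_path in locations:
        # Recreate the container-side layout inside the staging directory.
        mirrored = os.path.join(staging, container_path.lstrip('/'))
        os.makedirs(os.path.dirname(mirrored), exist_ok=True)
        os.link(host_path, mirrored)  # hardlink, so no data is copied
    return staging

# The container then needs only one mount for the whole tree, e.g.:
#   --bind {staging}/var/lib/cwl:/var/lib/cwl:ro
]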

DailyDreaming commented 3 years ago

@mr-c @tetron I'm testing on a scaled-down version of @matmanc's example test https://cwl.discourse.group/t/too-many-arguments-on-the-command-line/248/20 (changing "for i in {1..2024}" to "for i in {1..20}" in create_file.cwl).

cwltool seems to run this successfully in about a minute, while toil takes 24 minutes to fail.

Adding batching by directory in Toil may help, and makes sense to me. I brought this up with @adamnovak, and he said that toil-vg already batches directories for Toil and sent the following link: https://github.com/vgteam/toil-vg/blob/295ea704cf64e8673a21a04fcf063ce0ee08d29f/src/toil_vg/iostore.py#L79

I think we should try to skip the compression, but implementing directory batching may help solve both of the following:

  1. Speeding up toil's import of overly populous directories.
  2. Submitting a less verbose set of bind-mount arguments (a grouping sketch follows this list).
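[Editor's note: for illustration, a rough sketch of batching the bind mounts by directory; `locations` and its triple layout are hypothetical stand-ins, not Toil's actual data structures:

import os

def batch_binds_by_directory(locations):
    """Collapse per-file bind mounts into per-directory ones.

    `locations` is a hypothetical list of (host_path, container_path, mode)
    triples; all files sharing both a host parent directory and a container
    parent directory collapse into a single bind argument.
    """
    seen = {}  # a dict dedupes while preserving insertion order
    for host_path, container_path, mode in locations:
        key = (os.path.dirname(host_path), os.path.dirname(container_path), mode)
        seen[key] = None
    return [f'{h}:{c}:{m}' for h, c, m in seen]
]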

I think that on both the Toil and cwltool sides, we should attempt to un-restrict the 8 MB stack limit when allowed to do so, and thus the derived quarter-size limit (2 MB) on CLI commands, especially since @tetron mentioned his research led him to believe that the env vars share the same memory space: https://github.com/common-workflow-language/cwltool/pull/1386#issuecomment-739333597
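[Editor's note: a minimal sketch of inspecting and raising that limit from Python, assuming Linux semantics, where argv plus the environment may use up to a quarter of the stack soft limit:

import resource

# Linux charges argv + environ against a quarter of RLIMIT_STACK, so the
# default 8 MiB stack leaves roughly 2 MiB for the command line.
soft, hard = resource.getrlimit(resource.RLIMIT_STACK)
if soft != resource.RLIM_INFINITY:
    print('approximate argument budget:', soft // 4, 'bytes')

# Raise the soft limit toward the hard limit where permitted; children
# spawned afterwards (e.g. singularity) inherit the larger budget.
if hard == resource.RLIM_INFINITY or hard > soft:
    resource.setrlimit(resource.RLIMIT_STACK, (hard, hard))
]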

tetron commented 3 years ago

Tarring it up and then untarring it later amounts to the same amount of I/O as copying all the files out of the file store to reconstruct the directory, so I don't think using a tar file is a good general solution. Copying into a temporary directory tree is easy; it could use hard links or symlinks if we want to get a bit more clever.
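[Editor's note: a sketch of the "bit more clever" variant: prefer a hardlink and fall back to a real copy when the source sits on a different filesystem, since hardlinks cannot cross devices. The helper name is hypothetical:

import errno
import os
import shutil

def place_file(src, dst):
    """Hardlink src to dst when possible; copy when the two paths live on
    different filesystems, in which case os.link raises EXDEV."""
    try:
        os.link(src, dst)
    except OSError as e:
        if e.errno != errno.EXDEV:
            raise
        shutil.copy2(src, dst)
]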


DailyDreaming commented 3 years ago

@tetron Hmmm... I'll defer to your intuition, and I agree that if it were simply writing the tar vs. writing the recursive directory, it would be roughly the same I/O. I was just thinking that in Toil/Python we size each file and convert it to a FileID object before writing each file to the jobstore individually, and I was hoping that dropping that overhead (especially sizing the individual files) might improve things.

If I understand correctly, the temporary directory tree would only need to be implemented on the cwltool side? I assume this would look something like (very roughly):

import os
import tempfile

associations = dict()
associated_tmp_dirs = dict()

# `locations`, `already_existing_bind_mount_args`, `add_bind_mount`, and the
# final hardlinking helper are placeholders for this sketch.
for src, dst, read_write in locations:
    base_dir = os.path.dirname(src)

    # make a unique tmpdir for each basedir being mounted
    if base_dir not in associated_tmp_dirs:
        associated_tmp_dirs[base_dir] = tempfile.mkdtemp()
    temp_dir = associated_tmp_dirs[base_dir]

    associations[src] = {'src_dir': temp_dir, 'dst': dst}

    # only emit one bind mount per basedir
    bind_arg = f'{base_dir}:{temp_dir}:{read_write}'
    if bind_arg not in already_existing_bind_mount_args:
        add_bind_mount(bind_arg)

run_hard_link_from_tmp_dir_to_real_locations_inside_of_container(associations)

I'll try to open a PR to this effect.

DailyDreaming commented 3 years ago

@tetron Will try to push the PR sometime tomorrow. Right now I'm attempting to group files with a common basedir together, create a tempdir, hardlink the files into the tempdir, and then bind mount a minimal set of tempdirs (plus the original dirs, for inputs that were directories rather than individual files) to the directories where Singularity originally wanted to find the files.

ionox0 commented 3 years ago

We have hit this issue as well, and I was wondering whether it might be easier to bind the whole jobstore and workdir folders instead of binding each file inside them individually.
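[Editor's note: in command-line terms, that suggestion amounts to something like the following hypothetical invocation, with two binds regardless of how many files the job stages; all paths and the image name are placeholders:

import subprocess

# Mounting the directories at identical paths inside the container keeps
# every staged file reachable with just two bind arguments.
cmd = [
    'singularity', 'exec',
    '--bind', '/path/to/jobstore:/path/to/jobstore:rw',
    '--bind', '/path/to/workdir:/path/to/workdir:rw',
    'image.simg', 'python3', 'script.py',
]
subprocess.check_call(cmd)
]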