apptainer / apptainer

Apptainer: Application containers for Linux
https://apptainer.org

ValueError: NOT_FOUND: Could not open /input.bam [E::hts_open_format] Failed to open file "/input.bam" : No such file or directory #1109

Open Zero-Sun opened 1 year ago

Zero-Sun commented 1 year ago

Hi, I run into the following problem when I run DeepVariant with Singularity. When I use the absolute path on the host, the error below is reported. Mounting the path into the container does not help either. Please help me!

ValueError: NOT_FOUND: Could not open Absolute_path/input.bam [E::hts_open_format] Failed to open file "Absolute_path/input.bam" : No such file or directory

First, I verified that my host directory is successfully mounted at /mnt in the container, so that the files are visible as /mnt/QJref.fa and /mnt/input.bam:

# /path1 is an absolute path
WORK_DIR=/path1/4_Test/qingjiang/dpv 
singularity exec -B $TMPDIR:$TMPDIR,"${WORK_DIR}":/mnt /dellfsqd2/ST_OCEAN/USER/sunzhilong/1_Software/dpv/deepvariant_1.4.0.sif bash
Singularity> cd / && ls
bin  boot  dellfsqd2  dev  environment  etc  home  lib  lib32  lib64  libx32  media  mnt  opt  proc  root  run  sbin  singularity  srv  sys  tmp  usr  var
Singularity> cd mnt && ls
QJref.fa      input.bam      QJref.fa.fai    input.bam.bai        tmp_dir

Then I ran the following script.

cat test0215.sh
WORK_DIR=/path1/4_Test/qingjiang/dpv
export TMPDIR="$PWD/tmp_dir"
singularity run -B$TMPDIR:$TMPDIR,"${WORK_DIR}":/mnt \
/path1/1_Software/dpv/deepvariant_1.4.0.sif  /opt/deepvariant/bin/run_deepvariant \
  --num_shards=3 \
  --model_type=PACBIO \
  --ref=/mnt/QJref.fa \
  --reads=/mnt/input.bam \
  --output_vcf=/mnt/output.vcf.gz \
  --output_gvcf=/mnt/output.g.vcf.gz \
  --intermediate_results_dir /mnt/dpv

The core error is:

ValueError: NOT_FOUND: Could not open /mnt/input.bam [E::hts_open_format] Failed to open file "/mnt/input.bam" : No such file or directory

However, I have verified that /mnt/input.bam exists. The complete error output is shown below. I sincerely look forward to your help. Thank you!

sh test0215.sh
I0216 00:56:07.446549 140582811191104 run_deepvariant.py:342] Re-using the directory for intermediate results in /mnt/dpv

***** Intermediate results will be written to /mnt/dpv in docker. ****

***** Running the command:*****
time seq 0 2 | parallel -q --halt 2 --line-buffer /opt/deepvariant/bin/make_examples --mode calling --ref "/mnt/QJref.fa" --reads "/mnt/input.bam" --examples "/mnt/dpv/make_examples.tfrecord@3.gz" --add_hp_channel --alt_aligned_pileup "diff_channels" --gvcf "/mnt/dpv/gvcf.tfrecord@3.gz" --max_reads_per_partition "600" --min_mapping_quality "1" --parse_sam_aux_fields --partition_size "25000" --phase_reads --pileup_image_width "199" --norealign_reads --sort_by_haplotypes --track_ref_reads --vsc_min_fraction_indels "0.12" --task {}

perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
        LANGUAGE = (unset),
        LC_ALL = (unset),
        LC_CTYPE = "C.UTF-8",
        LANG = "en_US.UTF-8"
    are supported and installed on your system.
perl: warning: Falling back to the standard locale ("C").
perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
        LANGUAGE = (unset),
        LC_ALL = (unset),
        LC_CTYPE = "C.UTF-8",
        LANG = "en_US.UTF-8"
    are supported and installed on your system.
perl: warning: Falling back to the standard locale ("C").
[E::hts_open_format] Failed to open file "/mnt/input.bam" : No such file or directory
Traceback (most recent call last):
  File "/path1/4_Test/qingjiang/dpv/tmp_dir/Bazel.runfiles_p4oo9k4b/runfiles/com_google_deepvariant/deepvariant/make_examples.py", line 180, in <module>
    app.run(main)
  File "/path1/4_Test/qingjiang/dpv/tmp_dir/Bazel.runfiles_p4oo9k4b/runfiles/absl_py/absl/app.py", line 300, in run
    _run_main(main, args)
  File "/path1/4_Test/qingjiang/dpv/tmp_dir/Bazel.runfiles_p4oo9k4b/runfiles/absl_py/absl/app.py", line 251, in _run_main
    sys.exit(main(argv))
  File "/path1/4_Test/qingjiang/dpv/tmp_dir/Bazel.runfiles_p4oo9k4b/runfiles/com_google_deepvariant/deepvariant/make_examples.py", line 166, in main
    options = default_options(add_flags=True, flags_obj=FLAGS)
  File "/path1/4_Test/qingjiang/dpv/tmp_dir/Bazel.runfiles_p4oo9k4b/runfiles/com_google_deepvariant/deepvariant/make_examples.py", line 128, in default_options
    samples_in_order, sample_role_to_train = one_sample_from_flags(
  File "/path1/4_Test/qingjiang/dpv/tmp_dir/Bazel.runfiles_p4oo9k4b/runfiles/com_google_deepvariant/deepvariant/make_examples.py", line 85, in one_sample_from_flags
    sample_name = make_examples_core.assign_sample_name(
  File "/path1/4_Test/qingjiang/dpv/tmp_dir/Bazel.runfiles_p4oo9k4b/runfiles/com_google_deepvariant/deepvariant/make_examples_core.py", line 134, in assign_sample_name
    with sam.SamReader(reads_filenames.split(',')[0]) as sam_reader:
  File "/path1/4_Test/qingjiang/dpv/tmp_dir/Bazel.runfiles_p4oo9k4b/runfiles/com_google_deepvariant/third_party/nucleus/io/genomics_reader.py", line 221, in __init__
    self._reader = self._native_reader(input_path, **kwargs)
  File "/path1/4_Test/qingjiang/dpv/tmp_dir/Bazel.runfiles_p4oo9k4b/runfiles/com_google_deepvariant/third_party/nucleus/io/sam.py", line 260, in _native_reader
    return NativeSamReader(input_path, **kwargs)
  File "/path1/4_Test/qingjiang/dpv/tmp_dir/Bazel.runfiles_p4oo9k4b/runfiles/com_google_deepvariant/third_party/nucleus/io/sam.py", line 227, in __init__
    self._reader = sam_reader.SamReader.from_file(
ValueError: NOT_FOUND: Could not open /mnt/input.bam
[E::hts_open_format] Failed to open file "/mnt/input.bam" : No such file or directory
[E::hts_open_format] Failed to open file "/mnt/input.bam" : No such file or directory
Traceback (most recent call last):
  File "/path1/4_Test/qingjiang/dpv/tmp_dir/Bazel.runfiles_q7goget_/runfiles/com_google_deepvariant/deepvariant/make_examples.py", line 180, in <module>
    app.run(main)
  File "/path1/4_Test/qingjiang/dpv/tmp_dir/Bazel.runfiles_q7goget_/runfiles/absl_py/absl/app.py", line 300, in run
    _run_main(main, args)
  File "/path1/4_Test/qingjiang/dpv/tmp_dir/Bazel.runfiles_q7goget_/runfiles/absl_py/absl/app.py", line 251, in _run_main
    sys.exit(main(argv))
  File "/path1/4_Test/qingjiang/dpv/tmp_dir/Bazel.runfiles_q7goget_/runfiles/com_google_deepvariant/deepvariant/make_examples.py", line 166, in main
    options = default_options(add_flags=True, flags_obj=FLAGS)
  File "/path1/4_Test/qingjiang/dpv/tmp_dir/Bazel.runfiles_q7goget_/runfiles/com_google_deepvariant/deepvariant/make_examples.py", line 128, in default_options
    samples_in_order, sample_role_to_train = one_sample_from_flags(
  File "/path1/4_Test/qingjiang/dpv/tmp_dir/Bazel.runfiles_q7goget_/runfiles/com_google_deepvariant/deepvariant/make_examples.py", line 85, in one_sample_from_flags
    sample_name = make_examples_core.assign_sample_name(
  File "/path1/4_Test/qingjiang/dpv/tmp_dir/Bazel.runfiles_q7goget_/runfiles/com_google_deepvariant/deepvariant/make_examples_core.py", line 134, in assign_sample_name
    with sam.SamReader(reads_filenames.split(',')[0]) as sam_reader:
  File "/path1/4_Test/qingjiang/dpv/tmp_dir/Bazel.runfiles_q7goget_/runfiles/com_google_deepvariant/third_party/nucleus/io/genomics_reader.py", line 221, in __init__
    self._reader = self._native_reader(input_path, **kwargs)
  File "/path1/4_Test/qingjiang/dpv/tmp_dir/Bazel.runfiles_q7goget_/runfiles/com_google_deepvariant/third_party/nucleus/io/sam.py", line 260, in _native_reader
    return NativeSamReader(input_path, **kwargs)
  File "/path1/4_Test/qingjiang/dpv/tmp_dir/Bazel.runfiles_q7goget_/runfiles/com_google_deepvariant/third_party/nucleus/io/sam.py", line 227, in __init__
    self._reader = sam_reader.SamReader.from_file(
ValueError: NOT_FOUND: Could not open /mnt/input.bam
Traceback (most recent call last):
  File "/path1/4_Test/qingjiang/dpv/tmp_dir/Bazel.runfiles_gc9nyr7p/runfiles/com_google_deepvariant/deepvariant/make_examples.py", line 180, in <module>
    app.run(main)
  File "/path1/4_Test/qingjiang/dpv/tmp_dir/Bazel.runfiles_gc9nyr7p/runfiles/absl_py/absl/app.py", line 300, in run
    _run_main(main, args)
  File "/path1/4_Test/qingjiang/dpv/tmp_dir/Bazel.runfiles_gc9nyr7p/runfiles/absl_py/absl/app.py", line 251, in _run_main
    sys.exit(main(argv))
  File "/path1/4_Test/qingjiang/dpv/tmp_dir/Bazel.runfiles_gc9nyr7p/runfiles/com_google_deepvariant/deepvariant/make_examples.py", line 166, in main
    options = default_options(add_flags=True, flags_obj=FLAGS)
  File "/path1/4_Test/qingjiang/dpv/tmp_dir/Bazel.runfiles_gc9nyr7p/runfiles/com_google_deepvariant/deepvariant/make_examples.py", line 128, in default_options
    samples_in_order, sample_role_to_train = one_sample_from_flags(
  File "/path1/4_Test/qingjiang/dpv/tmp_dir/Bazel.runfiles_gc9nyr7p/runfiles/com_google_deepvariant/deepvariant/make_examples.py", line 85, in one_sample_from_flags
    sample_name = make_examples_core.assign_sample_name(
  File "/path1/4_Test/qingjiang/dpv/tmp_dir/Bazel.runfiles_gc9nyr7p/runfiles/com_google_deepvariant/deepvariant/make_examples_core.py", line 134, in assign_sample_name
    with sam.SamReader(reads_filenames.split(',')[0]) as sam_reader:
  File "/path1/4_Test/qingjiang/dpv/tmp_dir/Bazel.runfiles_gc9nyr7p/runfiles/com_google_deepvariant/third_party/nucleus/io/genomics_reader.py", line 221, in __init__
    self._reader = self._native_reader(input_path, **kwargs)
  File "/path1/4_Test/qingjiang/dpv/tmp_dir/Bazel.runfiles_gc9nyr7p/runfiles/com_google_deepvariant/third_party/nucleus/io/sam.py", line 260, in _native_reader
    return NativeSamReader(input_path, **kwargs)
  File "/path1/4_Test/qingjiang/dpv/tmp_dir/Bazel.runfiles_gc9nyr7p/runfiles/com_google_deepvariant/third_party/nucleus/io/sam.py", line 227, in __init__
    self._reader = sam_reader.SamReader.from_file(
ValueError: NOT_FOUND: Could not open /mnt/input.bam
parallel: This job failed:
/opt/deepvariant/bin/make_examples --mode calling --ref /mnt/QJref.fa --reads /mnt/input.bam --examples /mnt/dpv/make_examples.tfrecord@3.gz --add_hp_channel --alt_aligned_pileup diff_channels --gvcf /mnt/dpv/gvcf.tfrecord@3.gz --max_reads_per_partition 600 --min_mapping_quality 1 --parse_sam_aux_fields --partition_size 25000 --phase_reads --pileup_image_width 199 --norealign_reads --sort_by_haplotypes --track_ref_reads --vsc_min_fraction_indels 0.12 --task 1

real    0m32.021s
user    0m5.396s
sys     0m4.055s

GodloveD commented 1 year ago

The only thing I can see at first glance is that you are missing a space between the -B and its argument.

singularity run -B$TMPDIR:$TMPDIR,"${WORK_DIR}":/mnt \

Could that be the cause of your troubles?

Zero-Sun commented 1 year ago

Sorry, the missing space was a paste error. In fact, I have used the correct command many times, but this error is still reported.

GodloveD commented 1 year ago

In situations like this, where the error makes no sense, I find that it is often useful to try doing the same thing in several different ways. Sometimes that provides insight. So here, I would suggest three things to try.

First change the command /opt/deepvariant/bin/run_deepvariant to ls -l /mnt and see what that turns up.

Second, try removing the -B option argument pair from the script and instead export the APPTAINER_BINDPATH env var to set up the bind mounts.

Third, change your run command to an exec command (since it seems like you really want to be using exec anyway here.)

If you can provide the def file you used to create the container and also tell us what version of singularity/apptainer you are running, that might be useful to us as we try to figure out how to debug. Thanks!
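
The three checks suggested above could be scripted roughly as follows. This is only a sketch: `run`, `diagnose`, and `DRY_RUN` are hypothetical names introduced here, the image path and host directory are passed in as arguments, and `--help` is used as a harmless stand-in for the real DeepVariant arguments. `SINGULARITY_BIND` is the variable name used later in this thread with Singularity 3.5.3.

```shell
# Hypothetical helper: print a command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# diagnose IMAGE HOST_DIR: try the same bind mount in three different ways.
diagnose() {
    image=$1
    host_dir=$2

    # 1. Replace run_deepvariant with `ls -l /mnt` to see what the container sees.
    run singularity exec -B "$host_dir:/mnt" "$image" ls -l /mnt

    # 2. Drop -B and set up the bind mount through the environment instead.
    export APPTAINER_BINDPATH="$host_dir:/mnt"
    export SINGULARITY_BIND="$host_dir:/mnt"
    run singularity exec "$image" ls -l /mnt

    # 3. Use exec instead of run, so the tool is called directly rather than
    #    through the image's runscript.
    run singularity exec "$image" /opt/deepvariant/bin/run_deepvariant --help
}
```

Calling `DRY_RUN=1; diagnose deepvariant_1.4.0.sif "$WORK_DIR"` prints the three commands without running them, which makes it easy to review the exact arguments before trying them on the cluster.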

Zero-Sun commented 1 year ago

Thanks again for your guidance, but it didn't work. I look forward to your further help. I installed DeepVariant like this:

singularity pull docker://google/deepvariant:"1.4.0"

My environment configuration is as follows.

singularity --version
singularity version 3.5.3-1.1.el7

export TMPDIR="$PWD/tmp_dir"
export SINGULARITY_BIND="/path1/4_Test/qingjiang/dpv:/mnt"

Listing /mnt shows the correct directory contents.

singularity exec /path2/1_Software/dpv/deepvariant_1.4.0.sif ls -l /mnt
QJref.fa      input.bam      QJref.fa.fai    input.bam.bai        tmp_dir

When I run the command, I still get the same error as last time.

singularity exec /path2/1_Software/dpv/deepvariant_1.4.0.sif /opt/deepvariant/bin/run_deepvariant  \
--model_type=PACBIO \
--ref=/mnt/QJref.fa  \
--reads=/mnt/input.bam  \
--output_vcf=/mnt/output.vcf.gz \
--intermediate_results_dir $TMPDIR/intermediate_results_dir
[E::hts_open_format] Failed to open file "/mnt/input.bam" : No such file or directory

GodloveD commented 1 year ago

Perhaps there is a bug with deepvariant that prevents it from parsing paths properly? Or perhaps the error message is misleading and the issue is actually that the file type is incorrect or corrupted or something? You could try cd-ing to the /mnt directory and calling all commands from there. You could do so like this:

singularity exec /path2/1_Software/dpv/deepvariant_1.4.0.sif sh -c "cd /mnt ; /opt/deepvariant/bin/run_deepvariant  \
--model_type=PACBIO \
--ref=QJref.fa  \
--reads=input.bam  \
--output_vcf=output.vcf.gz \
--intermediate_results_dir $TMPDIR/intermediate_results_dir"

I will also note that you are running a version of singularity that is 3 years old (and contains known security flaws). You should really (really) update. I would suggest updating to the latest version of Apptainer and then trying your command again. Maybe it will correct the issue.
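
One way to test the "misleading error message" idea above is to inspect the file itself from inside the container. The sketch below is a hypothetical helper, not part of DeepVariant or Singularity. It rests on two assumptions that the thread does not confirm: that the input might be a symlink whose target is not visible in the container (`ls` still lists such a link by name), and that a valid BAM file, being BGZF-compressed, should start with the gzip magic bytes 1f 8b.

```shell
# Hypothetical helper: report why a path that shows up in `ls` might still
# fail to open. Prints one line and returns non-zero on any problem.
check_input() {
    f=$1
    if [ -L "$f" ] && [ ! -e "$f" ]; then
        # A dangling symlink is listed by ls but cannot be opened.
        echo "dangling symlink -> $(readlink "$f")"
        return 1
    fi
    if [ ! -e "$f" ]; then
        echo "missing"
        return 1
    fi
    if [ ! -r "$f" ]; then
        echo "not readable"
        return 1
    fi
    # Read the first two bytes as hex; BGZF/gzip data starts with 1f 8b.
    magic=$(od -An -tx1 -N2 "$f" | tr -d ' \n')
    if [ "$magic" != "1f8b" ]; then
        echo "unexpected magic bytes: $magic"
        return 1
    fi
    echo "ok"
}
```

To use it in the container, the function could be saved to a file in the bound directory (the name check_input.sh is an assumption) and sourced there, for example with `singularity exec deepvariant_1.4.0.sif sh -c '. /mnt/check_input.sh; check_input /mnt/input.bam'`.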

Zero-Sun commented 1 year ago

It doesn't work. Unfortunately, I also cannot ask the cluster administrator to update Singularity. Still, thank you so much for all your help!