ncbi / fcs

Foreign Contamination Screening caller scripts and documentation

gzip: stdin: No data available #80

Open eeaunin opened 1 month ago

eeaunin commented 1 month ago

Hello. Below is a log from an FCS-GX run that crashed with the message gzip: stdin: No data available. What happened here, and how can this problem be prevented?

=============================================================================== 
Source:      /mft-volume 
Destination: /app/db/gxdb 
Resuming failed transfer in /app/db/gxdb... 
Space check: Available:1.14TiB; Existing:0B; Incoming:464.34GiB; Delta:464.34GiB

Requires transfer: 59B all.meta.jsonl 
Copying /mft-volume/all.meta.jsonl to /app/db/gxdb/all.meta.jsonl.part... 

Requires transfer: 187B all.README.txt 
Copying /mft-volume/all.README.txt to /app/db/gxdb/all.README.txt.part... 

Requires transfer: 6.09MiB all.taxa.tsv 
Copying /mft-volume/all.taxa.tsv to /app/db/gxdb/all.taxa.tsv.part... 

Requires transfer: 7.86MiB all.blast_div.tsv.gz 
Copying /mft-volume/all.blast_div.tsv.gz to /app/db/gxdb/all.blast_div.tsv.gz.part... 

Requires transfer: 8.48MiB all.assemblies.tsv 
Copying /mft-volume/all.assemblies.tsv to /app/db/gxdb/all.assemblies.tsv.part... 

Requires transfer: 21.51MiB all.seq_info.tsv.gz 
Copying /mft-volume/all.seq_info.tsv.gz to /app/db/gxdb/all.seq_info.tsv.gz.part... 

Requires transfer: 165.14GiB all.gxs 
Copying /mft-volume/all.gxs to /app/db/gxdb/all.gxs.part... 

Requires transfer: 299.16GiB all.gxi 
Copying /mft-volume/all.gxi to /app/db/gxdb/all.gxi.part... 
Done. 
-----------------------------------------------------------------------------

tax-id    : 476027
fasta     : /sample-volume/assembly.fasta
size      : 2495.09 MiB
split-fa  : True
####### Starting process ['zcat', '-f', '/app/db/gxdb/gx_mapper_2955715/all.blast_div.tsv.gz']
####### Starting process ['grep', '-E', '^476027\t']
####### Cleaning up process ['zcat', '-f', '/app/db/gxdb/gx_mapper_2955715/all.blast_div.tsv.gz']
####### Cleaning up process ['grep', '-E', '^476027\t']
BLAST-div : sponges
gx-div    : anml:basal metazoans
w/same-tax: True
bin-dir   : /app/bin
gx-db     : /app/db/gxdb/gx_mapper_2955715/all.gxi
gx-ver    : Nov 27 2023 11:05:36; git:v0.5.0+branch--HEAD
output    : /output-volume//assembly.476027.taxonomy.rpt

-----------------------------------------------------------------------------

####### args: Namespace(fasta='/sample-volume/assembly.fasta', tax_id=476027, species=None, split_fasta=True, div='anml:basal metazoans', gx_db='/app/db/gxdb/gx_mapper_2955715/all.gxi', mask_transposons=None, bin_dir='/app/bin', allow_same_species=True, ignore_same_kingdom=False, out_basename='/output-volume//assembly.476027', out_dir='/output-volume/', action_report=True, save_hits=False, generate_logfile=False, debug=True, phone_home_label=None, gc_acc=None, gc_genomes_root_dir=None, production_build_name=None, gzip_c='gzip -c', out_taxonomy_rpt='/output-volume//assembly.476027.taxonomy.rpt') 

####### Starting process ['cat', '/sample-volume/assembly.fasta']
####### Starting process ['gzip', '-cdf']
####### Starting process ['/app/bin/gx', 'split-fasta']
####### Starting process ['pv', '-Wbratpe', '--interval=0.5', '--size=2616292917']
####### Starting process ['cat', '/sample-volume/assembly.fasta']
####### Starting process ['gzip', '-cdf']
####### Starting process ['/busybox/time', '-v', 'nice', '-n19', '/app/bin/gx', 'align', '--gx-db=/app/db/gxdb/gx_mapper_2955715/all.gxi', '--repeats-basis-fa=/dev/fd/5']
####### Starting process ['/app/bin/gx', 'taxify', '--gx-db=/app/db/gxdb/gx_mapper_2955715/all.gxi', '--output=/output-volume//assembly.476027.taxonomy.rpt.tmp', '--asserted-div=anml:basal metazoans', '--db-exclude-locs=/app/bin/db_exclude.locs.tsv']
Using GX_PREFETCH=0
Collecting masking statistics...
Collected masking stats:  2.58323 Gbp; 30.8123s; 83.8376 Mbp/s. Baseline: 3.34072

gzip: stdin: No data available
####### Cleaning up process ['cat', '/sample-volume/assembly.fasta']
Error: Process failed with retcode -13: ['cat', '/sample-volume/assembly.fasta'])
####### Cleaning up process ['gzip', '-cdf']
Error: Process failed with retcode 1: ['gzip', '-cdf'])
####### Cleaning up process ['/app/bin/gx', 'split-fasta']
####### Cleaning up process ['pv', '-Wbratpe', '--interval=0.5', '--size=2616292917']
####### Cleaning up process ['/busybox/time', '-v', 'nice', '-n19', '/app/bin/gx', 'align', '--gx-db=/app/db/gxdb/gx_mapper_2955715/all.gxi', '--repeats-basis-fa=/dev/fd/5']
####### Cleaning up process ['/app/bin/gx', 'taxify', '--gx-db=/app/db/gxdb/gx_mapper_2955715/all.gxi', '--output=/output-volume//assembly.476027.taxonomy.rpt.tmp', '--asserted-div=anml:basal metazoans', '--db-exclude-locs=/app/bin/db_exclude.locs.tsv']
####### Cleaning up process ['cat', '/sample-volume/assembly.fasta']
####### Cleaning up process ['gzip', '-cdf']

-----------------------------------------------------------------------------

Traceback (most recent call last):
  File "/tmp/Bazel.runfiles_rs4oazym/runfiles/cgr_fcs/apps/fcs_genome/public/run_gx/run_gx.py", line 1037, in <module>
    main()
  File "/tmp/Bazel.runfiles_rs4oazym/runfiles/cgr_fcs/apps/fcs_genome/public/run_gx/run_gx.py", line 1012, in main
    run_gx_pipeline(args)
  File "/tmp/Bazel.runfiles_rs4oazym/runfiles/cgr_fcs/apps/fcs_genome/public/run_gx/run_gx.py", line 678, in run_gx_pipeline
    run(p_zcat_fasta, p_save_hits, p_main)
  File "/tmp/Bazel.runfiles_rs4oazym/runfiles/cgr_fcs/apps/fcs_genome/public/run_gx/run_gx.py", line 278, in __exit__
    self.wait()
  File "/tmp/Bazel.runfiles_rs4oazym/runfiles/cgr_fcs/apps/fcs_genome/public/run_gx/run_gx.py", line 268, in wait
    assert num_errors == 0, "Had errors."

These are the software versions used for this run:

- OS: Ubuntu 22.04.4 LTS
- Singularity: v3.11.4
- FCS image: 0.5.0
- Python: 3.8.12
- Platform: LSF
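One possible reading of the two error lines in the log, offered as a sketch rather than a confirmed diagnosis: Python's subprocess module reports a child killed by signal N as return code -N, so retcode -13 for cat would mean it was killed by SIGPIPE, which is what happens when the reader of its pipe (here gzip) has already exited. On Linux, "No data available" is the standard message for errno ENODATA, which suggests that a read on gzip's stdin failed with that error. The helper name describe_retcode is made up for this illustration.

```python
import errno
import os
import signal


def describe_retcode(retcode: int) -> str:
    """Describe a subprocess-style return code (hypothetical helper).

    Python's subprocess module reports a child killed by signal N as -N.
    """
    if retcode < 0:
        return f"killed by {signal.Signals(-retcode).name}"
    return f"exited with status {retcode}"


print(describe_retcode(-13))  # the 'cat' process in the log
print(describe_retcode(1))    # the 'gzip -cdf' process in the log

# The text gzip printed is the message that Linux associates with ENODATA.
print(errno.errorcode[errno.ENODATA], "->", os.strerror(errno.ENODATA))
```

If this reading is right, cat is a casualty rather than the cause: gzip failed first, and cat was then killed while writing into the closed pipe.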

LaurenHuet commented 1 month ago

Following this, as I'm seeing the same error.

jt8-sanger commented 1 month ago

Expanding slightly on @eeaunin's comment above: this is currently a major problem for us at the Sanger. We started seeing it after moving to a new farm cluster. It didn't seem to occur at all on the old cluster, but on the new one it happens perhaps half the time. Rerunning the same jobs often works the second time, so it appears to be a randomly occurring, intermittent fault, and there doesn't seem to be anything unusual about the input FASTA files. It happens for small assemblies and large ones; sometimes FCS-GX runs for a while before the problem strikes.
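Since rerunning often succeeds, a stopgap (not a fix) could be a wrapper that retries only when this specific message shows up in stderr. The following is a minimal sketch under that assumption; run_with_retry, TRANSIENT_MARKER, and the injectable runner parameter are hypothetical names, and the command shown is a stand-in rather than a real FCS-GX invocation.

```python
import subprocess
import sys
import time

TRANSIENT_MARKER = "gzip: stdin: No data available"


def run_with_retry(cmd, max_attempts=3, delay_s=0.0, runner=subprocess.run):
    """Run cmd, retrying only when it fails with the transient marker in stderr.

    Returns (attempts_used, result). Any other failure is returned immediately
    so that genuine errors are not hidden by the retry loop.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        result = runner(cmd, capture_output=True, text=True)
        transient = result.returncode != 0 and TRANSIENT_MARKER in result.stderr
        if not transient or attempt == max_attempts:
            return attempt, result
        time.sleep(delay_s)


# Stand-in command; replace with the actual container invocation.
attempts, result = run_with_retry([sys.executable, "-c", "print('ok')"])
print(attempts, result.returncode)
```

Capturing stderr means the log is no longer streamed live; whether that trade-off is acceptable depends on how the jobs are monitored.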

etvedte commented 1 month ago

We are currently attempting to reproduce this issue locally, and would appreciate any information that you think could be pertinent.

@eeaunin do you have access to Docker, or just Singularity? If you do have both, can you test using Docker? @LaurenHuet @jt8-sanger: can you please provide more information about your run environment (OS, FCS image type, image version, job scheduler)?

eeaunin commented 1 month ago

Hello, thanks for replying. I am using FCS-GX with uncompressed (not gzipped) assembly FASTA files. The problem appears to be intermittent: a retry of a crashed run with the same input files and the same settings can, seemingly at random, succeed or fail again. I'll investigate this further with multiple replicates. I haven't tried copying the genome FASTA file to /tmp. So far I have only run FCS-GX with Singularity. I don't know whether I have access to Docker on the LSF cluster; I'll find out.
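For context on why gzip appears in the log even though the FASTA is uncompressed: the pipeline pipes the input through gzip -cdf, and with GNU gzip the -f flag combined with -c copies input that is not in a recognized compressed format through unchanged. The sketch below illustrates that behaviour on a tiny made-up record; it assumes GNU gzip is on PATH, and other gzip implementations may behave differently.

```python
import gzip
import subprocess

fasta = b">contig1\nACGTACGT\n"

# Plain input: with -c and -f, GNU gzip passes unrecognized data through as-is.
plain = subprocess.run(
    ["gzip", "-cdf"], input=fasta, capture_output=True, check=True
)

# Gzipped input: the same command decompresses it.
packed = subprocess.run(
    ["gzip", "-cdf"], input=gzip.compress(fasta), capture_output=True, check=True
)

print(plain.stdout == fasta, packed.stdout == fasta)
```

So the gzip error does not by itself imply that the input was treated as compressed; gzip is simply the process reading the file's bytes from the pipe.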

LaurenHuet commented 1 month ago

I am using this with uncompressed FASTA files under Singularity. I have had the same error 3 times in a row with this FASTA file. I have pulled the latest Singularity container from the NCBI git repository. I am using SLURM on an HPC system (the Pawsey supercomputer). I ran this across a batch of 60 genomes; however, only one is producing this error. It seems to run to about 99% and then fails with this message. I have checked the FASTA file for issues and found none; it is a small genome.

`Prefetching /app/db/gxdb/gxdb/all.gxi 96%...                         
Prefetching /app/db/gxdb/gxdb/all.gxi 97%...                         
Prefetching /app/db/gxdb/gxdb/all.gxi 98%...                         
Prefetching /app/db/gxdb/gxdb/all.gxi 99%...                         
Prefetched /app/db/gxdb/gxdb/all.gxi in 1254.08s; 0.256136 GB/s. The file is 93% in RAM.
Collecting masking statistics...

gzip: stdin: No data available
Collected masking stats:  9.1081e-05 Gbp; 0.359757s; 0.253167 Mbp/s. Baseline: 1.02634

gzip: stdin: No data available
Error: Process failed with retcode -13: ['cat', '/sample-volume/OG235.ilmn.230324.v129mh.fasta'])
Error: Process failed with retcode 1: ['gzip', '-cdf'])
Error: Process failed with retcode -13: ['cat', '/sample-volume/OG235.ilmn.230324.v129mh.fasta'])
Error: Process failed with retcode 1: ['gzip', '-cdf'])

-----------------------------------------------------------------------------

Traceback (most recent call last):
  File "/tmp/Bazel.runfiles_lwww7goj/runfiles/cgr_fcs/apps/fcs_genome/public/run_gx/run_gx.py", line 1037, in <module>
    main()
  File "/tmp/Bazel.runfiles_lwww7goj/runfiles/cgr_fcs/apps/fcs_genome/public/run_gx/run_gx.py", line 1012, in main
    run_gx_pipeline(args)
  File "/tmp/Bazel.runfiles_lwww7goj/runfiles/cgr_fcs/apps/fcs_genome/public/run_gx/run_gx.py", line 678, in run_gx_pipeline
    run(p_zcat_fasta, p_save_hits, p_main)
  File "/tmp/Bazel.runfiles_lwww7goj/runfiles/cgr_fcs/apps/fcs_genome/public/run_gx/run_gx.py", line 278, in __exit__
    self.wait()
  File "/tmp/Bazel.runfiles_lwww7goj/runfiles/cgr_fcs/apps/fcs_genome/public/run_gx/run_gx.py", line 268, in wait
    assert num_errors == 0, "Had errors."
AssertionError: Had errors.
Traceback (most recent call last):
  File "/software/projects/pawsey0812/singularity/fcs.py", line 445, in <module>
    sys.exit(main())
  File "/software/projects/pawsey0812/singularity/fcs.py", line 434, in main
    gx.run()
  File "/software/projects/pawsey0812/singularity/fcs.py", line 345, in run
    self.args.func(self)
  File "/software/projects/pawsey0812/singularity/fcs.py", line 323, in run_screen_mode
    self.run_gx()
  File "/software/projects/pawsey0812/singularity/fcs.py", line 241, in run_gx
    self.safe_exec(docker_args)
  File "/software/projects/pawsey0812/singularity/fcs.py", line 166, in safe_exec
    subprocess.run(args, shell=False, check=True, text=True, stdout=sys.stdout, stderr=sys.stderr)
  File "/software/projects/pawsey0812/singularity/miniconda3/lib/python3.10/subprocess.py", line 526, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['singularity', 'exec', '--bind', '/scratch/pawsey0812/lhuet/NCBI:/app/db/gxdb/', '--bind', '/scratch/pawsey0812/lhuet/NOVA_230324_AD1/OG235/assemblies/genome:/sample-volume/', '--bind', '/scratch/pawsey0812/lhuet/NOVA_230324_AD1/OG235/assemblies/genome/NCBI:/output-volume/', '/software/projects/pawsey0812/singularity/fcs-gx.sif', 'python3', '/app/bin/run_gx', '--fasta', '/sample-volume/OG235.ilmn.230324.v129mh.fasta', '--out-dir', '/output-volume/', '--gx-db', '/app/db/gxdb/gxdb', '--tax-id', '7898']' returned non-zero exit status 1.`

jt8-sanger commented 1 month ago

@etvedte: Regarding your question about my run environment: I'm working with @eeaunin on this, so the answers are the same as the ones he gives above.

etvedte commented 1 month ago

We noticed that the Apptainer (formerly Singularity) changelog mentions addressing "no data available" errors, which could be related to the issue you are observing.

https://github.com/apptainer/apptainer/blob/main/CHANGELOG.md#other-changes

Would it be possible for you to attempt to reproduce the issue using the latest version of Apptainer?

eeaunin commented 1 month ago

I have now tried multiple replicates. I used a Plasmodium chabaudi chabaudi assembly FASTA file as the input. The Singularity version that I was using was singularity-ce 3.11.4. Yesterday I did 10 runs with the same assembly FASTA file and the same settings, and all 10 completed successfully. Today I did another 10 runs with the same files and settings, and 3 out of 10 crashed with the gzip: stdin: No data available error.
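For anyone else running replicates, counting failures by hand gets tedious. A small sketch for tallying them follows, assuming each replicate writes its combined output to its own .log file in one directory; the file layout, the tally_failures name, and the demo contents are all made up for illustration.

```python
import tempfile
from pathlib import Path

MARKER = "gzip: stdin: No data available"


def tally_failures(log_dir, pattern="*.log"):
    """Return (failed, total) for the log files in log_dir matching pattern."""
    logs = sorted(Path(log_dir).glob(pattern))
    failed = sum(MARKER in log.read_text(errors="replace") for log in logs)
    return failed, len(logs)


# Demo with three made-up replicate logs, one of which contains the error.
with tempfile.TemporaryDirectory() as tmp:
    for i, body in enumerate(["Done.\n", f"\n{MARKER}\n", "Done.\n"]):
        Path(tmp, f"rep{i}.log").write_text(body)
    failed, total = tally_failures(tmp)

print(f"{failed}/{total} replicates hit the error")
```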

eeaunin commented 1 month ago

Here are my responses to the questions from a few days ago:

> Do you have access to Docker, or just Singularity? If you do have both, can you test using Docker?

I don't have proper access to Docker on the compute farm that I am using. There is a limited installation of Docker that doesn't allow writing results to disk. For production purposes I have to use Singularity.

> Is it reproducible if the genome fasta file is initially copied to /tmp on the host, and then used as input?

I have now tested running FCS-GX with and without copying the assembly FASTA file to /tmp, but for some reason the error hasn't reappeared in the past 4 days. I ran FCS-GX with the same Plasmodium chabaudi chabaudi assembly file that I mentioned before: 80 runs with the assembly FASTA copied to /tmp before the run and 80 runs without copying it. There were no crashes in either set of runs. I still have no idea what determines whether the gzip: stdin: No data available crashes happen.

> Would it be possible for you to attempt to reproduce the issue using the latest version of Apptainer?

The installation of Apptainer has been requested from the IT service desk, but they haven't installed it yet.
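In case it helps others repeat the /tmp comparison, here is a rough sketch of staging the FASTA on node-local storage before a run. It assumes the default temporary directory is node-local on the compute host; stage_locally and the demo file are hypothetical, and the size check only catches a short copy, not corruption.

```python
import shutil
import tempfile
from pathlib import Path


def stage_locally(src, scratch_root=None):
    """Copy src into a fresh temporary directory and check the byte count.

    Returns the path of the local copy. The caller removes its parent
    directory once the run has finished.
    """
    src = Path(src)
    workdir = Path(tempfile.mkdtemp(prefix="fcs_stage_", dir=scratch_root))
    dst = workdir / src.name
    shutil.copy2(src, dst)
    if dst.stat().st_size != src.stat().st_size:
        raise OSError(f"short copy: {dst}")
    return dst


# Demo with a tiny stand-in FASTA in place of a real assembly.
with tempfile.TemporaryDirectory() as shared:
    original = Path(shared, "assembly.fasta")
    original.write_text(">contig1\nACGT\n")
    staged = stage_locally(original)
    staged_text = staged.read_text()
    shutil.rmtree(staged.parent)  # clean up the local copy after the run

print(staged_text == ">contig1\nACGT\n")
```

The directory holding the staged copy would then be the one bound to /sample-volume/ in place of the shared filesystem path.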

etvedte commented 1 month ago

That's good to hear.

We are also working on a new patch release that may or may not help with this issue. I'll keep you posted when that's available.

LaurenHuet commented 3 weeks ago

Hello, I have version 4.1.0 of Singularity and have pulled the latest version of FCS-GX from the git page. I am using the SLURM job scheduler on Pawsey.

I have run this 6 times across 35 genome assemblies. 9 of them have completed successfully; the rest continually error out with gzip: stdin: No data available.

When I first posted about this error, I was running it across 24 (different) genome assemblies with only 2 receiving this error.

I have seen that I am getting more errors with Illumina data than with PacBio data.

`-----------------------------------------------------------------------------

tax-id    : 7898
fasta     : /sample-volume/OG193.ilmn.240313.v129mh.fasta
size      : 795.28 MiB
split-fa  : True
####### Starting process ['zcat', '-f', '/app/db/gxdb/gxdb/all.blast_div.tsv.gz']
####### Starting process ['grep', '-E', '^7898\t']
####### Cleaning up process ['zcat', '-f', '/app/db/gxdb/gxdb/all.blast_div.tsv.gz']
####### Cleaning up process ['grep', '-E', '^7898\t']
BLAST-div : bony fishes
gx-div    : anml:fishes
w/same-tax: True
bin-dir   : /app/bin
gx-db     : /app/db/gxdb/gxdb/all.gxi
gx-ver    : Nov 27 2023 11:05:36; git:v0.5.0+branch--HEAD
output    : /output-volume//OG193.ilmn.240313.v129mh.7898.taxonomy.rpt


####### args: Namespace(fasta='/sample-volume/OG193.ilmn.240313.v129mh.fasta', tax_id=7898, species=None, split_fasta=True, div='anml:fishes', gx_db='/app/db/gxdb/gxdb/all.gxi', mask_transposons=None, bin_dir='/app/bin', allow_same_species=True, ignore_same_kingdom=False, out_basename='/output-volume//OG193.ilmn.240313.v129mh.7898', out_dir='/output-volume/', action_report=True, save_hits=False, generate_logfile=False, debug=True, phone_home_label=None, gc_acc=None, gc_genomes_root_dir=None, production_build_name=None, gzip_c='gzip -c', out_taxonomy_rpt='/output-volume//OG193.ilmn.240313.v129mh.7898.taxonomy.rpt')

####### Starting process ['cat', '/sample-volume/OG193.ilmn.240313.v129mh.fasta']
####### Starting process ['gzip', '-cdf']
####### Starting process ['/app/bin/gx', 'split-fasta']
####### Starting process ['pv', '-Wbratpe', '--interval=0.5', '--size=833913207']
####### Starting process ['cat', '/sample-volume/OG193.ilmn.240313.v129mh.fasta']
####### Starting process ['gzip', '-cdf']
####### Starting process ['/busybox/time', '-v', 'nice', '-n19', '/app/bin/gx', 'align', '--gx-db=/app/db/gxdb/gxdb/all.gxi', '--repeats-basis-fa=/dev/fd/5']
####### Starting process ['/app/bin/gx', 'taxify', '--gx-db=/app/db/gxdb/gxdb/all.gxi', '--output=/output-volume//OG193.ilmn.240313.v129mh.7898.taxonomy.rpt.tmp', '--asserted-div=anml:fishes', '--db-exclude-locs=/app/bin/db_exclude.locs.tsv']
Collecting masking statistics...
Collected masking stats: 0.825914 Gbp; 9.98127s; 82.7463 Mbp/s. Baseline: 1.77974

gzip: stdin: No data available
####### Cleaning up process ['cat', '/sample-volume/OG193.ilmn.240313.v129mh.fasta']
Error: Process failed with retcode -13: ['cat', '/sample-volume/OG193.ilmn.240313.v129mh.fasta'])
####### Cleaning up process ['gzip', '-cdf']
Error: Process failed with retcode 1: ['gzip', '-cdf'])
####### Cleaning up process ['/app/bin/gx', 'split-fasta']
####### Cleaning up process ['pv', '-Wbratpe', '--interval=0.5', '--size=833913207']
####### Cleaning up process ['/busybox/time', '-v', 'nice', '-n19', '/app/bin/gx', 'align', '--gx-db=/app/db/gxdb/gxdb/all.gxi', '--repeats-basis-fa=/dev/fd/5']
####### Cleaning up process ['/app/bin/gx', 'taxify', '--gx-db=/app/db/gxdb/gxdb/all.gxi', '--output=/output-volume//OG193.ilmn.240313.v129mh.7898.taxonomy.rpt.tmp', '--asserted-div=anml:fishes', '--db-exclude-locs=/app/bin/db_exclude.locs.tsv']
####### Cleaning up process ['cat', '/sample-volume/OG193.ilmn.240313.v129mh.fasta']
####### Cleaning up process ['gzip', '-cdf']


Traceback (most recent call last):
  File "/tmp/Bazel.runfiles_qu08_ukq/runfiles/cgr_fcs/apps/fcs_genome/public/run_gx/run_gx.py", line 1037, in <module>
    main()
  File "/tmp/Bazel.runfiles_qu08_ukq/runfiles/cgr_fcs/apps/fcs_genome/public/run_gx/run_gx.py", line 1012, in main
    run_gx_pipeline(args)
  File "/tmp/Bazel.runfiles_qu08_ukq/runfiles/cgr_fcs/apps/fcs_genome/public/run_gx/run_gx.py", line 678, in run_gx_pipeline
    run(p_zcat_fasta, p_save_hits, p_main)
  File "/tmp/Bazel.runfiles_qu08_ukq/runfiles/cgr_fcs/apps/fcs_genome/public/run_gx/run_gx.py", line 278, in __exit__
    self.wait()
  File "/tmp/Bazel.runfiles_qu08_ukq/runfiles/cgr_fcs/apps/fcs_genome/public/run_gx/run_gx.py", line 268, in wait
    assert num_errors == 0, "Had errors."
AssertionError: Had errors.
Traceback (most recent call last):
  File "/software/projects/pawsey0812/singularity/fcs.py", line 445, in <module>
    sys.exit(main())
             ^^^^^^
  File "/software/projects/pawsey0812/singularity/fcs.py", line 434, in main
    gx.run()
  File "/software/projects/pawsey0812/singularity/fcs.py", line 345, in run
    self.args.func(self)
  File "/software/projects/pawsey0812/singularity/fcs.py", line 323, in run_screen_mode
    self.run_gx()
  File "/software/projects/pawsey0812/singularity/fcs.py", line 241, in run_gx
    self.safe_exec(docker_args)
  File "/software/projects/pawsey0812/singularity/fcs.py", line 166, in safe_exec
    subprocess.run(args, shell=False, check=True, text=True, stdout=sys.stdout, stderr=sys.stderr)
  File "/software/setonix/2024.05/software/linux-sles15-zen3/gcc-12.2.0/python-3.11.6-4ysxrvuaor6iljintmzcazlkfcokwnes/lib/python3.11/subprocess.py", line 571, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '`

etvedte commented 1 week ago

We have a new FCS v0.5.4 release that may resolve this gzip: stdin: No data available issue. Can you update the version you are using and re-test?