Open eeaunin opened 1 month ago
Following this as I'm having the same error.
Expanding slightly on @eeaunin's comment above: this is currently a major problem for us at the Sanger. We started seeing it after moving to a new farm cluster. It didn't seem to occur at all on the old cluster, but on the new one it happens perhaps half the time. Rerunning the same jobs often works the second time, so it appears to be an intermittent fault that strikes at random, and there doesn't seem to be anything unusual about the input FASTA files. It happens for small assemblies and large ones; sometimes FCS-GX runs for a while before the problem strikes.
We are currently attempting to reproduce this issue locally, and would appreciate any information that you think could be pertinent.
Is it reproducible if the genome fasta file is initially copied to `/tmp` on the host, and then used as input? @eeaunin do you have access to Docker, or just Singularity? If you do have both, can you test using Docker? @LaurenHuet @jt8-sanger - can you please provide more information about your run environment (OS, FCS image type, image version, job scheduler)?
Hello, thanks for replying. I am using FCS-GX with uncompressed (not gzipped) assembly FASTA files. The problem appears to be intermittent: a retry of a crashed run with the same input files and same settings can seemingly randomly succeed or fail again. I'll investigate this more with multiple replicates. I haven't tried copying the genome FASTA file to `/tmp`. I have so far only run FCS-GX with Singularity. I don't know if I have access to Docker on the LSF. I'll find out whether I have or not.
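For anyone wanting to try the `/tmp` test suggested above, here is a minimal sketch of staging the FASTA on node-local storage before the run. The helper names (`sha256sum`, `stage_fasta`) are made up for illustration, and it assumes the system temporary directory (usually `/tmp`) is on local disk rather than on the shared file system, which may not hold on every cluster.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256sum(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def stage_fasta(source, staging_root=None):
    """Copy `source` into a fresh temporary directory and verify the copy.

    staging_root=None uses the system default temp dir (assumed to be /tmp).
    """
    source = Path(source)
    staging_dir = Path(tempfile.mkdtemp(prefix="fcs_stage_", dir=staging_root))
    staged = staging_dir / source.name
    shutil.copyfile(source, staged)
    # A mismatch here would point at the file system, not at FCS-GX.
    if sha256sum(source) != sha256sum(staged):
        raise OSError(f"checksum mismatch after copying {source} to {staged}")
    return staged
```

The directory holding the returned path would then be the one bound to `/sample-volume/` in place of the original location.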
I am using this with unzipped FASTA files using Singularity. I have had the same error 3 times in a row with this FASTA. I have pulled the latest Singularity container from the NCBI git. I am using SLURM on an HPC (the Pawsey supercomputer). I ran this across a batch of 60 genomes; however, only one is receiving this error. It seems to run to about 99% and then fails with the output below. I have checked the FASTA file for any issues and it is okay; it is a small genome.
`Prefetching /app/db/gxdb/gxdb/all.gxi 96%...
Prefetching /app/db/gxdb/gxdb/all.gxi 97%...
Prefetching /app/db/gxdb/gxdb/all.gxi 98%...
Prefetching /app/db/gxdb/gxdb/all.gxi 99%...
Prefetched /app/db/gxdb/gxdb/all.gxi in 1254.08s; 0.256136 GB/s. The file is 93% in RAM.
Collecting masking statistics...
gzip: stdin: No data available
Collected masking stats: 9.1081e-05 Gbp; 0.359757s; 0.253167 Mbp/s. Baseline: 1.02634
gzip: stdin: No data available
Error: Process failed with retcode -13: ['cat', '/sample-volume/OG235.ilmn.230324.v129mh.fasta'])
Error: Process failed with retcode 1: ['gzip', '-cdf'])
Error: Process failed with retcode -13: ['cat', '/sample-volume/OG235.ilmn.230324.v129mh.fasta'])
Error: Process failed with retcode 1: ['gzip', '-cdf'])
-----------------------------------------------------------------------------
Traceback (most recent call last):
File "/tmp/Bazel.runfiles_lwww7goj/runfiles/cgr_fcs/apps/fcs_genome/public/run_gx/run_gx.py", line 1037, in <module>
main()
File "/tmp/Bazel.runfiles_lwww7goj/runfiles/cgr_fcs/apps/fcs_genome/public/run_gx/run_gx.py", line 1012, in main
run_gx_pipeline(args)
File "/tmp/Bazel.runfiles_lwww7goj/runfiles/cgr_fcs/apps/fcs_genome/public/run_gx/run_gx.py", line 678, in run_gx_pipeline
run(p_zcat_fasta, p_save_hits, p_main)
File "/tmp/Bazel.runfiles_lwww7goj/runfiles/cgr_fcs/apps/fcs_genome/public/run_gx/run_gx.py", line 278, in __exit__
self.wait()
File "/tmp/Bazel.runfiles_lwww7goj/runfiles/cgr_fcs/apps/fcs_genome/public/run_gx/run_gx.py", line 268, in wait
assert num_errors == 0, "Had errors."
AssertionError: Had errors.
Traceback (most recent call last):
File "/software/projects/pawsey0812/singularity/fcs.py", line 445, in <module>
sys.exit(main())
File "/software/projects/pawsey0812/singularity/fcs.py", line 434, in main
gx.run()
File "/software/projects/pawsey0812/singularity/fcs.py", line 345, in run
self.args.func(self)
File "/software/projects/pawsey0812/singularity/fcs.py", line 323, in run_screen_mode
self.run_gx()
File "/software/projects/pawsey0812/singularity/fcs.py", line 241, in run_gx
self.safe_exec(docker_args)
File "/software/projects/pawsey0812/singularity/fcs.py", line 166, in safe_exec
subprocess.run(args, shell=False, check=True, text=True, stdout=sys.stdout, stderr=sys.stderr)
File "/software/projects/pawsey0812/singularity/miniconda3/lib/python3.10/subprocess.py", line 526, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['singularity', 'exec', '--bind', '/scratch/pawsey0812/lhuet/NCBI:/app/db/gxdb/', '--bind', '/scratch/pawsey0812/lhuet/NOVA_230324_AD1/OG235/assemblies/genome:/sample-volume/', '--bind', '/scratch/pawsey0812/lhuet/NOVA_230324_AD1/OG235/assemblies/genome/NCBI:/output-volume/', '/software/projects/pawsey0812/singularity/fcs-gx.sif', 'python3', '/app/bin/run_gx', '--fasta', '/sample-volume/OG235.ilmn.230324.v129mh.fasta', '--out-dir', '/output-volume/', '--gx-db', '/app/db/gxdb/gxdb', '--tax-id', '7898']' returned non-zero exit status 1.`
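A note on reading the two return codes in the log above, in case it helps narrow things down. Python's `subprocess` reports a child killed by signal N as return code -N, so `retcode -13` for `cat` means it was killed by signal 13, which is SIGPIPE on Linux. That is what happens to a writer when the process reading its pipe has already exited, so the `cat` failure is probably a consequence of `gzip` exiting first rather than a separate problem. The text "No data available" is the usual Linux message for the ENODATA errno. A small sketch that decodes these (the helper name `describe_retcode` is made up for illustration):

```python
import errno
import os
import signal


def describe_retcode(retcode):
    """Explain a return code as reported by Python's subprocess module."""
    if retcode < 0:
        # subprocess reports "terminated by signal N" as -N.
        try:
            name = signal.Signals(-retcode).name
        except ValueError:
            name = "unknown signal"
        return f"killed by signal {-retcode} ({name})"
    if retcode == 0:
        return "exited normally"
    return f"exited with status {retcode}"


print(describe_retcode(-13))
print(describe_retcode(1))

# ENODATA is not defined on every platform, so look it up defensively.
enodata = getattr(errno, "ENODATA", None)
if enodata is not None:
    print(os.strerror(enodata))
```

On a Linux host this should name SIGPIPE for -13 and print the same message that `gzip` showed, which would suggest the read of the input stream itself returned ENODATA.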
@etvedte: Regarding your question about my run environment- I'm working with @eeaunin on this, so the answers are the same as the ones he gives above.
We noticed that the Apptainer (formerly Singularity) changelog mentions addressing "no data available" errors, which could be related to the issue you are observing.
https://github.com/apptainer/apptainer/blob/main/CHANGELOG.md#other-changes
Would it be possible for you to attempt to reproduce the issue using the latest version of Apptainer?
I have now tried multiple replicates. I used a Plasmodium chabaudi chabaudi assembly FASTA file as the input. The Singularity version that I was using was `singularity-ce 3.11.4`. Yesterday I did 10 runs with the same assembly FASTA file and same settings and all 10 completed successfully. Today I did another 10 runs with the same files and settings. 3 out of 10 crashed with the `gzip: stdin: No data available` error.
Here are some things in response to the questions from a few days ago:
do you have access to Docker, or just Singularity? If you do have both, can you test using Docker?
I don't have proper access to Docker on the compute farm that I am using. There is a limited installation of Docker that doesn't allow writing results to disk. For production purposes I have to use Singularity.
Is it reproducible if the genome fasta file is initially copied to `/tmp` on the host, and then used as input?
I have now tested running FCS-GX with and without copying the assembly FASTA file to `/tmp`, but for some reason the same error hasn't reappeared in the past 4 days. I ran FCS-GX with the same Plasmodium chabaudi chabaudi assembly file that I mentioned before: 80 runs with the assembly FASTA copied to `/tmp` before the run and 80 runs without copying the assembly FASTA to `/tmp`. There were no crashes in either set of runs. I still have no idea what determines whether the crashes with the `gzip: stdin: No data available` error happen or not.
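For reference, replicate testing of this kind can be scripted roughly as follows. This is only a sketch: `run_replicates` is a made-up helper, and the command in the demo is a stand-in that would need to be replaced with the real FCS-GX invocation.

```python
import subprocess
import sys

ERROR_MARKER = "gzip: stdin: No data available"


def run_replicates(command, replicates):
    """Run `command` repeatedly and tally how each run ended."""
    tally = {"ok": 0, "marker": 0, "other_failure": 0}
    for _ in range(replicates):
        result = subprocess.run(command, capture_output=True, text=True)
        output = result.stdout + result.stderr
        if ERROR_MARKER in output:
            tally["marker"] += 1  # the error discussed in this thread
        elif result.returncode != 0:
            tally["other_failure"] += 1  # failed for some other reason
        else:
            tally["ok"] += 1
    return tally


if __name__ == "__main__":
    # Stand-in command (assumption); substitute the actual FCS-GX call.
    demo = [sys.executable, "-c", "print('stand-in for an FCS-GX run')"]
    print(run_replicates(demo, replicates=3))
```

Keeping the tally per day or per compute node might help show whether the failures cluster in time or on particular hosts.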
Would it be possible for you to attempt to reproduce the issue using the latest version of Apptainer?
The installation of Apptainer has been requested from the IT service desk, but they haven't installed it yet.
That's good to hear.
We are also working on a new patch release that may or may not help with this issue. I'll keep you posted when that's available.
Hello, I have version 4.1.0 of Singularity and have pulled the latest version of FCS-GX from the git page. I am using the SLURM job scheduler on Pawsey.
I have run this 6 times across 35 genome assemblies. 9 of them completed successfully; the rest continually error out with `gzip: stdin: No data available`.
When I first posted about this error, I was running it across 24 (different) genome assemblies with only 2 receiving this error.
I have seen that I am getting more errors with Illumina data than with PacBio data.
`-----------------------------------------------------------------------------
tax-id : 7898
fasta : /sample-volume/OG193.ilmn.240313.v129mh.fasta
size : 795.28 MiB
split-fa : True
####### Starting process ['zcat', '-f', '/app/db/gxdb/gxdb/all.blast_div.tsv.gz']
####### Starting process ['grep', '-E', '^7898\t']
####### Cleaning up process ['zcat', '-f', '/app/db/gxdb/gxdb/all.blast_div.tsv.gz']
####### Cleaning up process ['grep', '-E', '^7898\t']
BLAST-div : bony fishes
gx-div : anml:fishes
w/same-tax: True
bin-dir : /app/bin
gx-db : /app/db/gxdb/gxdb/all.gxi
gx-ver : Nov 27 2023 11:05:36; git:v0.5.0+branch--HEAD
output : /output-volume//OG193.ilmn.240313.v129mh.7898.taxonomy.rpt
####### args: Namespace(fasta='/sample-volume/OG193.ilmn.240313.v129mh.fasta', tax_id=7898, species=None, split_fasta=True, div='anml:fishes', gx_db='/app/db/gxdb/gxdb/all.gxi', mask_transposons=None, bin_dir='/app/bin', allow_same_species=True, ignore_same_kingdom=False, out_basename='/output-volume//OG193.ilmn.240313.v129mh.7898', out_dir='/output-volume/', action_report=True, save_hits=False, generate_logfile=False, debug=True, phone_home_label=None, gc_acc=None, gc_genomes_root_dir=None, production_build_name=None, gzip_c='gzip -c', out_taxonomy_rpt='/output-volume//OG193.ilmn.240313.v129mh.7898.taxonomy.rpt')
####### Starting process ['cat', '/sample-volume/OG193.ilmn.240313.v129mh.fasta']
####### Starting process ['gzip', '-cdf']
####### Starting process ['/app/bin/gx', 'split-fasta']
####### Starting process ['pv', '-Wbratpe', '--interval=0.5', '--size=833913207']
####### Starting process ['cat', '/sample-volume/OG193.ilmn.240313.v129mh.fasta']
####### Starting process ['gzip', '-cdf']
####### Starting process ['/busybox/time', '-v', 'nice', '-n19', '/app/bin/gx', 'align', '--gx-db=/app/db/gxdb/gxdb/all.gxi', '--repeats-basis-fa=/dev/fd/5']
####### Starting process ['/app/bin/gx', 'taxify', '--gx-db=/app/db/gxdb/gxdb/all.gxi', '--output=/output-volume//OG193.ilmn.240313.v129mh.7898.taxonomy.rpt.tmp', '--asserted-div=anml:fishes', '--db-exclude-locs=/app/bin/db_exclude.locs.tsv']
Collecting masking statistics...
Collected masking stats: 0.825914 Gbp; 9.98127s; 82.7463 Mbp/s. Baseline: 1.77974
gzip: stdin: No data available
####### Cleaning up process ['cat', '/sample-volume/OG193.ilmn.240313.v129mh.fasta']
Error: Process failed with retcode -13: ['cat', '/sample-volume/OG193.ilmn.240313.v129mh.fasta'])
####### Cleaning up process ['gzip', '-cdf']
Error: Process failed with retcode 1: ['gzip', '-cdf'])
####### Cleaning up process ['/app/bin/gx', 'split-fasta']
####### Cleaning up process ['pv', '-Wbratpe', '--interval=0.5', '--size=833913207']
####### Cleaning up process ['/busybox/time', '-v', 'nice', '-n19', '/app/bin/gx', 'align', '--gx-db=/app/db/gxdb/gxdb/all.gxi', '--repeats-basis-fa=/dev/fd/5']
####### Cleaning up process ['/app/bin/gx', 'taxify', '--gx-db=/app/db/gxdb/gxdb/all.gxi', '--output=/output-volume//OG193.ilmn.240313.v129mh.7898.taxonomy.rpt.tmp', '--asserted-div=anml:fishes', '--db-exclude-locs=/app/bin/db_exclude.locs.tsv']
####### Cleaning up process ['cat', '/sample-volume/OG193.ilmn.240313.v129mh.fasta']
####### Cleaning up process ['gzip', '-cdf']
Traceback (most recent call last):
File "/tmp/Bazel.runfiles_qu08_ukq/runfiles/cgr_fcs/apps/fcs_genome/public/run_gx/run_gx.py", line 1037, in
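The debug output above shows that the input reaches FCS-GX through `cat <fasta>` piped into `gzip -cdf`, and that the pipe is opened twice during a run. One way to check whether plain reads of the file fail on their own, independently of FCS-GX, is to read it end to end several times from the same job environment (ideally inside the container, through the same bind mount). This is a rough sketch with made-up helper names (`read_through`, `probe_reads`), not something taken from the FCS-GX code:

```python
import errno
import time


def read_through(path, chunk_size=1 << 20):
    """Read the whole file and return the number of bytes read."""
    total = 0
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                return total
            total += len(chunk)


def probe_reads(path, attempts=5, pause_seconds=1.0):
    """Read `path` several times and record the outcome of each attempt."""
    outcomes = []
    for _ in range(attempts):
        try:
            outcomes.append(("ok", read_through(path)))
        except OSError as exc:
            # Translate the errno number to its symbolic name, e.g. ENODATA.
            label = errno.errorcode.get(exc.errno, "unknown errno")
            outcomes.append(("error", label))
        time.sleep(pause_seconds)
    return outcomes
```

If any attempt reports `ENODATA`, or a byte count that differs from the file size on disk, that would point toward the file system or bind mount rather than the FASTA content.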
We have a new FCS v0.5.4 release that may resolve this `gzip: stdin: No data available` issue. Can you update the version you are using and re-test?
Hello. Below is a log from an FCS-GX run that crashed with the message `gzip: stdin: No data available`. What has happened here, and how can this problem be prevented?
These are the software versions used for this run:
OS: Ubuntu 22.04.4 LTS
Singularity: v3.11.4
FCS image: 0.5.0
Python: 3.8.12
Platform: LSF
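Since several people in this thread were asked for their run environment, here is a small sketch that gathers the parts that can be queried automatically. The function names are made up for illustration; the FCS image version and the job scheduler are not detected and would still need to be added by hand.

```python
import platform
import shutil
import subprocess


def tool_version(executable):
    """Return the output of `<executable> --version`, or None if not on PATH."""
    if shutil.which(executable) is None:
        return None
    result = subprocess.run(
        [executable, "--version"], capture_output=True, text=True
    )
    return (result.stdout or result.stderr).strip()


def environment_report():
    """Collect OS, Python, and container runtime versions into a dict."""
    return {
        "os": platform.platform(),
        "python": platform.python_version(),
        "singularity": tool_version("singularity"),
        "apptainer": tool_version("apptainer"),
    }


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```

Running it on the compute node that executes the job, rather than on the login node, would give the most relevant values.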