ncbi / fcs

Foreign Contamination Screening caller scripts and documentation

AssertionError: Integrity check failed #79

Closed eeaunin closed 2 months ago

eeaunin commented 4 months ago

Hello. Half of my FCS-GX runs still end in crashes, even after assigning FCS-GX more than 470 GB of memory in every run (as suggested in issue #69). The crashes seem to happen randomly: a resubmitted run with exactly the same input files and settings may or may not crash again on the second try. The error messages are different from what I reported in issue #69. One of them is AssertionError: Integrity check failed. A log of a run with this error is below:

=============================================================================== 
Source:      /mft-volume 
Destination: /app/db/gxdb 
Resuming failed transfer in /app/db/gxdb... 
Space check: Available:1.14TiB; Existing:0B; Incoming:464.34GiB; Delta:464.34GiB

Requires transfer: 59B all.meta.jsonl 
Copying /mft-volume/all.meta.jsonl to /app/db/gxdb/all.meta.jsonl.part... 

Requires transfer: 187B all.README.txt 
Copying /mft-volume/all.README.txt to /app/db/gxdb/all.README.txt.part... 

Requires transfer: 6.09MiB all.taxa.tsv 
Copying /mft-volume/all.taxa.tsv to /app/db/gxdb/all.taxa.tsv.part... 

Requires transfer: 7.86MiB all.blast_div.tsv.gz 
Copying /mft-volume/all.blast_div.tsv.gz to /app/db/gxdb/all.blast_div.tsv.gz.part... 

Requires transfer: 8.48MiB all.assemblies.tsv 
Copying /mft-volume/all.assemblies.tsv to /app/db/gxdb/all.assemblies.tsv.part... 

Requires transfer: 21.51MiB all.seq_info.tsv.gz 
Copying /mft-volume/all.seq_info.tsv.gz to /app/db/gxdb/all.seq_info.tsv.gz.part... 

Requires transfer: 165.14GiB all.gxs 
Copying /mft-volume/all.gxs to /app/db/gxdb/all.gxs.part... 
/app/db/gxdb/all.gxs.part - file-size changed. 
Traceback (most recent call last):
  File "/tmp/Bazel.runfiles_26ygs8hq/runfiles/cgr_fcs/apps/fcs_genome/public/sync_files/sync_files.py", line 724, in <module>
    main()
  File "/tmp/Bazel.runfiles_26ygs8hq/runfiles/cgr_fcs/apps/fcs_genome/public/sync_files/sync_files.py", line 700, in main
    transfer_file(mi, src_mft_dir, work_dir)
  File "/tmp/Bazel.runfiles_26ygs8hq/runfiles/cgr_fcs/apps/fcs_genome/public/sync_files/sync_files.py", line 572, in transfer_file
    assert file_integrity_ok(mi, tmp_file_path, verify_hashes=(not hash_ok), verbose=True), "Integrity check failed."
AssertionError: Integrity check failed.
-----------------------------------------------------------------------------

Traceback (most recent call last):
  File "/tmp/Bazel.runfiles_nwibowqx/runfiles/cgr_fcs/apps/fcs_genome/public/run_gx/run_gx.py", line 1037, in <module>
    main()
  File "/tmp/Bazel.runfiles_nwibowqx/runfiles/cgr_fcs/apps/fcs_genome/public/run_gx/run_gx.py", line 1004, in main
    assert len(paths) == 1, f"Cannot resolve path to *.gxi file from {args.gx_db}: {paths}"
AssertionError: Cannot resolve path to *.gxi file from /app/db/gxdb/gx_mapper_1530334: []

What has happened here, and how can I prevent this crash?

These are the software versions used for this run:

OS: Ubuntu 22.04.4 LTS
Singularity: v3.10.0
FCS image: 0.5.0
Python: 3.8.12
Platform: LSF

etvedte commented 4 months ago

AssertionError: Integrity check failed. This error indicates that the database files copied to the destination directory are corrupted. What did you do for batch processing in #78?

Please verify that the database files you downloaded from the source are correct (see the db check command). If the problem persists, you may want to copy the files to the destination directory with your preferred method instead of using fcs.py db get, and then verify the integrity of the transfer with fcs.py db check.

https://github.com/ncbi/fcs/wiki/FCS-GX-input

AssertionError: Cannot resolve path to *.gxi file from /app/db/gxdb/gx_mapper_1530334: [] This error indicates that the screen genome command was invoked with a --gx-db= path that contains an incomplete or corrupted gx-database.
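
As a rough illustration of catching this condition before launching a screen, the sketch below checks that a database directory holds exactly one *.gxi file. It assumes that this is what the assertion in the traceback expects; how run_gx.py actually resolves the path is not shown in this thread, and the function name find_gxi is hypothetical.

```python
from pathlib import Path


def find_gxi(gx_db_dir):
    """Return the single *.gxi file in gx_db_dir.

    Hypothetical pre-flight helper, not part of fcs.py. It assumes a
    complete database directory contains exactly one *.gxi file.
    """
    paths = sorted(Path(gx_db_dir).glob("*.gxi"))
    if len(paths) != 1:
        names = [p.name for p in paths]
        raise FileNotFoundError(
            f"Expected exactly one *.gxi file in {gx_db_dir}, found {len(paths)}: {names}"
        )
    return paths[0]
```

Calling find_gxi("/app/db/gxdb") before the screen step would fail early with a clearer message if the copy left the directory incomplete.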

eeaunin commented 4 months ago

Hello. I haven't implemented batch processing so far, although I still plan to try it. In the runs that I'm talking about here, I'm using FCS-GX with only one assembly file per run, and the database gets copied to a subdirectory on /tmp for the run using fcs.py db get.

I assume the database downloaded from https://ftp.ncbi.nlm.nih.gov/genomes/TOOLS/FCS/database/latest is fine. The AssertionError: Integrity check failed. error message appears intermittently, but if it were caused by a faulty database download from the FTP site, it would probably appear every time I run FCS-GX.

The output for the python3 fcs.py db check --mft "$SOURCE_DB_MANIFEST" --dir "$LOCAL_DB" command is:

=============================================================================== 
/app/db/gxdb is up-to-date with https://ncbi-fcs-gx.s3.amazonaws.com/gxdb/latest. 

So I think fcs.py occasionally fails to fully copy the database over to /tmp when the fcs.py db get command is run. I guess I could indeed try to work around it by using some other method of copying files instead of relying on fcs.py db get. I'm not sure what method it should be, though.

etvedte commented 2 months ago

Did you manage to resolve this issue? The various db retrieval methods are described here:

https://github.com/ncbi/fcs/wiki/FCS-GX-input#fcs-gx-database-location

eeaunin commented 2 months ago

I think so. About a month ago I replaced the code for copying the database files with my own Python script that copies the files, checks the copied files with md5sum, and retries the copy if it failed. The script then also runs the fcs.py db check command. I haven't seen crashes with the AssertionError: Integrity check failed. error or any other error in the database copying stage since then.
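
The script itself was not posted in this thread. A minimal sketch of the copy, verify, and retry approach described above might look like the following. All function names, the retry count, and the delay are assumptions for illustration. The .part suffix mirrors the naming seen in the fcs.py db get log, and the source checksum is computed on the fly (an assumption; a manifest of known checksums could be used instead).

```python
import hashlib
import os
import shutil
import time
from pathlib import Path


def md5sum(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_verified(src, dst, retries=3, delay_s=5.0):
    """Copy src to dst via a .part file; retry if the checksum differs."""
    src, dst = Path(src), Path(dst)
    expected = md5sum(src)
    part = dst.with_name(dst.name + ".part")
    for attempt in range(1, retries + 1):
        try:
            shutil.copyfile(src, part)
            if md5sum(part) == expected:
                os.replace(part, dst)  # rename only after verification
                return attempt
        except OSError as err:
            print(f"Attempt {attempt} failed for {src.name}: {err}")
        part.unlink(missing_ok=True)  # discard the bad partial copy
        if attempt < retries:
            time.sleep(delay_s)
    raise RuntimeError(f"Could not copy {src} to {dst} after {retries} attempts")


def copy_database(src_dir, dst_dir, retries=3, delay_s=5.0):
    """Copy every regular file in src_dir to dst_dir with verification."""
    dst_dir = Path(dst_dir)
    dst_dir.mkdir(parents=True, exist_ok=True)
    for src in sorted(Path(src_dir).iterdir()):
        if src.is_file():
            copy_verified(src, dst_dir / src.name, retries, delay_s)
```

After copy_database finishes, running the fcs.py db check command quoted earlier in the thread gives a second, independent check of the destination directory.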

etvedte commented 2 months ago

Thanks. Hope things continue to run smoothly. I know you have another open issue, so let me know if you get the latest release to run.