bbuchfink / diamond

Accelerated BLAST compatible local sequence aligner.
GNU General Public License v3.0

Bus error and input/output error #566

Open kevfly16 opened 2 years ago

kevfly16 commented 2 years ago

Hi,

I'm having an issue running diamond on a relatively large input file. It usually reaches the "Closing the output file" step and hangs for a couple of hours before exiting with a bus error. When I decrease --block-size to below 16 (which splits my reference database into 2 blocks), it reaches the "Computing alignments" step and exits with an input/output error.

I've tried --block-size 20, --block-size 16, --block-size 8, --bin 16 and --bin 64, all with the same results. The dmesg output shows no kernel messages that would indicate the process was killed. Smaller files (~1,000 sequences) work and finish in 10-15 min, with most of the time spent loading reference sequences and building reference histograms.

Appreciate any help. See more info below.

EC2 Instance Specs:

I/O:

diamond blastx \
  -q /mnt/s3fs/.../input.1.fasta \
  -d /mnt/s3fs/.../blast/UNIREF100 \
  -o /mnt/s3fs/.../out \
  --evalue 10 \
  --threads 16 \
  --block-size 200 \
  --index-chunks 1 \
  --salltitles \
  --more-sensitive \
  --min-orf 10 \
  --masking 0 \
  --top 5 \
  --bin 64 \
  --log \
  -f 100

diamond v2.0.13.151 (C) Max Planck Society for the Advancement of Science
Documentation, support and updates available at http://www.diamondsearch.org
Please cite: http://dx.doi.org/10.1038/s41592-021-01101-x Nature Methods (2021)

diamond v2.0.13.151

CPU threads: 16

Scoring parameters: (Matrix=BLOSUM62 Lambda=0.267 K=0.041 Penalties=11/1)
CPU features detected: ssse3 popcnt sse4.1 avx2
Runtime dispatch: disabled
L3 cache size: 67108864
MAX_SHAPE_LEN=19 SEQ_MASK STRICT_BAND
Temporary directory: /mnt/s3fs/...
Percentage range of top alignment score to report hits: 5
DP fields: 1
Opening the database... [5.787s]
Database: /mnt/s3fs/.../blast/UNIREF100 (type: BLAST database, sequences: 34281384, letters: 15925806162)
Block size = 200000000000
Opening the input file... [0.172s]
Opening the output file... [0.09s]
Loading query sequences... [0.081s]
Sequences = 348120, letters = 5569920, average length = 16
Current RSS: 35.4 MB, Peak RSS: 35.6 MB
Building query seed set...
Shape=0 Hash_table_size=4194304 load=0.423743
Shape=1 Hash_table_size=1048576 load=0.51395
Shape=2 Hash_table_size=2097152 load=0.729353
Shape=3 Hash_table_size=2097152 load=0.616794
Shape=4 Hash_table_size=1048576 load=0.755696
Shape=5 Hash_table_size=2097152 load=0.500302
Shape=6 Hash_table_size=1048576 load=0.513579
Shape=7 Hash_table_size=1048576 load=0.757405
Shape=8 Hash_table_size=2097152 load=0.616603
Shape=9 Hash_table_size=524288 load=0.520599
Shape=10 Hash_table_size=2097152 load=0.499904
Shape=11 Hash_table_size=524288 load=0.521265
Shape=12 Hash_table_size=524288 load=0.521044
Shape=13 Hash_table_size=1048576 load=0.513451
Shape=14 Hash_table_size=1048576 load=0.756829
Shape=15 Hash_table_size=524288 load=0.520765
[1.895s]
Algorithm: Query-indexed
Shape configuration: 1011110111,110100100010111,11001011111,101110001111,11011101100001,1111010010101,111001001001011,10101001101011,111101010011,1111000010000111,1100011011011,1101010000011011,1110001010101001,110011000110011,11011010001101,1101001100010011
Building query histograms... [0.022s]
Allocating buffers... [0.001s]
Current RSS: 57.3 MB, Peak RSS: 163.7 MB
Loading reference sequences... [169.499s]
Sequences = 34281384, letters = 15925806162, average length = 464
Current RSS: 16.1 GB, Peak RSS: 17.4 GB
Initializing dictionary... [0.324s]
Initializing temporary storage...
Async_buffer() 58020,907
[17.563s]
Building reference histograms... [281.692s]
Allocating buffers... [0s]
Current RSS: 16.3 GB, Peak RSS: 17.4 GB

Processing query block 1, reference block 1/1, shape 1/16.
Building reference seed array... [33.443s]
Building query seed array... [0.006s]
Indexed query seeds = 1964731/5569920 (35.27%), reference seeds = 3529502409/15925806162 (22.16%)
Soft masked letters = 0/5569920 (0.00%), 0/15925806162 (0.00%)
Computing hash join... [7.458s]
Searching alignments... [47.437s]
Current RSS: 46.9 GB, Peak RSS: 47.0 GB

Processing query block 1, reference block 1/1, shape 2/16.
Building reference seed array... [19.711s]
Building query seed array... [0.003s]
Indexed query seeds = 557057/5569920 (10.00%), reference seeds = 1633619685/15925806162 (10.26%)
Soft masked letters = 0/5569920 (0.00%), 0/15925806162 (0.00%)
Computing hash join... [2.624s]
Searching alignments... [16.461s]
Current RSS: 46.9 GB, Peak RSS: 47.0 GB

Processing query block 1, reference block 1/1, shape 3/16.
Building reference seed array... [32.214s]
Building query seed array... [0.004s]
Indexed query seeds = 1676195/5569920 (30.09%), reference seeds = 4739401098/15925806162 (29.76%)
Soft masked letters = 0/5569920 (0.00%), 0/15925806162 (0.00%)
Computing hash join... [8.304s]
Searching alignments... [41.581s]
Current RSS: 57.2 GB, Peak RSS: 57.3 GB

Processing query block 1, reference block 1/1, shape 4/16.
Building reference seed array... [27.774s]
Building query seed array... [0.004s]
Indexed query seeds = 1395710/5569920 (25.06%), reference seeds = 3264580563/15925806162 (20.50%)
Soft masked letters = 0/5569920 (0.00%), 0/15925806162 (0.00%)
Computing hash join... [6.18s]
Searching alignments... [35.744s]
Current RSS: 57.2 GB, Peak RSS: 57.3 GB

Processing query block 1, reference block 1/1, shape 5/16.
Building reference seed array... [27.555s]
Building query seed array... [0.003s]
Indexed query seeds = 836312/5569920 (15.01%), reference seeds = 4223840231/15925806162 (26.52%)
Soft masked letters = 0/5569920 (0.00%), 0/15925806162 (0.00%)
Computing hash join... [6.269s]
Searching alignments... [23.921s]
Current RSS: 57.3 GB, Peak RSS: 57.3 GB

Processing query block 1, reference block 1/1, shape 6/16.
Building reference seed array... [24.479s]
Building query seed array... [0.004s]
Indexed query seeds = 1115848/5569920 (20.03%), reference seeds = 2470291739/15925806162 (15.51%)
Soft masked letters = 0/5569920 (0.00%), 0/15925806162 (0.00%)
Computing hash join... [4.519s]
Searching alignments... [29.645s]
Current RSS: 57.3 GB, Peak RSS: 57.3 GB

Processing query block 1, reference block 1/1, shape 7/16.
Building reference seed array... [19.662s]
Building query seed array... [0.003s]
Indexed query seeds = 557057/5569920 (10.00%), reference seeds = 1623147117/15925806162 (10.19%)
Soft masked letters = 0/5569920 (0.00%), 0/15925806162 (0.00%)
Computing hash join... [2.61s]
Searching alignments... [16.262s]
Current RSS: 57.3 GB, Peak RSS: 57.3 GB

Processing query block 1, reference block 1/1, shape 8/16.
Building reference seed array... [27.523s]
Building query seed array... [0.003s]
Indexed query seeds = 836312/5569920 (15.01%), reference seeds = 4211237997/15925806162 (26.44%)
Soft masked letters = 0/5569920 (0.00%), 0/15925806162 (0.00%)
Computing hash join... [6.258s]
Searching alignments... [23.325s]
Current RSS: 57.3 GB, Peak RSS: 57.4 GB

Processing query block 1, reference block 1/1, shape 9/16.
Building reference seed array... [26.971s]
Building query seed array... [0.004s]
Indexed query seeds = 1395710/5569920 (25.06%), reference seeds = 3269626447/15925806162 (20.53%)
Soft masked letters = 0/5569920 (0.00%), 0/15925806162 (0.00%)
Computing hash join... [6.178s]
Searching alignments... [35.752s]
Current RSS: 57.4 GB, Peak RSS: 57.4 GB

Processing query block 1, reference block 1/1, shape 10/16.
Building reference seed array... [16.128s]
Building query seed array... [0.002s]
Indexed query seeds = 278282/5569920 (5.00%), reference seeds = 1155471529/15925806162 (7.26%)
Soft masked letters = 0/5569920 (0.00%), 0/15925806162 (0.00%)
Computing hash join... [1.711s]
Searching alignments... [8.786s]
Current RSS: 57.4 GB, Peak RSS: 57.4 GB

Processing query block 1, reference block 1/1, shape 11/16.
Building reference seed array... [25.313s]
Building query seed array... [0.004s]
Indexed query seeds = 1115848/5569920 (20.03%), reference seeds = 2488695988/15925806162 (15.63%)
Soft masked letters = 0/5569920 (0.00%), 0/15925806162 (0.00%)
Computing hash join... [4.563s]
Searching alignments... [29.898s]
Current RSS: 57.4 GB, Peak RSS: 57.4 GB

Processing query block 1, reference block 1/1, shape 12/16.
Building reference seed array... [16.055s]
Building query seed array... [0.003s]
Indexed query seeds = 278282/5569920 (5.00%), reference seeds = 1141469238/15925806162 (7.17%)
Soft masked letters = 0/5569920 (0.00%), 0/15925806162 (0.00%)
Computing hash join... [1.699s]
Searching alignments... [8.707s]
Current RSS: 57.4 GB, Peak RSS: 57.4 GB

Processing query block 1, reference block 1/1, shape 13/16.
Building reference seed array... [16.153s]
Building query seed array... [0.002s]
Indexed query seeds = 278282/5569920 (5.00%), reference seeds = 1143891953/15925806162 (7.18%)
Soft masked letters = 0/5569920 (0.00%), 0/15925806162 (0.00%)
Computing hash join... [1.7s]
Searching alignments... [8.834s]
Current RSS: 57.4 GB, Peak RSS: 57.4 GB

Processing query block 1, reference block 1/1, shape 14/16.
Building reference seed array... [19.577s]
Building query seed array... [0.003s]
Indexed query seeds = 557057/5569920 (10.00%), reference seeds = 1636228494/15925806162 (10.27%)
Soft masked letters = 0/5569920 (0.00%), 0/15925806162 (0.00%)
Computing hash join... [2.635s]
Searching alignments... [16.488s]
Current RSS: 57.4 GB, Peak RSS: 57.5 GB

Processing query block 1, reference block 1/1, shape 15/16.
Building reference seed array... [27.729s]
Building query seed array... [0.003s]
Indexed query seeds = 836312/5569920 (15.01%), reference seeds = 4228100081/15925806162 (26.55%)
Soft masked letters = 0/5569920 (0.00%), 0/15925806162 (0.00%)
Computing hash join... [6.262s]
Searching alignments... [23.259s]
Current RSS: 57.4 GB, Peak RSS: 57.5 GB

Processing query block 1, reference block 1/1, shape 16/16.
Building reference seed array... [16.199s]
Building query seed array... [0.003s]
Indexed query seeds = 278282/5569920 (5.00%), reference seeds = 1168417845/15925806162 (7.34%)
Soft masked letters = 0/5569920 (0.00%), 0/15925806162 (0.00%)
Computing hash join... [1.742s]
Searching alignments... [8.949s]
Deallocating buffers... [4.419s]
Clearing query masking... [0.004s]

Computing alignments...
Async_buffer.load() 77126567(1.07745 GB, 0.51207 GB on disk)
Loading trace points... [14.515s]
Sorting trace points... [0.432s]
Computing alignments...
Queries=793 size=7.66797 max_size=9.85938 next=seq_794 ETA=72.1652s
Queries=2720 size=5.48047 max_size=16.3711 next=seq_2721 ETA=40.6618s
Queries=5157 size=5.79688 max_size=16.3711 next=seq_5158 ETA=30.7522s
Queries=10974 size=2.63281 max_size=16.3711 next=seq_10975 ETA=17.1482s
Queries=13880 size=5.30859 max_size=16.3711 next=seq_13881 ETA=15.9006s
Queries=17120 size=15.4727 max_size=16.3711 next=seq_17121 ETA=14.3341s
Queries=20376 size=10.9062 max_size=22.2227 next=seq_20377 ETA=12.9323s
Queries=24706 size=1.91016 max_size=22.2227 next=seq_24707 ETA=10.7873s
Queries=28903 size=0.476562 max_size=28.8828 next=seq_28904 ETA=9.06664s
Queries=32437 size=0.894531 max_size=28.8828 next=seq_32438 ETA=7.88698s
Queries=38546 size=0.367188 max_size=28.8828 next=seq_38547 ETA=5.55736s
Queries=44913 size=0.574219 max_size=28.8828 next=seq_44914 ETA=3.50197s
Queries=48746 size=3.02734 max_size=28.8828 next=seq_48747 ETA=2.47327s
Queries=52772 size=5.58984 max_size=28.8828 next=seq_52773 ETA=1.39225s
[15.092s]
Deallocating buffers... [0.044s]
Loading trace points... [0s]
[31.656s]
Deallocating reference... [0.625s]
Loading reference sequences... [0s]
Current RSS: 2.0 GB, Peak RSS: 57.5 GB
Deallocating buffers... [0.001s]
Current RSS: 2.0 GB, Peak RSS: 57.5 GB
Deallocating queries... [0s]
Loading query sequences... [0.041s]
Closing the input file... [0s]
Closing the output file...
Loading dictionary... [0.166s]

Bus error
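Alongside the dmesg check, the "Current RSS / Peak RSS" lines printed with --log give a quick way to see how much memory the run used at its peak. The following is only a sketch: the function name `peak_rss_gb` is made up for illustration, and the regular expression assumes the line format seen in this log, where only MB and GB units occur. The instance's RAM is not stated in the thread, so the comparison against available memory is left to the reader.

```python
import re

# Matches lines such as "Current RSS: 57.4 GB, Peak RSS: 57.5 GB".
# Assumption: only MB and GB units appear, as in the log above.
_RSS = re.compile(r"Current RSS: ([\d.]+) (MB|GB), Peak RSS: ([\d.]+) (MB|GB)")
_TO_GB = {"MB": 1 / 1024, "GB": 1.0}


def peak_rss_gb(log_text):
    """Return the largest Peak RSS value found in the log, in GB, or None."""
    peaks = [
        float(m.group(3)) * _TO_GB[m.group(4)]
        for m in _RSS.finditer(log_text)
    ]
    return max(peaks) if peaks else None


# Two lines copied from the log in this thread.
sample = (
    "Current RSS: 57.3 MB, Peak RSS: 163.7 MB\n"
    "Current RSS: 2.0 GB, Peak RSS: 57.5 GB\n"
)
print(peak_rss_gb(sample))
```

If the peak is well below the machine's memory, as the absence of kernel messages in dmesg also suggests, memory exhaustion becomes a less likely explanation for the bus error.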

bbuchfink commented 2 years ago

Have you tried using a dmnd database instead of a BLAST database? BLAST databases rely on mmap, which often does not work well with certain storage types.
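One way to check whether mmap behaves on a given mount is to read a file through a memory mapping and compare the result with an ordinary read. The sketch below is a generic diagnostic, not something DIAMOND or the NCBI library provides; `probe_mmap` and `sha256_via_read` are hypothetical helper names. The mapped read runs in a child process because a SIGBUS would otherwise kill the probing script itself. It assumes a Unix-like system.

```python
import hashlib
import signal
import subprocess
import sys

CHUNK = 1 << 20  # process 1 MiB at a time

# Child script: hash the file contents through a read-only memory mapping.
_CHILD = """
import hashlib, mmap, sys
step = int(sys.argv[2])
with open(sys.argv[1], "rb") as f, \\
        mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
    h = hashlib.sha256()
    for off in range(0, len(m), step):
        h.update(m[off:off + step])
print(h.hexdigest())
"""


def sha256_via_read(path):
    """Hash the file using plain read() calls."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(CHUNK), b""):
            h.update(chunk)
    return h.hexdigest()


def probe_mmap(path):
    """Return 'ok', 'mismatch', 'sigbus' or 'error' for a mapped read of path."""
    proc = subprocess.run(
        [sys.executable, "-c", _CHILD, path, str(CHUNK)],
        capture_output=True,
        text=True,
    )
    if proc.returncode == -signal.SIGBUS:
        return "sigbus"  # child was killed by a bus error
    if proc.returncode != 0:
        return "error"  # e.g. the file is empty or could not be opened
    if proc.stdout.strip() == sha256_via_read(path):
        return "ok"
    return "mismatch"
```

Calling `probe_mmap` on one of the BLAST database volume files under the mount would show whether a mapped read of that file completes at all. A result of "ok" on a small file does not rule out failures on multi-gigabyte files.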

kevfly16 commented 2 years ago

I just tried using a dmnd database and it seems to work now. I did some research on FUSE and mmap, and it does seem that there may be some issues. I saw here that using a private flag for mmap can help make FUSE compatible. Is that possible to implement within this project?

Separately, when I use the diamond view command on the output from the run, I'm getting an Input/output error. Does this similarly have something to do with using FUSE and mmap?

diamond view \
  -a /mnt/s3fs/.../out.daa \
  --top 5 \
  --out /mnt/s3fs/.../out.ur100 \
  --outfmt 6 qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore ppos qframe score salltitles

diamond v2.0.13.151 (C) Max Planck Society for the Advancement of Science
Documentation, support and updates available at http://www.diamondsearch.org
Please cite: http://dx.doi.org/10.1038/s41592-021-01101-x Nature Methods (2021)

CPU threads: 16

Loading subject IDs... [16.661s]
Scoring parameters: (Matrix=blosum62 Lambda=0.267 K=0.041 Penalties=11/1)
DB sequences = 34281384
DB sequences used = 4257343
DB letters = 15925806162
Percentage range of top alignment score to report hits: 5
Generating output...
Input/output error
Error writing file /mnt/s3fs/.../out.ur100
terminate called after throwing an instance of 'File_write_exception'
what(): Error writing file /mnt/s3fs/.../out.ur100
Aborted

bbuchfink commented 2 years ago

> I just tried using a dmnd database and it seems to work now. I did some research on FUSE and mmap, and it does seem that there may be some issues. I saw here that using a private flag for mmap can help make FUSE compatible. Is that possible to implement within this project?

I would assume the mapping is already private. I'm using the NCBI library for accessing BLAST databases so it's not that easily modified.
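For readers unfamiliar with the distinction, a private mapping (MAP_PRIVATE) is copy-on-write: changes made through the mapping stay in the process's memory and are not written back to the file. The sketch below only illustrates that property using Python's `mmap` module on a Unix-like system. It says nothing about which flags the NCBI library actually passes, and `private_write_stays_local` is a name invented for this example.

```python
import mmap
import os
import tempfile


def private_write_stays_local(path):
    """Modify a MAP_PRIVATE mapping and report whether the file is unchanged."""
    with open(path, "rb") as f:
        original = f.read()
        # A private, writable mapping only needs read access to the file.
        m = mmap.mmap(
            f.fileno(),
            0,
            flags=mmap.MAP_PRIVATE,
            prot=mmap.PROT_READ | mmap.PROT_WRITE,
        )
        try:
            # Overwrite the first byte with a value that differs from it.
            m[0:1] = b"X" if original[:1] != b"X" else b"Y"
            changed_in_memory = m[:] != original
        finally:
            m.close()
    with open(path, "rb") as f:
        unchanged_on_disk = f.read() == original
    return changed_in_memory and unchanged_on_disk


# Demonstration on a throwaway local file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"ACGT" * 1024)
result = private_write_stays_local(tmp.name)
os.remove(tmp.name)
print(result)
```

A shared mapping (MAP_SHARED) would instead propagate the change to the underlying file, which is the case some FUSE file systems handle poorly.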

> Separately, when I use the diamond view command on the output from the run, I'm getting an Input/output error. Does this similarly have something to do with using FUSE and mmap?

No, diamond view does not use mmap, so this has to be a separate issue. I have no idea at the moment what may be causing this.
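Since the cause is unknown, one way to narrow it down is to take the FUSE mount out of the write path: let the tool write to local disk and copy the finished file to the mount afterwards. The sketch below is an untested workaround idea, not a confirmed fix from this thread; `run_then_copy`, `make_cmd` and `scratch_dir` are hypothetical names introduced for the example.

```python
import os
import shutil
import subprocess
import tempfile


def run_then_copy(make_cmd, final_path, scratch_dir=None):
    """Run a command that writes to a local temp file, then copy it to final_path.

    make_cmd is a callable that receives the local output path and returns
    the argument list to execute. scratch_dir is an optional local directory
    for the temporary file; the system default is used when it is None.
    """
    with tempfile.TemporaryDirectory(dir=scratch_dir) as tmp:
        local_out = os.path.join(tmp, os.path.basename(final_path))
        # check=True raises CalledProcessError if the command fails.
        subprocess.run(make_cmd(local_out), check=True)
        # A single sequential copy onto the mount once the file is complete.
        shutil.copyfile(local_out, final_path)
    return final_path


# Hypothetical use with the command from this thread (paths are placeholders):
# run_then_copy(
#     lambda out: ["diamond", "view", "-a", daa_path, "--top", "5",
#                  "--out", out, "--outfmt", "6", "qseqid", "sseqid"],
#     final_path,
# )
```

If the command succeeds locally and only the copy fails, the problem lies with the mount rather than with diamond view. If it fails locally as well, the storage layer is probably not the cause.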
