DaehwanKimLab / centrifuge

Classifier for metagenomic sequences
GNU General Public License v3.0

centrifuge-inspect error "Assertion `false' failed" #179

Open joshua-theisen opened 5 years ago

joshua-theisen commented 5 years ago

I built a centrifuge database using the following command (as part of a Slurm script, if that matters):

make THREADS=24 IDX_NAME=bact_arch_hum COMPLETE_GENOMES=archaea COMPLETE_GENOMES=bacteria MAMMALIAN_TAXIDS=9606

It seemed to work, but the stderr (full stderr below) had these errors:

rmdir: failed to remove ‘tmp_bact_arch_hum’: Directory not empty
make: *** [bact_arch_hum.1.cf] Error 1

I tried to verify the index using centrifuge-inspect bact_arch_hum but got the following error:

centrifuge-inspect: word_io.h:142: uint32_t readU32(FILE*, bool): Assertion `false' failed.
Aborted (core dumped)

centrifuge-inspect -n bact_arch_hum gave:

assert_eq: expected (8, 0x8) got (0, 0x0)
word_io.h:315
centrifuge-inspect: word_io.h:315: index_t readIndex(std::istream&, bool) [with index_t = long unsigned int; std::istream = std::basic_istream<char>]: Assertion `0' failed.
Aborted (core dumped)

What do these errors mean?

Thanks,
Josh

stderr (grep -v Progress slurm.710054.err):

Downloading ftp://ftp.ncbi.nlm.nih.gov/genomes/refseq/bacteria/assembly_summary.txt ...
Downloading 14800 bacteria genomes at assembly level Complete Genome ... (will take a while)
Downloading ftp://ftp.ncbi.nlm.nih.gov/genomes/refseq/vertebrate_mammalian/assembly_summary.txt ...
Downloading 1 vertebrate_mammalian genomes at assembly level Chromosome ... (will take a while)
Downloading NCBI taxonomy ...

real    651m2.676s
user    4133m38.893s
sys     2084m2.603s
rmdir: failed to remove ‘tmp_bact_arch_hum’: Directory not empty
make: *** [bact_arch_hum.1.cf] Error 1

stdout:

```
$ cat slurm.710054.out
Fri Oct 4 14:33:08 CDT 2019
slurm settings: #SBATCH -p dragen2 --mem=80GB -c 12
calling centrifuge index Makefile with make THREADS=24 IDX_NAME=bact_arch_hum COMPLETE_GENOMES=archaea COMPLETE_GENOMES=bacteria MAMMALIAN_TAXIDS=9606
mkdir -p reference-sequences
[[ -d tmp_bact_arch_hum ]] && rm -rf tmp_bact_arch_hum; mkdir -p tmp_bact_arch_hum
Downloading and dust-masking bacteria
centrifuge-download -o tmp_bact_arch_hum -m -d "bacteria" -P 24 refseq > \
tmp_bact_arch_hum/all-bacteria.map
find tmp_bact_arch_hum/bacteria -name "*.fna" | xargs cat > reference-sequences/all-bacteria.fna.tmp && mv reference-sequences/all-bacteria.fna.tmp reference-sequences/all-bacteria.fna
mv tmp_bact_arch_hum/all-bacteria.map reference-sequences/all-bacteria.map
rm -rf tmp_bact_arch_hum
centrifuge-download -o tmp_bact_arch_hum -d "vertebrate_mammalian" -a "Chromosome" -t 9606 -c 'reference genome' -P 24 refseq > \
tmp_bact_arch_hum/mammalian-reference-9606.map
find tmp_bact_arch_hum/vertebrate_mammalian -name "*.fna" | xargs cat > reference-sequences/mammalian-reference-9606.fna.tmp && mv reference-sequences/mammalian-reference-9606.fna.tmp reference-sequences/mammalian-reference-9606.fna
mv tmp_bact_arch_hum/mammalian-reference-9606.map reference-sequences/mammalian-reference-9606.map
rm -rf tmp_bact_arch_hum
Found centrifuge-download and centrifuge-build.
[[ -d tmp_bact_arch_hum ]] && rm -rf tmp_bact_arch_hum; mkdir -p tmp_bact_arch_hum
centrifuge-download -o tmp_bact_arch_hum/taxonomy taxonomy nodes.dmp names.dmp
mkdir -p taxonomy
mv tmp_bact_arch_hum/taxonomy/* taxonomy && rmdir tmp_bact_arch_hum/taxonomy && rmdir tmp_bact_arch_hum
Index building prerequisites: reference-sequences/all-bacteria.fna reference-sequences/mammalian-reference-9606.fna reference-sequences/all-bacteria.map reference-sequences/mammalian-reference-9606.map taxonomy/nodes.dmp taxonomy/names.dmp
[[ -d tmp_bact_arch_hum ]] && rm -rf tmp_bact_arch_hum; mkdir -p tmp_bact_arch_hum
time centrifuge-build -p 24 \
--conversion-table <(cat reference-sequences/all-bacteria.map reference-sequences/mammalian-reference-9606.map) \
--taxonomy-tree taxonomy/nodes.dmp --name-table taxonomy/names.dmp \
reference-sequences/all-bacteria.fna,reference-sequences/mammalian-reference-9606.fna tmp_bact_arch_hum/bact_arch_hum 2>&1 | tee centrifuge-build-bact_arch_hum.log
Settings:
Output files: "tmp_bact_arch_hum/bact_arch_hum.*.cf"
Line rate: 7 (line is 128 bytes)
Lines per side: 1 (side is 128 bytes)
Offset rate: 4 (one in 16)
FTable chars: 10
Strings: unpacked
Local offset rate: 3 (one in 8)
Local fTable chars: 6
Max bucket size: default
Max bucket size, sqrt multiplier: default
Max bucket size, len divisor: 4
Difference-cover sample period: 1024
Endianness: little
Actual local endianness: little
Sanity checking: disabled
Assertions: disabled
Random seed: 0
Sizeofs: void*:8, int:4, long:8, size_t:8
Input files DNA, FASTA:
reference-sequences/all-bacteria.fna
reference-sequences/mammalian-reference-9606.fna
Reading reference sizes
Warning: Encountered reference sequence with only gaps
Time reading reference sizes: 00:04:41
Calculating joined length
Writing header
Reserving space for joined string
Joining reference sequences
Time to join reference sequences: 00:04:22
bmax according to bmaxDivN setting: 2464501935
Using parameters --bmax 1848376452 --dcv 1024
Doing ahead-of-time memory usage test
Passed! Constructing with these parameters: --bmax 1848376452 --dcv 1024
Constructing suffix-array element generator
Building DifferenceCoverSample
Building sPrime
Building sPrimeOrder
V-Sorting samples
V-Sorting samples time: 01:02:53
Allocating rank array
Ranking v-sort output
Ranking v-sort output time: 01:48:01
Invoking Larsson-Sadakane on ranks
Invoking Larsson-Sadakane on ranks time: 00:33:08
Sanity-checking and returning
Building samples
Reserving space for 68 sample suffixes
Generating random suffixes
QSorting 68 sample offsets, eliminating duplicates
QSorting sample offsets, eliminating duplicates time: 00:00:00
Multikey QSorting 68 samples
(Using difference cover)
Multikey QSorting samples time: 00:00:00
Calculating bucket sizes
Splitting and merging
Splitting and merging time: 00:00:00
Split 7, merged 28; iterating...
Splitting and merging
Splitting and merging time: 00:00:00
Split 3, merged 3; iterating...
Splitting and merging
Splitting and merging time: 00:00:00
Split 1, merged 1; iterating...
Splitting and merging
Splitting and merging time: 00:00:00
Split 1, merged 1; iterating...
Splitting and merging
Splitting and merging time: 00:00:00
Split 1, merged 1; iterating...
Avg bucket size: 1.27912e+09 (target: 1848376451) Converting suffix-array elements to index image Allocating ftab, absorbFtab Entering Ebwt loop Getting block 1 of 48 Reserving size (1848376452) for bucket 1 Calculating Z arrays for bucket 1 Entering block accumulator loop for bucket 1: Getting block 2 of 48 Reserving size (1848376452) for bucket 2 Getting block 3 of 48 Reserving size (1848376452) for bucket 3 Getting block 4 of 48 Reserving size (1848376452) for bucket 4 Calculating Z arrays for bucket 3 Calculating Z arrays for bucket 4 Calculating Z arrays for bucket 2 Getting block 5 of 48 Reserving size (1848376452) for bucket 5 Entering block accumulator loop for bucket 3: Calculating Z arrays for bucket 5 Entering block accumulator loop for bucket 2: Getting block 7 of 48 Reserving size (1848376452) for bucket 7 Entering block accumulator loop for bucket 5: Getting block 8 of 48 Reserving size (1848376452) for bucket 8 Entering block accumulator loop for bucket 4: Calculating Z arrays for bucket 8 Calculating Z arrays for bucket 7 Getting block 9 of 48 Reserving size (1848376452) for bucket 9 Entering block accumulator loop for bucket 8: Entering block accumulator loop for bucket 7: Getting block 6 of 48 Reserving size (1848376452) for bucket 6 Getting block 10 of 48 Reserving size (1848376452) for bucket 10 Getting block 11 of 48 Reserving size (1848376452) for bucket 11 Calculating Z arrays for bucket 9 Calculating Z arrays for bucket 6 Calculating Z arrays for bucket 11 Calculating Z arrays for bucket 10 Entering block accumulator loop for bucket 11: Entering block accumulator loop for bucket 9: Entering block accumulator loop for bucket 6: Getting block 12 of 48 Reserving size (1848376452) for bucket 12 Entering block accumulator loop for bucket 10: Calculating Z arrays for bucket 12 Getting block 14 of 48 Getting block 13 of 48 Getting block 15 of 48 Reserving size (1848376452) for bucket 15 Reserving size (1848376452) for bucket 14 Reserving size 
(1848376452) for bucket 13 Calculating Z arrays for bucket 14 Calculating Z arrays for bucket 13 Calculating Z arrays for bucket 15 Entering block accumulator loop for bucket 13: Entering block accumulator loop for bucket 14: Entering block accumulator loop for bucket 15: Entering block accumulator loop for bucket 12: Getting block 16 of 48 Reserving size (1848376452) for bucket 16 Getting block 17 of 48 Reserving size (1848376452) for bucket 17 Getting block 18 of 48 Reserving size (1848376452) for bucket 18 Calculating Z arrays for bucket 17 Calculating Z arrays for bucket 16 Entering block accumulator loop for bucket 17: Calculating Z arrays for bucket 18 Entering block accumulator loop for bucket 16: Entering block accumulator loop for bucket 18: Getting block 20 of 48 Reserving size (1848376452) for bucket 20 Getting block 19 of 48 Reserving size (1848376452) for bucket 19 Calculating Z arrays for bucket 19 Getting block 21 of 48 Calculating Z arrays for bucket 20 Reserving size (1848376452) for bucket 21 Getting block 22 of 48 Reserving size (1848376452) for bucket 22 Calculating Z arrays for bucket 21 Entering block accumulator loop for bucket 19: Calculating Z arrays for bucket 22 Entering block accumulator loop for bucket 20: Getting block 23 of 48 Reserving size (1848376452) for bucket 23 Entering block accumulator loop for bucket 21: Calculating Z arrays for bucket 23 Getting block 24 of 48 Reserving size (1848376452) for bucket 24 Entering block accumulator loop for bucket 22: Calculating Z arrays for bucket 24 Entering block accumulator loop for bucket 24: Entering block accumulator loop for bucket 23: bucket 1: 10% bucket 2: 10% bucket 3: 10% bucket 5: 10% bucket 4: 10% bucket 9: 10% bucket 6: 10% bucket 7: 10% bucket 10: 10% bucket 8: 10% bucket 14: 10% bucket 12: 10% bucket 16: 10% bucket 13: 10% bucket 15: 10% bucket 19: 10% bucket 17: 10% bucket 23: 10% bucket 11: 10% bucket 21: 10% bucket 20: 10% bucket 22: 10% bucket 18: 10% bucket 24: 10% 
bucket 2: 20% bucket 1: 20% bucket 3: 20% bucket 5: 20% bucket 9: 20% bucket 4: 20% bucket 6: 20% bucket 7: 20% bucket 10: 20% bucket 8: 20% bucket 14: 20% bucket 12: 20% bucket 16: 20% bucket 13: 20% bucket 15: 20% bucket 11: 20% bucket 19: 20% bucket 17: 20% bucket 23: 20% bucket 21: 20% bucket 20: 20% bucket 22: 20% bucket 18: 20% bucket 24: 20% bucket 1: 30% bucket 2: 30% bucket 6: 30% bucket 3: 30% bucket 4: 30% bucket 9: 30% bucket 7: 30% bucket 5: 30% bucket 10: 30% bucket 8: 30% bucket 14: 30% bucket 12: 30% bucket 16: 30% bucket 13: 30% bucket 11: 30% bucket 15: 30% bucket 17: 30% bucket 19: 30% bucket 23: 30% bucket 21: 30% bucket 20: 30% bucket 22: 30% bucket 18: 30% bucket 24: 30% bucket 3: 40% bucket 8: 40% bucket 1: 40% bucket 10: 40% bucket 7: 40% bucket 4: 40% bucket 5: 40% bucket 9: 40% bucket 2: 40% bucket 14: 40% bucket 6: 40% bucket 11: 40% bucket 16: 40% bucket 12: 40% bucket 13: 40% bucket 15: 40% bucket 17: 40% bucket 23: 40% bucket 19: 40% bucket 21: 40% bucket 22: 40% bucket 20: 40% bucket 24: 40% bucket 18: 40% bucket 16: 50% bucket 9: 50% bucket 7: 50% bucket 10: 50% bucket 12: 50% bucket 13: 50% bucket 11: 50% bucket 8: 50% bucket 5: 50% bucket 6: 50% bucket 14: 50% bucket 4: 50% bucket 3: 50% bucket 2: 50% bucket 1: 50% bucket 15: 50% bucket 17: 50% bucket 19: 50% bucket 23: 50% bucket 21: 50% bucket 22: 50% bucket 20: 50% bucket 18: 50% bucket 24: 50% bucket 1: 60% bucket 10: 60% bucket 3: 60% bucket 6: 60% bucket 19: 60% bucket 9: 60% bucket 17: 60% bucket 8: 60% bucket 5: 60% bucket 16: 60% bucket 2: 60% bucket 7: 60% bucket 4: 60% bucket 15: 60% bucket 13: 60% bucket 14: 60% bucket 12: 60% bucket 11: 60% bucket 23: 60% bucket 21: 60% bucket 20: 60% bucket 22: 60% bucket 18: 60% bucket 24: 60% bucket 1: 70% bucket 19: 70% bucket 12: 70% bucket 9: 70% bucket 10: 70% bucket 17: 70% bucket 2: 70% bucket 11: 70% bucket 16: 70% bucket 14: 70% bucket 7: 70% bucket 3: 70% bucket 5: 70% bucket 15: 70% bucket 4: 70% bucket 23: 70% bucket 13: 
70% bucket 8: 70% bucket 6: 70% bucket 22: 70% bucket 21: 70% bucket 18: 70% bucket 20: 70% bucket 24: 70% bucket 2: 80% bucket 14: 80% bucket 1: 80% bucket 10: 80% bucket 6: 80% bucket 17: 80% bucket 7: 80% bucket 3: 80% bucket 5: 80% bucket 16: 80% bucket 19: 80% bucket 23: 80% bucket 22: 80% bucket 12: 80% bucket 8: 80% bucket 11: 80% bucket 4: 80% bucket 15: 80% bucket 9: 80% bucket 18: 80% bucket 21: 80% bucket 20: 80% bucket 13: 80% bucket 24: 80% bucket 9: 90% bucket 7: 90% bucket 11: 90% bucket 3: 90% bucket 16: 90% bucket 19: 90% bucket 4: 90% bucket 8: 90% bucket 2: 90% bucket 13: 90% bucket 12: 90% bucket 23: 90% bucket 6: 90% bucket 10: 90% bucket 14: 90% bucket 15: 90% bucket 21: 90% bucket 22: 90% bucket 5: 90% bucket 1: 90% bucket 17: 90% bucket 18: 90% bucket 20: 90% bucket 24: 90% bucket 2: 100% bucket 23: 100% bucket 11: 100% bucket 5: 100% bucket 21: 100% bucket 18: 100% bucket 17: 100% Sorting block of length 1043306676 for bucket 11 (Using difference cover) bucket 6: 100% bucket 13: 100% bucket 3: 100% bucket 9: 100% bucket 1: 100% Sorting block of length 1799179557 for bucket 18 (Using difference cover) Sorting block of length 1131640639 for bucket 1 Sorting block of length 1179649060 for bucket 3 (Using difference cover) (Using difference cover) bucket 15: 100% bucket 22: 100% Sorting block of length 1189174664 for bucket 21 (Using difference cover) Sorting block of length 1260059511 for bucket 6 (Using difference cover) bucket 24: 100% Sorting block of length 1613245191 for bucket 22 (Using difference cover) bucket 7: 100% Sorting block of length 1788437355 for bucket 2 (Using difference cover) bucket 10: 100% Sorting block of length 1090342388 for bucket 9 (Using difference cover) bucket 19: 100% Sorting block of length 1511804208 for bucket 10 (Using difference cover) bucket 12: 100% bucket 14: 100% Sorting block of length 1306401358 for bucket 13 Sorting block of length 1161258863 for bucket 23 (Using difference cover) (Using difference 
cover) bucket 16: 100% Sorting block of length 1202439757 for bucket 17 (Using difference cover) Sorting block of length 805472944 for bucket 5 (Using difference cover) Sorting block of length 1281570120 for bucket 12 (Using difference cover) Sorting block of length 1243472121 for bucket 15 (Using difference cover) Sorting block of length 1007854714 for bucket 16 (Using difference cover) Sorting block of length 1804181805 for bucket 7 (Using difference cover) Sorting block of length 1747380289 for bucket 24 (Using difference cover) bucket 8: 100% Sorting block of length 679979403 for bucket 14 (Using difference cover) bucket 4: 100% bucket 20: 100% Sorting block of length 1685983021 for bucket 8 (Using difference cover) Sorting block of length 786794844 for bucket 19 (Using difference cover) Sorting block of length 1661014495 for bucket 4 (Using difference cover) Sorting block of length 1514459760 for bucket 20 (Using difference cover) Sorting block time: 04:11:49 Returning block of 679979404 for bucket 14 Getting block 25 of 48 Reserving size (1848376452) for bucket 25 Calculating Z arrays for bucket 25 Entering block accumulator loop for bucket 25: bucket 25: 10% bucket 25: 20% bucket 25: 30% Sorting block time: 04:28:43 Returning block of 786794845 for bucket 19 bucket 25: 40% bucket 25: 50% Getting block 26 of 48 Reserving size (1848376452) for bucket 26 Calculating Z arrays for bucket 26 Entering block accumulator loop for bucket 26: Sorting block time: 04:34:06 Returning block of 805472945 for bucket 5 bucket 25: 60% bucket 26: 10% bucket 25: 70% bucket 26: 20% Getting block 27 of 48 Reserving size (1848376452) for bucket 27 Calculating Z arrays for bucket 27 Entering block accumulator loop for bucket 27: bucket 26: 30% bucket 25: 80% bucket 27: 10% bucket 26: 40% bucket 25: 90% bucket 27: 20% bucket 26: 50% bucket 25: 100% Sorting block of length 1638591599 for bucket 25 (Using difference cover) bucket 27: 30% bucket 26: 60% bucket 27: 40% bucket 26: 70% 
bucket 27: 50% bucket 26: 80% bucket 26: 90% bucket 27: 60% bucket 26: 100% Sorting block of length 448589774 for bucket 26 (Using difference cover) bucket 27: 70% bucket 27: 80% bucket 27: 90% bucket 27: 100% Sorting block of length 647188665 for bucket 27 (Using difference cover) Sorting block time: 05:24:50 Returning block of 1007854715 for bucket 16 Getting block 28 of 48 Reserving size (1848376452) for bucket 28 Calculating Z arrays for bucket 28 Entering block accumulator loop for bucket 28: bucket 28: 10% bucket 28: 20% bucket 28: 30% bucket 28: 40% bucket 28: 50% bucket 28: 60% bucket 28: 70% Sorting block time: 05:54:42 Returning block of 1043306677 for bucket 11 bucket 28: 80% mv tmp_bact_arch_hum/bact_arch_hum.*.cf . && rmdir tmp_bact_arch_hum call to centrifuge index Makefile with make THREADS=24 IDX_NAME=bact_arch_hum COMPLETE_GENOMES=archaea COMPLETE_GENOMES=bacteria MAMMALIAN_TAXIDS=9606 completed Sat Oct 5 01:55:31 CDT 2019 ```

ghost commented 5 years ago

> It seemed to work

Did you get your task done?


> centrifuge-inspect -n bact_arch_hum gave:

The word_io.h:315 in the error refers to https://github.com/DaehwanKimLab/centrifuge/blob/master/word_io.h#L315

> assert_eq: expected (8, 0x8) got (0, 0x0)

The program expected the two values to be equal, but they were not: it expected 8 and got 0.
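One plausible reading, offered as a guess and not as a description of the actual centrifuge code, is that the two numbers are "bytes expected" and "bytes actually read", so a reader asking for an 8-byte value from an empty file would report 8 versus 0. The sketch below imitates that kind of check in Python; `read_index` and its message format are hypothetical and only mimic the shape of the reported error.

```python
import io
import struct


def read_index(stream, width=8):
    """Read one little-endian unsigned integer of `width` bytes (4 or 8)."""
    data = stream.read(width)
    if len(data) != width:
        # Hypothetical message, shaped like the one in the report:
        # first the byte count we wanted, then the byte count we got.
        raise ValueError(
            f"expected ({width}, {width:#x}) got ({len(data)}, {len(data):#x})"
        )
    fmt = {4: "<I", 8: "<Q"}[width]
    return struct.unpack(fmt, data)[0]


# A stream with no bytes in it stands in for a zero-byte index file.
try:
    read_index(io.BytesIO(b""))
    message = None
except ValueError as err:
    message = str(err)

# A stream that does hold 8 bytes is read without complaint.
value = read_index(io.BytesIO(struct.pack("<Q", 42)))
```

If that reading is right, the assertion says more about the file being short or empty than about a bug in the reader itself.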


This code is hard to understand; it seems it was not designed to be read by anyone other than the authors, perhaps due to constraints. I don't know what the coder wanted to do. The hard-to-understand part reminds me of Python codebases.

After a second look, I think I will need to read multiple files to get a better picture of it. It's easy to understand within the local blocks.

ghost commented 5 years ago

> rmdir: failed to remove ‘tmp_bact_arch_hum’: Directory not empty
> make: *** [bact_arch_hum.1.cf] Error 1

I don't know which part of the code is relevant here, but a directory that should have been empty was not, so it wasn't deleted as it normally would be.
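One general shell behaviour may be relevant, given that the posted stdout shows the build being run as `centrifuge-build ... 2>&1 | tee centrifuge-build-bact_arch_hum.log`: by default a pipeline's exit status is that of its last command, so a failure on the left side of `| tee` can go unnoticed by make. Whether that is what happened here is an assumption. The sketch below only demonstrates the shell rule, using `false | cat` as a stand-in for the real commands, and assumes `bash` is installed.

```python
import subprocess


def pipeline_status(script):
    """Run a shell snippet with bash and return its exit status."""
    return subprocess.run(["bash", "-c", script]).returncode


# `false` always fails, but the pipeline reports the status of `cat`.
masked = pipeline_status("false | cat")

# With pipefail, a failure anywhere in the pipeline is reported.
visible = pipeline_status("set -o pipefail; false | cat")
```

Here `masked` is 0 and `visible` is 1. If the build step died early, the first error make would see is the later `rmdir`, which matches the order of messages in the report.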


This seems to be the main file: https://github.com/DaehwanKimLab/centrifuge/blob/master/centrifuge. It should be named centrifuge.pl, with the usual extension for Perl files.

This file is easier to read. It's where the inputs to the command are processed. Temporary files are used, and the last commit changed the number of child processes to 1.

Also, using Perl or Python or any scripting language to do this is, in my opinion, a bit reckless. Maybe consider https://golang.org/ for concurrency; some people like its channels, and it is easier to install than Perl.


That's all.


@trip23app

joshua-theisen commented 5 years ago

@trip23app, thanks for digging into those errors.

When I said "it seemed to work", I meant that I didn't get a "job failed" message from slurm. The task was building the index. I ran centrifuge-inspect bact_arch_hum and centrifuge-inspect -n bact_arch_hum to test whether the index was built correctly, but those tests failed, so I don't know if the index was built correctly.

Here are the contents of the directory where I tried to build the index:

$ ls -lh
-rw-rw-r-- 1 user.name root  729M Oct  4 15:13 bact_arch_hum.1.cf
-rw-rw-r-- 1 user.name root     0 Oct  4 15:09 bact_arch_hum.2.cf
-rw-rw-r-- 1 user.name root  1.3M Oct  4 15:13 bact_arch_hum.3.cf
-rw-rw-r-- 1 user.name root     0 Oct  4 18:55 bact_arch_hum.4.cf
-rw-rw-r-- 1 user.name root   14K Oct  5 01:11 centrifuge-build-bact_arch_hum.log
-rw-rw-r-- 1 user.name root   13K Oct  4 13:53 Makefile
drwxrwsr-x 2 user.name root  4.0K Oct  4 15:04 reference-sequences
drwxrwsr-x 2 user.name root  4.0K Oct  4 15:04 taxonomy
drwxrwsr-x 2 user.name root  4.0K Oct  5 01:55 tmp_bact_arch_hum

The zero-byte bact_arch_hum.2.cf and bact_arch_hum.4.cf files make me think the index was not built correctly. Also, the centrifuge-build-bact_arch_hum.log file (below) ends with "Returning block of 1043306677 for bucket 11" followed by "bucket 28: 80%", which makes me think it wasn't done when it exited.
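A quick pre-check along these lines could be run before centrifuge-inspect. It is only a sketch: `empty_index_files` is a made-up helper, and the `<prefix>.*.cf` naming is taken from the listing above, not from any documented guarantee. The demo uses stand-in files in a temporary directory.

```python
import tempfile
from pathlib import Path


def empty_index_files(directory, prefix):
    """Return the names of <prefix>.*.cf files in `directory` that are 0 bytes."""
    candidates = Path(directory).glob(f"{prefix}.*.cf")
    return sorted(p.name for p in candidates if p.stat().st_size == 0)


# Demo with stand-in files; real index files would be far larger.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "demo.1.cf").write_bytes(b"\x01")
    (Path(tmp) / "demo.2.cf").write_bytes(b"")
    demo_result = empty_index_files(tmp, "demo")
```

In the directory listed above, `empty_index_files(".", "bact_arch_hum")` would be expected to name the .2.cf and .4.cf files.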

Here is the full log:

```
$ cat centrifuge-build-bact_arch_hum.log
Settings:
Output files: "tmp_bact_arch_hum/bact_arch_hum.*.cf"
Line rate: 7 (line is 128 bytes)
Lines per side: 1 (side is 128 bytes)
Offset rate: 4 (one in 16)
FTable chars: 10
Strings: unpacked
Local offset rate: 3 (one in 8)
Local fTable chars: 6
Max bucket size: default
Max bucket size, sqrt multiplier: default
Max bucket size, len divisor: 4
Difference-cover sample period: 1024
Endianness: little
Actual local endianness: little
Sanity checking: disabled
Assertions: disabled
Random seed: 0
Sizeofs: void*:8, int:4, long:8, size_t:8
Input files DNA, FASTA:
reference-sequences/all-bacteria.fna
reference-sequences/mammalian-reference-9606.fna
Reading reference sizes
Warning: Encountered reference sequence with only gaps
Time reading reference sizes: 00:04:41
Calculating joined length
Writing header
Reserving space for joined string
Joining reference sequences
Time to join reference sequences: 00:04:22
bmax according to bmaxDivN setting: 2464501935
Using parameters --bmax 1848376452 --dcv 1024
Doing ahead-of-time memory usage test
Passed! Constructing with these parameters: --bmax 1848376452 --dcv 1024
Constructing suffix-array element generator
Building DifferenceCoverSample
Building sPrime
Building sPrimeOrder
V-Sorting samples
V-Sorting samples time: 01:02:53
Allocating rank array
Ranking v-sort output
Ranking v-sort output time: 01:48:01
Invoking Larsson-Sadakane on ranks
Invoking Larsson-Sadakane on ranks time: 00:33:08
Sanity-checking and returning
Building samples
Reserving space for 68 sample suffixes
Generating random suffixes
QSorting 68 sample offsets, eliminating duplicates
QSorting sample offsets, eliminating duplicates time: 00:00:00
Multikey QSorting 68 samples
(Using difference cover)
Multikey QSorting samples time: 00:00:00
Calculating bucket sizes
Splitting and merging
Splitting and merging time: 00:00:00
Split 7, merged 28; iterating...
Splitting and merging Splitting and merging time: 00:00:00 Split 3, merged 3; iterating... Splitting and merging Splitting and merging time: 00:00:00 Split 1, merged 1; iterating... Splitting and merging Splitting and merging time: 00:00:00 Split 1, merged 1; iterating... Splitting and merging Splitting and merging time: 00:00:00 Split 1, merged 1; iterating... Avg bucket size: 1.27912e+09 (target: 1848376451) Converting suffix-array elements to index image Allocating ftab, absorbFtab Entering Ebwt loop Getting block 1 of 48 Reserving size (1848376452) for bucket 1 Calculating Z arrays for bucket 1 Entering block accumulator loop for bucket 1: Getting block 2 of 48 Reserving size (1848376452) for bucket 2 Getting block 3 of 48 Reserving size (1848376452) for bucket 3 Getting block 4 of 48 Reserving size (1848376452) for bucket 4 Calculating Z arrays for bucket 3 Calculating Z arrays for bucket 4 Calculating Z arrays for bucket 2 Getting block 5 of 48 Reserving size (1848376452) for bucket 5 Entering block accumulator loop for bucket 3: Calculating Z arrays for bucket 5 Entering block accumulator loop for bucket 2: Getting block 7 of 48 Reserving size (1848376452) for bucket 7 Entering block accumulator loop for bucket 5: Getting block 8 of 48 Reserving size (1848376452) for bucket 8 Entering block accumulator loop for bucket 4: Calculating Z arrays for bucket 8 Calculating Z arrays for bucket 7 Getting block 9 of 48 Reserving size (1848376452) for bucket 9 Entering block accumulator loop for bucket 8: Entering block accumulator loop for bucket 7: Getting block 6 of 48 Reserving size (1848376452) for bucket 6 Getting block 10 of 48 Reserving size (1848376452) for bucket 10 Getting block 11 of 48 Reserving size (1848376452) for bucket 11 Calculating Z arrays for bucket 9 Calculating Z arrays for bucket 6 Calculating Z arrays for bucket 11 Calculating Z arrays for bucket 10 Entering block accumulator loop for bucket 11: Entering block accumulator loop for bucket 9: 
Entering block accumulator loop for bucket 6: Getting block 12 of 48 Reserving size (1848376452) for bucket 12 Entering block accumulator loop for bucket 10: Calculating Z arrays for bucket 12 Getting block 14 of 48 Getting block 13 of 48 Getting block 15 of 48 Reserving size (1848376452) for bucket 15 Reserving size (1848376452) for bucket 14 Reserving size (1848376452) for bucket 13 Calculating Z arrays for bucket 14 Calculating Z arrays for bucket 13 Calculating Z arrays for bucket 15 Entering block accumulator loop for bucket 13: Entering block accumulator loop for bucket 14: Entering block accumulator loop for bucket 15: Entering block accumulator loop for bucket 12: Getting block 16 of 48 Reserving size (1848376452) for bucket 16 Getting block 17 of 48 Reserving size (1848376452) for bucket 17 Getting block 18 of 48 Reserving size (1848376452) for bucket 18 Calculating Z arrays for bucket 17 Calculating Z arrays for bucket 16 Entering block accumulator loop for bucket 17: Calculating Z arrays for bucket 18 Entering block accumulator loop for bucket 16: Entering block accumulator loop for bucket 18: Getting block 20 of 48 Reserving size (1848376452) for bucket 20 Getting block 19 of 48 Reserving size (1848376452) for bucket 19 Calculating Z arrays for bucket 19 Getting block 21 of 48 Calculating Z arrays for bucket 20 Reserving size (1848376452) for bucket 21 Getting block 22 of 48 Reserving size (1848376452) for bucket 22 Calculating Z arrays for bucket 21 Entering block accumulator loop for bucket 19: Calculating Z arrays for bucket 22 Entering block accumulator loop for bucket 20: Getting block 23 of 48 Reserving size (1848376452) for bucket 23 Entering block accumulator loop for bucket 21: Calculating Z arrays for bucket 23 Getting block 24 of 48 Reserving size (1848376452) for bucket 24 Entering block accumulator loop for bucket 22: Calculating Z arrays for bucket 24 Entering block accumulator loop for bucket 24: Entering block accumulator loop for bucket 
23: bucket 1: 10% bucket 2: 10% bucket 3: 10% bucket 5: 10% bucket 4: 10% bucket 9: 10% bucket 6: 10% bucket 7: 10% bucket 10: 10% bucket 8: 10% bucket 14: 10% bucket 12: 10% bucket 16: 10% bucket 13: 10% bucket 15: 10% bucket 19: 10% bucket 17: 10% bucket 23: 10% bucket 11: 10% bucket 21: 10% bucket 20: 10% bucket 22: 10% bucket 18: 10% bucket 24: 10% bucket 2: 20% bucket 1: 20% bucket 3: 20% bucket 5: 20% bucket 9: 20% bucket 4: 20% bucket 6: 20% bucket 7: 20% bucket 10: 20% bucket 8: 20% bucket 14: 20% bucket 12: 20% bucket 16: 20% bucket 13: 20% bucket 15: 20% bucket 11: 20% bucket 19: 20% bucket 17: 20% bucket 23: 20% bucket 21: 20% bucket 20: 20% bucket 22: 20% bucket 18: 20% bucket 24: 20% bucket 1: 30% bucket 2: 30% bucket 6: 30% bucket 3: 30% bucket 4: 30% bucket 9: 30% bucket 7: 30% bucket 5: 30% bucket 10: 30% bucket 8: 30% bucket 14: 30% bucket 12: 30% bucket 16: 30% bucket 13: 30% bucket 11: 30% bucket 15: 30% bucket 17: 30% bucket 19: 30% bucket 23: 30% bucket 21: 30% bucket 20: 30% bucket 22: 30% bucket 18: 30% bucket 24: 30% bucket 3: 40% bucket 8: 40% bucket 1: 40% bucket 10: 40% bucket 7: 40% bucket 4: 40% bucket 5: 40% bucket 9: 40% bucket 2: 40% bucket 14: 40% bucket 6: 40% bucket 11: 40% bucket 16: 40% bucket 12: 40% bucket 13: 40% bucket 15: 40% bucket 17: 40% bucket 23: 40% bucket 19: 40% bucket 21: 40% bucket 22: 40% bucket 20: 40% bucket 24: 40% bucket 18: 40% bucket 16: 50% bucket 9: 50% bucket 7: 50% bucket 10: 50% bucket 12: 50% bucket 13: 50% bucket 11: 50% bucket 8: 50% bucket 5: 50% bucket 6: 50% bucket 14: 50% bucket 4: 50% bucket 3: 50% bucket 2: 50% bucket 1: 50% bucket 15: 50% bucket 17: 50% bucket 19: 50% bucket 23: 50% bucket 21: 50% bucket 22: 50% bucket 20: 50% bucket 18: 50% bucket 24: 50% bucket 1: 60% bucket 10: 60% bucket 3: 60% bucket 6: 60% bucket 19: 60% bucket 9: 60% bucket 17: 60% bucket 8: 60% bucket 5: 60% bucket 16: 60% bucket 2: 60% bucket 7: 60% bucket 4: 60% bucket 15: 60% bucket 13: 60% bucket 14: 60% bucket 
12: 60% bucket 11: 60% bucket 23: 60% bucket 21: 60% bucket 20: 60% bucket 22: 60% bucket 18: 60% bucket 24: 60% bucket 1: 70% bucket 19: 70% bucket 12: 70% bucket 9: 70% bucket 10: 70% bucket 17: 70% bucket 2: 70% bucket 11: 70% bucket 16: 70% bucket 14: 70% bucket 7: 70% bucket 3: 70% bucket 5: 70% bucket 15: 70% bucket 4: 70% bucket 23: 70% bucket 13: 70% bucket 8: 70% bucket 6: 70% bucket 22: 70% bucket 21: 70% bucket 18: 70% bucket 20: 70% bucket 24: 70% bucket 2: 80% bucket 14: 80% bucket 1: 80% bucket 10: 80% bucket 6: 80% bucket 17: 80% bucket 7: 80% bucket 3: 80% bucket 5: 80% bucket 16: 80% bucket 19: 80% bucket 23: 80% bucket 22: 80% bucket 12: 80% bucket 8: 80% bucket 11: 80% bucket 4: 80% bucket 15: 80% bucket 9: 80% bucket 18: 80% bucket 21: 80% bucket 20: 80% bucket 13: 80% bucket 24: 80% bucket 9: 90% bucket 7: 90% bucket 11: 90% bucket 3: 90% bucket 16: 90% bucket 19: 90% bucket 4: 90% bucket 8: 90% bucket 2: 90% bucket 13: 90% bucket 12: 90% bucket 23: 90% bucket 6: 90% bucket 10: 90% bucket 14: 90% bucket 15: 90% bucket 21: 90% bucket 22: 90% bucket 5: 90% bucket 1: 90% bucket 17: 90% bucket 18: 90% bucket 20: 90% bucket 24: 90% bucket 2: 100% bucket 23: 100% bucket 11: 100% bucket 5: 100% bucket 21: 100% bucket 18: 100% bucket 17: 100% Sorting block of length 1043306676 for bucket 11 (Using difference cover) bucket 6: 100% bucket 13: 100% bucket 3: 100% bucket 9: 100% bucket 1: 100% Sorting block of length 1799179557 for bucket 18 (Using difference cover) Sorting block of length 1131640639 for bucket 1 Sorting block of length 1179649060 for bucket 3 (Using difference cover) (Using difference cover) bucket 15: 100% bucket 22: 100% Sorting block of length 1189174664 for bucket 21 (Using difference cover) Sorting block of length 1260059511 for bucket 6 (Using difference cover) bucket 24: 100% Sorting block of length 1613245191 for bucket 22 (Using difference cover) bucket 7: 100% Sorting block of length 1788437355 for bucket 2 (Using difference 
cover) bucket 10: 100% Sorting block of length 1090342388 for bucket 9 (Using difference cover) bucket 19: 100% Sorting block of length 1511804208 for bucket 10 (Using difference cover) bucket 12: 100% bucket 14: 100% Sorting block of length 1306401358 for bucket 13 Sorting block of length 1161258863 for bucket 23 (Using difference cover) (Using difference cover) bucket 16: 100% Sorting block of length 1202439757 for bucket 17 (Using difference cover) Sorting block of length 805472944 for bucket 5 (Using difference cover) Sorting block of length 1281570120 for bucket 12 (Using difference cover) Sorting block of length 1243472121 for bucket 15 (Using difference cover) Sorting block of length 1007854714 for bucket 16 (Using difference cover) Sorting block of length 1804181805 for bucket 7 (Using difference cover) Sorting block of length 1747380289 for bucket 24 (Using difference cover) bucket 8: 100% Sorting block of length 679979403 for bucket 14 (Using difference cover) bucket 4: 100% bucket 20: 100% Sorting block of length 1685983021 for bucket 8 (Using difference cover) Sorting block of length 786794844 for bucket 19 (Using difference cover) Sorting block of length 1661014495 for bucket 4 (Using difference cover) Sorting block of length 1514459760 for bucket 20 (Using difference cover) Sorting block time: 04:11:49 Returning block of 679979404 for bucket 14 Getting block 25 of 48 Reserving size (1848376452) for bucket 25 Calculating Z arrays for bucket 25 Entering block accumulator loop for bucket 25: bucket 25: 10% bucket 25: 20% bucket 25: 30% Sorting block time: 04:28:43 Returning block of 786794845 for bucket 19 bucket 25: 40% bucket 25: 50% Getting block 26 of 48 Reserving size (1848376452) for bucket 26 Calculating Z arrays for bucket 26 Entering block accumulator loop for bucket 26: Sorting block time: 04:34:06 Returning block of 805472945 for bucket 5 bucket 25: 60% bucket 26: 10% bucket 25: 70% bucket 26: 20% Getting block 27 of 48 Reserving size 
(1848376452) for bucket 27 Calculating Z arrays for bucket 27 Entering block accumulator loop for bucket 27: bucket 26: 30% bucket 25: 80% bucket 27: 10% bucket 26: 40% bucket 25: 90% bucket 27: 20% bucket 26: 50% bucket 25: 100% Sorting block of length 1638591599 for bucket 25 (Using difference cover) bucket 27: 30% bucket 26: 60% bucket 27: 40% bucket 26: 70% bucket 27: 50% bucket 26: 80% bucket 26: 90% bucket 27: 60% bucket 26: 100% Sorting block of length 448589774 for bucket 26 (Using difference cover) bucket 27: 70% bucket 27: 80% bucket 27: 90% bucket 27: 100% Sorting block of length 647188665 for bucket 27 (Using difference cover) Sorting block time: 05:24:50 Returning block of 1007854715 for bucket 16 Getting block 28 of 48 Reserving size (1848376452) for bucket 28 Calculating Z arrays for bucket 28 Entering block accumulator loop for bucket 28: bucket 28: 10% bucket 28: 20% bucket 28: 30% bucket 28: 40% bucket 28: 50% bucket 28: 60% bucket 28: 70% Sorting block time: 05:54:42 Returning block of 1043306677 for bucket 11 bucket 28: 80% ```
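To put a number on "it wasn't done", the log can be summarised by counting the "Getting block N of M" and "Returning block ... for bucket N" messages. The sketch below does that with regular expressions. The function name is made up, the patterns are copied from the log text above, and treating "Returning block" as "bucket finished sorting" is my assumption.

```python
import re

GETTING = re.compile(r"Getting block (\d+) of (\d+)")
RETURNING = re.compile(r"Returning block of \d+ for bucket (\d+)")


def summarize_build_log(text):
    """Count blocks started and returned in centrifuge-build style output."""
    started = {int(m.group(1)) for m in GETTING.finditer(text)}
    totals = {int(m.group(2)) for m in GETTING.finditer(text)}
    returned = {int(m.group(1)) for m in RETURNING.finditer(text)}
    return {
        "total": max(totals) if totals else None,
        "started": len(started),
        "returned": sorted(returned),
    }


# Short excerpt in the same shape as the log above.
sample = (
    "Getting block 1 of 48 Reserving size (1848376452) for bucket 1 "
    "Sorting block time: 04:11:49 Returning block of 679979404 for bucket 14 "
    "Getting block 25 of 48 Reserving size (1848376452) for bucket 25"
)
summary = summarize_build_log(sample)
```

Run over the full log file, this should report 48 blocks in total, 28 started, and only buckets 5, 11, 14, 16 and 19 returned, which fits a build that stopped partway through.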

ghost commented 5 years ago

Looking at https://github.com/DaehwanKimLab/centrifuge/blob/master/word_io.h#L27 leads to the assert functions at https://github.com/DaehwanKimLab/centrifuge/blob/master/assert_helpers.h#L71

https://github.com/DaehwanKimLab/centrifuge/blob/master/word_io.h#L28 leads to https://github.com/DaehwanKimLab/centrifuge/blob/master/endian_swap.h


> The 0 kb bact_arch_hum.*.cf files make me think the index was not built correctly. Also, the centrifuge-build-bact_arch_hum.log file (below) ends with Returning block of 1043306677 for bucket 11 bucket 28: 80% which makes me think it wasn't done when it exited.

I agree. But I can't figure out exactly what went wrong or the chain of events.

This snippet will help with readability for really long log outputs: https://gist.github.com/joyrexus/16041f2426450e73f5df9391f7f7ae5f. Currently it takes a long time to scroll down.


> Aborted (core dumped)

I almost forgot about the core dumps. For using gdb to read one, please see https://stackoverflow.com/questions/8305866/how-do-i-analyze-a-programs-core-dump-file-with-gdb-when-it-has-command-line-pa

Or just paste the core dump to provide debug info, unless it gets deleted by the program; see https://stackoverflow.com/questions/2065912/core-dumped-but-core-file-is-not-in-the-current-directory.
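If the core file does not turn up in the working directory, two settings are worth checking on Linux: the process's core size limit and the kernel's core_pattern. The sketch below reads both. It assumes a Unix system where Python's `resource` module is available; `core_dump_settings` is a made-up helper name, and the /proc path exists only on Linux.

```python
import resource
from pathlib import Path


def core_dump_settings():
    """Report the core size limits and, on Linux, the kernel core pattern."""
    soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    pattern = None
    try:
        # Linux only; missing or unreadable elsewhere.
        pattern = Path("/proc/sys/kernel/core_pattern").read_text().strip()
    except OSError:
        pass
    return {"soft_limit": soft, "hard_limit": hard, "core_pattern": pattern}


settings = core_dump_settings()
```

A `core_pattern` that begins with `|` means cores are piped to a handler program and are not written next to the executable, which is one common reason the file seems to be missing. On a cluster the values can differ between the login node and the compute node, so it may be worth running this inside the job.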


I'm not an expert at any of this and may stop responding abruptly. So, if you or anyone else wants to tackle this, here's a quick reference to get you started:

Some of the source code files refer to Bowtie 2: https://github.com/BenLangmead/bowtie2.

You can try to repeat the build a second time to see if it still doesn't work. Reproducible bugs are easier to investigate.

This is all I can do. All the best.


@trip23app

joshua-theisen commented 5 years ago

@trip23app, thanks for the help. That snippet makes things a lot more readable. The core dump file is not in the working directory; I'm searching for it.