marbl / metAMOS

A metagenomic and isolate assembly and analysis pipeline built with AMOS
http://marbl.github.io/metAMOS

Metamos pipeline aborted at mapreads #159

Closed jvollme closed 10 years ago

jvollme commented 10 years ago

Hi, my metAMOS pipeline aborted at the mapreads step. The error message states that the input for mapreads contained too many (more than 2³²) characters. I checked the intermediate assembly file (soapdenovo.31.asm.contig) and found that it does indeed contain 6,105,647,921 sequence characters across all contigs combined.
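For anyone who wants to repeat this check, here is a minimal Python sketch that totals the sequence characters in a FASTA file and compares the result with the 2^32-1 limit quoted in the bowtie-build error. The function names and the in-memory demo input are illustrative assumptions; nothing here is part of metAMOS or Bowtie.

```python
import io

# Limit quoted in the bowtie-build error message (2^32-1 characters).
BOWTIE_MAX_CHARS = 2**32 - 1


def count_fasta_bases(handle):
    """Return the total number of sequence characters in a FASTA stream."""
    total = 0
    for line in handle:
        # Header lines start with '>' and do not count as sequence.
        if not line.startswith(">"):
            total += len(line.strip())
    return total


def exceeds_bowtie_limit(total_bases, limit=BOWTIE_MAX_CHARS):
    """True if an index over total_bases characters would hit the limit."""
    return total_bases > limit


# Tiny in-memory demo; for a real assembly, pass an open file handle instead.
demo = io.StringIO(">contig1\nACGT\nAC\n>contig2\nGGG\n")
demo_total = count_fasta_bases(demo)
print(demo_total, exceeds_bowtie_limit(demo_total))
```

The file is read line by line, so memory use stays small even for a multi-gigabase contig file.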

The input data is fine, as I have used it in parallel with other assemblers and assembly pipelines. Do I need to manually add a step that breaks the assembly up into multiple contig files for the pipeline to run?

Here is the exact error message:


****ERROR****** During mapreads, the following command failed with return code 1:

/work/cluster/home/jov14/tools/metAMOS-1.5rc3/Utilities/cpp/Linux-x86_64/bowtie-build -o 2 /home/jov14/projects/pratscher_USCa_metagenome/results/assemblies/180914/assembly_untrimmed_metamos_SOAP/Assemble/out/soapdenovo.31.asm.contig /home/jov14/projects/pratscher_USCa_metagenome/results/assemblies/180914/assembly_untrimmed_metamos_SOAP/Assemble/out/soapdenovo.31.IDX

****DETAILS****** Last 10 commands run before the error (/home/jov14/projects/pratscher_USCa_metagenome/results/assemblies/180914/assembly_untrimmed_metamos_SOAP/Logs/COMMANDS.log)
|2014-09-20 02:59:02| touch /home/jov14/projects/pratscher_USCa_metagenome/results/assemblies/180914/assembly_untrimmed_metamos_SOAP/Assemble/out/soapdenovo.31.run
|2014-09-20 02:59:03| cat /home/jov14/projects/pratscher_USCa_metagenome/results/assemblies/180914/assembly_untrimmed_metamos_SOAP/soapconfig.txt |grep -v max_rd_len > /home/jov14/projects/pratscher_USCa_metagenome/results/assemblies/180914/assembly_untrimmed_metamos_SOAP/soap2config.txt
|2014-09-21 01:25:18| /work/cluster/home/jov14/tools/metAMOS-1.5rc3/Utilities/cpp/Linux-x86_64/SOAPdenovo-63mer pregraph -p 10 -K 31 -R -D -s /home/jov14/projects/pratscher_USCa_metagenome/results/assemblies/180914/assembly_untrimmed_metamos_SOAP/soapconfig.txt -o /home/jov14/projects/pratscher_USCa_metagenome/results/assemblies/180914/assembly_untrimmed_metamos_SOAP/Assemble/out/soapdenovo.31.asm
|2014-09-21 02:51:19| /work/cluster/home/jov14/tools/metAMOS-1.5rc3/Utilities/cpp/Linux-x86_64/SOAPdenovo-63mer contig -g /home/jov14/projects/pratscher_USCa_metagenome/results/assemblies/180914/assembly_untrimmed_metamos_SOAP/Assemble/out/soapdenovo.31.asm -R -M 3
|2014-09-21 02:51:20| mv /home/jov14/projects/pratscher_USCa_metagenome/results/assemblies/180914/assembly_untrimmed_metamos_SOAP/Assemble/out/soapdenovo.31.asm.contig /home/jov14/projects/pratscher_USCa_metagenome/results/assemblies/180914/assembly_untrimmed_metamos_SOAP/Assemble/out/soapdenovo.31.asm.contigWIUPAC.fa
|2014-09-21 03:20:32| java -cp /work/cluster/home/jov14/tools/metAMOS-1.5rc3/Utilities/java:. RemoveIUPAC /home/jov14/projects/pratscher_USCa_metagenome/results/assemblies/180914/assembly_untrimmed_metamos_SOAP/Assemble/out/soapdenovo.31.asm.contigWIUPAC.fa >/home/jov14/projects/pratscher_USCa_metagenome/results/assemblies/180914/assembly_untrimmed_metamos_SOAP/Assemble/out/soapdenovo.31.asm.contig
|2014-09-21 03:21:17| rm /home/jov14/projects/pratscher_USCa_metagenome/results/assemblies/180914/assembly_untrimmed_metamos_SOAP/Assemble/out/soapdenovo.31.run
|2014-09-21 03:21:17| touch /home/jov14/projects/pratscher_USCa_metagenome/results/assemblies/180914/assembly_untrimmed_metamos_SOAP/Assemble/out/assemble.success
|2014-09-21 03:30:00|# [MAPREADS]
|2014-09-21 03:31:57| /work/cluster/home/jov14/tools/metAMOS-1.5rc3/Utilities/cpp/Linux-x86_64/bowtie-build -o 2 /home/jov14/projects/pratscher_USCa_metagenome/results/assemblies/180914/assembly_untrimmed_metamos_SOAP/Assemble/out/soapdenovo.31.asm.contig /home/jov14/projects/pratscher_USCa_metagenome/results/assemblies/180914/assembly_untrimmed_metamos_SOAP/Assemble/out/soapdenovo.31.IDX

Last 10 lines of output (/home/jov14/projects/pratscher_USCa_metagenome/results/assemblies/180914/assembly_untrimmed_metamos_SOAP/Logs/MAPREADS.log)
Sizeofs: void*:8, int:4, long:8, size_t:8
Input files DNA, FASTA:
/home/jov14/projects/pratscher_USCa_metagenome/results/assemblies/180914/assembly_untrimmed_metamos_SOAP/Assemble/out/soapdenovo.31.asm.contig
Reading reference sizes
Time reading reference sizes: 00:01:55
Total time for call to driver() for forward index: 00:01:55
Error: Reference sequence has more than 2^32-1 characters! Please divide the reference into batches or chunks of about 3.6 billion characters or less each and index each independently.
Command: /work/cluster/home/jov14/tools/metAMOS-1.5rc3/Utilities/cpp/Linux-x86_64/bowtie-build -o 2 /home/jov14/projects/pratscher_USCa_metagenome/results/assemblies/180914/assembly_untrimmed_metamos_SOAP/Assemble/out/soapdenovo.31.asm.contig /home/jov14/projects/pratscher_USCa_metagenome/results/assemblies/180914/assembly_untrimmed_metamos_SOAP/Assemble/out/soapdenovo.31.IDX

Please veryify input data and restart MetAMOS. If the problem persists please contact the MetAMOS development team. ****ERROR******
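The bowtie-build message above suggests dividing the reference into chunks of about 3.6 billion characters or less. Below is a hedged Python sketch of that grouping step only: it packs whole contigs into batches under a size cap. The function name and demo records are made up for illustration, and nothing in this thread indicates that metAMOS itself can map against several indexes, so treat it as a manual workaround idea rather than a pipeline feature.

```python
def batch_fasta_records(records, max_bases):
    """Group (header, sequence) pairs into batches of at most max_bases characters.

    Contigs are never cut; a single contig longer than max_bases ends up
    alone in its own (oversized) batch.
    """
    batches, current, current_size = [], [], 0
    for header, seq in records:
        # Start a new batch when adding this contig would exceed the cap.
        if current and current_size + len(seq) > max_bases:
            batches.append(current)
            current, current_size = [], 0
        current.append((header, seq))
        current_size += len(seq)
    if current:
        batches.append(current)
    return batches


# Toy demo with a cap of 7 characters; a real run would use roughly
# 3_600_000_000, the figure given in the error message.
demo_records = [("c1", "ACGT"), ("c2", "ACG"), ("c3", "AC"), ("c4", "ACGTA")]
demo_batches = batch_fasta_records(demo_records, max_bases=7)
print([[header for header, _ in batch] for batch in demo_batches])
```

Each batch would then be written to its own FASTA file and indexed separately, with the per-index alignments merged afterwards.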


skoren commented 10 years ago

Hi,

This message is not due to metAMOS but is a limitation of Bowtie's index construction. This is also a limitation of the currently supported version of Bowtie2 in metAMOS.

As a workaround, you can update the Bowtie2 binaries in Utilities/cpp/Linux-x86_64/bowtie2* to at least version 2.2, which should support indices larger than 4GB, and select bowtie2 for mapping reads (-m bowtie2). Alternatively, you could try the other assemblers supported by metAMOS to see whether they generate an assembly smaller than 4GB.
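A small Python sketch for checking an installed Bowtie 2 against the 2.2 threshold mentioned above. It only parses a version banner string; the example banner is the one Bowtie 2.2.3 prints in the MAPREADS.log quoted later in this thread. The helper names are assumptions for illustration, and the 2.2 cutoff is taken from the comment above rather than verified independently.

```python
import re


def parse_version(text):
    """Extract the first 'version X.Y[.Z]' number from text as a tuple of ints."""
    match = re.search(r"version\s+(\d+(?:\.\d+)*)", text)
    if match is None:
        raise ValueError("no version string found")
    return tuple(int(part) for part in match.group(1).split("."))


def supports_large_index(version, minimum=(2, 2)):
    """True if version is at least the minimum suggested for >4GB indices."""
    return version >= minimum


banner = "Bowtie 2 version 2.2.3 by Ben Langmead"
demo_version = parse_version(banner)
print(demo_version, supports_large_index(demo_version))
```

Comparing tuples of integers avoids the usual pitfall of comparing version strings lexically.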

jvollme commented 10 years ago

Thanks, I've updated the Bowtie2 version and will restart the assemblies with a range of assemblers (this may take a while).

I also have a further question about updating metAMOS pipeline components. I installed the pipeline with all optional workflows, which includes the RefSeq and Swiss-Prot databases. To save space and to have the newest reference annotations, I would like to simply symlink the Swiss-Prot files in "~/metAMOS-1.5rc3/Utilities/DB/uniprot_sprot*" to a local copy of the Swiss-Prot BLAST database and swissprot.dat file, which we keep up to date on our servers.

However, I do not know how to proceed with the files "uniprot_sprot_enz_set", "uniprot_sprot_enz_set.putative", "uniprot_sprot_non_enz_set", "uniprot_sprot_non_enz_set.putative".

Are these created from the "uniprot_sprot.dat" file by custom scripts? How can I keep them compatible with our local version of the Swiss-Prot database?
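To make the symlink idea concrete, here is a cautious Python sketch. It replaces only those files in a database directory that have a same-named counterpart in a local mirror, keeps a ".bak" copy of each original, and leaves everything else (for example the enz_set files asked about above) untouched. The function name, the directory layout, and the assumption that file names match between the two locations are all hypothetical; the demo runs in a temporary directory on a system that supports symlinks.

```python
import os
import tempfile


def link_matching_files(db_dir, local_dir, prefix="uniprot_sprot"):
    """Swap files in db_dir for symlinks to same-named files in local_dir.

    Only names starting with prefix that exist in BOTH directories are
    touched; the original file is kept next to the link as <name>.bak.
    """
    linked = []
    for name in sorted(os.listdir(db_dir)):
        source = os.path.join(local_dir, name)
        target = os.path.join(db_dir, name)
        if name.startswith(prefix) and os.path.isfile(source):
            os.replace(target, target + ".bak")  # keep a backup
            os.symlink(os.path.abspath(source), target)
            linked.append(name)
    return linked


# Demo in a throwaway directory with made-up, empty files.
with tempfile.TemporaryDirectory() as tmp:
    db_dir = os.path.join(tmp, "DB")
    local_dir = os.path.join(tmp, "local")
    os.makedirs(db_dir)
    os.makedirs(local_dir)
    for name in ("uniprot_sprot.dat", "uniprot_sprot_enz_set"):
        open(os.path.join(db_dir, name), "w").close()
    open(os.path.join(local_dir, "uniprot_sprot.dat"), "w").close()

    demo_linked = link_matching_files(db_dir, local_dir)
    demo_dat_is_link = os.path.islink(os.path.join(db_dir, "uniprot_sprot.dat"))
    demo_enz_is_link = os.path.islink(os.path.join(db_dir, "uniprot_sprot_enz_set"))

print(demo_linked, demo_dat_is_link, demo_enz_is_link)
```

This does not answer how the enz_set files are generated; it only shows a way to link the files that do have a local equivalent without disturbing the rest.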

with friendly greetings, John


sejmodha commented 10 years ago

Hi There,

I was facing the same issue and replaced the bowtie2 executables with a newer version (2.2.3), but I'm still facing the same problem.

Here are the details of the error:

Completed Task = assemble.Assemble
* MetAMOS Warning: abyss assembler did not run successfully!
Job = [[abyss.64.asm.contig, idba-ud.64.asm.contig] -> [assemble.ok]] completed
Completed Task = assemble.CheckAsmResults
Uptodate Task = assemble.SplitMappers
Starting Task = mapreads.MAPREADS
[*** ****ERROR****** During mapreads, the following command failed with return code 1:

/software/metAMOS-1.5rc3/Utilities/cpp/Linux-x86_64/bowtie2 -p 6 -D 15 -R 2 -N 0 -L 20 -i S,1,1.10 --un /home/modh01s/Gavin/3855/3855_metamos_64/Assemble/out/idba-ud.64.lib1.unaligned.seq /home/modh01s/Gavin/3855/3855_metamos_64/Assemble/out/idba-ud.64.IDX /home/modh01s/Gavin/3855/3855_metamos_64/Preprocess/out/lib1.seq -S /home/modh01s/Gavin/3855/3855_metamos_64/Assemble/out/lib1.sam

****DETAILS****** Last 10 commands run before the error (/home/modh01s/Gavin/3855/3855_metamos_64/Logs/COMMANDS.log)
|2014-10-14 11:52:45| rm /home/modh01s/Gavin/3855/3855_metamos_64/Assemble/out/abyss.64.asm.contig
|2014-10-14 11:52:45| rm /home/modh01s/Gavin/3855/3855_metamos_64/Assemble/out/abyss.64.run
|2014-10-14 11:52:45| touch /home/modh01s/Gavin/3855/3855_metamos_64/Assemble/out/abyss.64.failed
|2014-10-14 11:52:45| mv /home/modh01s/Gavin/3855/3855_metamos_64/Assemble/out/idba-ud.64.asm.contig /home/modh01s/Gavin/3855/3855_metamos_64/Assemble/out/idba-ud.64.asm.contigWIUPAC.fa
|2014-10-14 11:52:46| java -cp /software/metAMOS-1.5rc3/Utilities/java:. RemoveIUPAC /home/modh01s/Gavin/3855/3855_metamos_64/Assemble/out/idba-ud.64.asm.contigWIUPAC.fa >/home/modh01s/Gavin/3855/3855_metamos_64/Assemble/out/idba-ud.64.asm.contig
|2014-10-14 11:52:46| rm /home/modh01s/Gavin/3855/3855_metamos_64/Assemble/out/idba-ud.64.run
|2014-10-14 11:52:46| touch /home/modh01s/Gavin/3855/3855_metamos_64/Assemble/out/assemble.success
|2014-10-14 11:52:48|# [MAPREADS]
|2014-10-14 11:52:48| /software/metAMOS-1.5rc3/Utilities/cpp/Linux-x86_64/bowtie2-build -o 2 /home/modh01s/Gavin/3855/3855_metamos_64/Assemble/out/idba-ud.64.asm.contig /home/modh01s/Gavin/3855/3855_metamos_64/Assemble/out/idba-ud.64.IDX
|2014-10-14 11:52:48| /software/metAMOS-1.5rc3/Utilities/cpp/Linux-x86_64/bowtie2 -p 6 -D 15 -R 2 -N 0 -L 20 -i S,1,1.10 --un /home/modh01s/Gavin/3855/3855_metamos_64/Assemble/out/idba-ud.64.lib1.unaligned.seq /home/modh01s/Gavin/3855/3855_metamos_64/Assemble/out/idba-ud.64.IDX /home/modh01s/Gavin/3855/3855_metamos_64/Preprocess/out/lib1.seq -S /home/modh01s/Gavin/3855/3855_metamos_64/Assemble/out/lib1.sam

Last 10 lines of output (/home/modh01s/Gavin/3855/3855_metamos_64/Logs/MAPREADS.log)
  --reorder          force SAM output order to match order of input reads
  --mm               use memory-mapped I/O for index; many 'bowtie's can share

 Other:
  --qc-filter        filter out reads that are bad according to QSEQ filter
  --seed <int>       seed for random number generator (0)
  --non-deterministic seed rand. gen. arbitrarily instead of using read attributes
  --version          print version information and quit
  -h/--help          print this usage message
(ERR): bowtie2-align exited with value 1

Please veryify input data and restart MetAMOS. If the problem persists please contact the MetAMOS development team. ****ERROR******


The last bowtie2 command run was:

software/metAMOS-1.5rc3/Utilities/cpp/Linux-x86_64/bowtie2 -p 6 -D 15 -R 2 -N 0 -L 20 -i S,1,1.10 --un /home/modh01s/Gavin/3855/3855_metamos_64/Assemble/out/idba-ud.64.lib1.unaligned.seq /home/modh01s/Gavin/3855/3855_metamos_64/Assemble/out/idba-ud.64.IDX /home/modh01s/Gavin/3855/3855_metamos_64/Preprocess/out/lib1.seq -S /home/modh01s/Gavin/3855/3855_metamos_64/Assemble/out/lib1.sam

If I run the same command again outside the pipeline, it says "No index, query, or output file specified!"

Any help would be appreciated.

Thanks, Sej

skoren commented 10 years ago

This looks like an error due to a change in Bowtie 2 parameters in the version you installed. Do you have the output of MAPREADS.log or the results if you manually run the command: software/metAMOS-1.5rc3/Utilities/cpp/Linux-x86_64/bowtie2 -p 6 -D 15 -R 2 -N 0 -L 20 -i S,1,1.10 --un /home/modh01s/Gavin/3855/3855_metamos_64/Assemble/out/idba-ud.64.lib1.unaligned.seq /home/modh01s/Gavin/3855/3855_metamos_64/Assemble/out/idba-ud.64.IDX /home/modh01s/Gavin/3855/3855_metamos_64/Preprocess/out/lib1.seq -S /home/modh01s/Gavin/3855/3855_metamos_64/Assemble/out/lib1.sam
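One possible reading of "a change in Bowtie 2 parameters": the Bowtie 2.2.3 usage line printed in the MAPREADS.log in this thread lists the index under -x and unpaired reads under -U, whereas the failing command passes both positionally. The Python sketch below builds the same call with those two flags added. The function name and the shortened file names are illustrative assumptions, whether this is the actual cause is not confirmed in the thread, and the read file may need a format flag such as -f if it is FASTA.

```python
def bowtie2_unpaired_command(bowtie2, index_prefix, reads, sam_out,
                             unaligned_out, threads=6):
    """Build a bowtie2 argument list for unpaired reads using -x and -U."""
    return [
        bowtie2,
        "-p", str(threads),
        # Alignment settings copied from the command in the error report.
        "-D", "15", "-R", "2", "-N", "0", "-L", "20", "-i", "S,1,1.10",
        "--un", unaligned_out,
        "-x", index_prefix,   # index prefix, explicit instead of positional
        "-U", reads,          # unpaired reads, explicit instead of positional
        "-S", sam_out,
    ]


# Shortened, hypothetical paths standing in for the full ones in the log.
demo_cmd = bowtie2_unpaired_command(
    "bowtie2",
    index_prefix="idba-ud.64.IDX",
    reads="lib1.seq",
    sam_out="lib1.sam",
    unaligned_out="idba-ud.64.lib1.unaligned.seq",
)
print(" ".join(demo_cmd))
```

The resulting list can be passed to subprocess.run for a manual test outside the pipeline before changing anything in metAMOS itself.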

sejmodha commented 10 years ago

Hi Koren,

I reckon it's definitely something to do with the parameters in the newer version of bowtie2. Here are the details from MAPREADS.log:

Settings: Output files: "/home/modh01s/Gavin/3855/3855_metamos_64/Assemble/out/idba-ud.64.IDX.*.bt2" Line rate: 6 (line is 64 bytes) Lines per side: 1 (side is 64 bytes) Offset rate: 2 (one in 4) FTable chars: 10 Strings: unpacked Max bucket size: default Max bucket size, sqrt multiplier: default Max bucket size, len divisor: 4 Difference-cover sample period: 1024 Endianness: little Actual local endianness: little Sanity checking: disabled Assertions: disabled Random seed: 0 Sizeofs: void*:8, int:4, long:8, size_t:8 Input files DNA, FASTA: /home/modh01s/Gavin/3855/3855_metamos_64/Assemble/out/idba-ud.64.asm.contig Reading reference sizes Time reading reference sizes: 00:00:00 Calculating joined length Writing header Reserving space for joined string Joining reference sequences Time to join reference sequences: 00:00:00 bmax according to bmaxDivN setting: 63603 Using parameters --bmax 47703 --dcv 1024 Doing ahead-of-time memory usage test Passed! Constructing with these parameters: --bmax 47703 --dcv 1024 Constructing suffix-array element generator Building DifferenceCoverSample Building sPrime Building sPrimeOrder V-Sorting samples V-Sorting samples time: 00:00:00 Allocating rank array Ranking v-sort output Ranking v-sort output time: 00:00:00 Invoking Larsson-Sadakane on ranks Invoking Larsson-Sadakane on ranks time: 00:00:00 Sanity-checking and returning Building samples Reserving space for 12 sample suffixes Generating random suffixes QSorting 12 sample offsets, eliminating duplicates QSorting sample offsets, eliminating duplicates time: 00:00:00 Multikey QSorting 12 samples (Using difference cover) Multikey QSorting samples time: 00:00:00 Calculating bucket sizes Binary sorting into buckets 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% Binary sorting into buckets time: 00:00:00 Splitting and merging Splitting and merging time: 00:00:00 Avg bucket size: 31800.6 (target: 47702) Converting suffix-array elements to index image Allocating ftab, absorbFtab Entering Ebwt
loop Getting block 1 of 8 Reserving size (47703) for bucket Calculating Z arrays Calculating Z arrays time: 00:00:00 Entering block accumulator loop: 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% Block accumulator loop time: 00:00:00 Sorting block of length 28523 (Using difference cover) Sorting block time: 00:00:00 Returning block of 28524 Getting block 2 of 8 Reserving size (47703) for bucket Calculating Z arrays Calculating Z arrays time: 00:00:00 Entering block accumulator loop: 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% Block accumulator loop time: 00:00:00 Sorting block of length 40875 (Using difference cover) Sorting block time: 00:00:00 Returning block of 40876 Getting block 3 of 8 Reserving size (47703) for bucket Calculating Z arrays Calculating Z arrays time: 00:00:00 Entering block accumulator loop: 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% Block accumulator loop time: 00:00:00 Sorting block of length 35926 (Using difference cover) Sorting block time: 00:00:00 Returning block of 35927 Getting block 4 of 8 Reserving size (47703) for bucket Calculating Z arrays Calculating Z arrays time: 00:00:00 Entering block accumulator loop: 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% Block accumulator loop time: 00:00:00 Sorting block of length 22729 (Using difference cover) Sorting block time: 00:00:00 Returning block of 22730 Getting block 5 of 8 Reserving size (47703) for bucket Calculating Z arrays Calculating Z arrays time: 00:00:00 Entering block accumulator loop: 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% Block accumulator loop time: 00:00:00 Sorting block of length 36757 (Using difference cover) Sorting block time: 00:00:00 Returning block of 36758 Getting block 6 of 8 Reserving size (47703) for bucket Calculating Z arrays Calculating Z arrays time: 00:00:00 Entering block accumulator loop: 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% Block accumulator loop time: 00:00:00 Sorting block of length 40953 (Using difference cover) Sorting block time: 00:00:00 Returning block of 
40954 Getting block 7 of 8 Reserving size (47703) for bucket Calculating Z arrays Calculating Z arrays time: 00:00:00 Entering block accumulator loop: 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% Block accumulator loop time: 00:00:00 Sorting block of length 11501 (Using difference cover) Sorting block time: 00:00:00 Returning block of 11502 Getting block 8 of 8 Reserving size (47703) for bucket Calculating Z arrays Calculating Z arrays time: 00:00:00 Entering block accumulator loop: 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% Block accumulator loop time: 00:00:00 Sorting block of length 37141 (Using difference cover) Sorting block time: 00:00:00 Returning block of 37142 Exited Ebwt loop fchr[A]: 0 fchr[C]: 56058 fchr[G]: 126443 fchr[T]: 199249 fchr[$]: 254412 Exiting Ebwt::buildToDisk() Returning from initFromVector Wrote 4281013 bytes to primary EBWT file: /home/modh01s/Gavin/3855/3855_metamos_64/Assemble/out/idba-ud.64.IDX.1.bt2 Wrote 254420 bytes to secondary EBWT file: /home/modh01s/Gavin/3855/3855_metamos_64/Assemble/out/idba-ud.64.IDX.2.bt2 Re-opening _in1 and _in2 as input streams Returning from Ebwt constructor Headers: len: 254412 bwtLen: 254413 sz: 63603 bwtSz: 63604 lineRate: 6 offRate: 2 offMask: 0xfffffffc ftabChars: 10 eftabLen: 20 eftabSz: 80 ftabLen: 1048577 ftabSz: 4194308 offsLen: 63604 offsSz: 254416 lineSz: 64 sideSz: 64 sideBwtSz: 48 sideBwtLen: 192 numSides: 1326 numLines: 1326 ebwtTotLen: 84864 ebwtTotSz: 84864 color: 0 reverse: 0 Total time for call to driver() for forward index: 00:00:00 Reading reference sizes Time reading reference sizes: 00:00:00 Calculating joined length Writing header Reserving space for joined string Joining reference sequences Time to join reference sequences: 00:00:00 Time to reverse reference sequence: 00:00:00 bmax according to bmaxDivN setting: 63603 Using parameters --bmax 47703 --dcv 1024 Doing ahead-of-time memory usage test Passed! 
Constructing with these parameters: --bmax 47703 --dcv 1024 Constructing suffix-array element generator Building DifferenceCoverSample Building sPrime Building sPrimeOrder V-Sorting samples V-Sorting samples time: 00:00:00 Allocating rank array Ranking v-sort output Ranking v-sort output time: 00:00:00 Invoking Larsson-Sadakane on ranks Invoking Larsson-Sadakane on ranks time: 00:00:00 Sanity-checking and returning Building samples Reserving space for 12 sample suffixes Generating random suffixes QSorting 12 sample offsets, eliminating duplicates QSorting sample offsets, eliminating duplicates time: 00:00:00 Multikey QSorting 12 samples (Using difference cover) Multikey QSorting samples time: 00:00:00 Calculating bucket sizes Binary sorting into buckets 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% Binary sorting into buckets time: 00:00:00 Splitting and merging Splitting and merging time: 00:00:00 Avg bucket size: 36343.7 (target: 47702) Converting suffix-array elements to index image Allocating ftab, absorbFtab Entering Ebwt loop Getting block 1 of 7 Reserving size (47703) for bucket Calculating Z arrays Calculating Z arrays time: 00:00:00 Entering block accumulator loop: 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% Block accumulator loop time: 00:00:00 Sorting block of length 47093 (Using difference cover) Sorting block time: 00:00:00 Returning block of 47094 Getting block 2 of 7 Reserving size (47703) for bucket Calculating Z arrays Calculating Z arrays time: 00:00:00 Entering block accumulator loop: 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% Block accumulator loop time: 00:00:00 Sorting block of length 30600 (Using difference cover) Sorting block time: 00:00:00 Returning block of 30601 Getting block 3 of 7 Reserving size (47703) for bucket Calculating Z arrays Calculating Z arrays time: 00:00:00 Entering block accumulator loop: 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% Block accumulator loop time: 00:00:00 Sorting block of length 39049 (Using difference cover) Sorting 
block time: 00:00:00 Returning block of 39050 Getting block 4 of 7 Reserving size (47703) for bucket Calculating Z arrays Calculating Z arrays time: 00:00:00 Entering block accumulator loop: 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% Block accumulator loop time: 00:00:00 Sorting block of length 47459 (Using difference cover) Sorting block time: 00:00:00 Returning block of 47460 Getting block 5 of 7 Reserving size (47703) for bucket Calculating Z arrays Calculating Z arrays time: 00:00:00 Entering block accumulator loop: 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% Block accumulator loop time: 00:00:00 Sorting block of length 26190 (Using difference cover) Sorting block time: 00:00:00 Returning block of 26191 Getting block 6 of 7 Reserving size (47703) for bucket Calculating Z arrays Calculating Z arrays time: 00:00:00 Entering block accumulator loop: 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% Block accumulator loop time: 00:00:00 Sorting block of length 43057 (Using difference cover) Sorting block time: 00:00:00 Returning block of 43058 Getting block 7 of 7 Reserving size (47703) for bucket Calculating Z arrays Calculating Z arrays time: 00:00:00 Entering block accumulator loop: 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% Block accumulator loop time: 00:00:00 Sorting block of length 20958 (Using difference cover) Sorting block time: 00:00:00 Returning block of 20959 Exited Ebwt loop fchr[A]: 0 fchr[C]: 56058 fchr[G]: 126443 fchr[T]: 199249 fchr[$]: 254412 Exiting Ebwt::buildToDisk() Returning from initFromVector Wrote 4281013 bytes to primary EBWT file: /home/modh01s/Gavin/3855/3855_metamos_64/Assemble/out/idba-ud.64.IDX.rev.1.bt2 Wrote 254420 bytes to secondary EBWT file: /home/modh01s/Gavin/3855/3855_metamos_64/Assemble/out/idba-ud.64.IDX.rev.2.bt2 Re-opening _in1 and _in2 as input streams Returning from Ebwt constructor Headers: len: 254412 bwtLen: 254413 sz: 63603 bwtSz: 63604 lineRate: 6 offRate: 2 offMask: 0xfffffffc ftabChars: 10 eftabLen: 20 eftabSz: 80 ftabLen: 
1048577 ftabSz: 4194308 offsLen: 63604 offsSz: 254416 lineSz: 64 sideSz: 64 sideBwtSz: 48 sideBwtLen: 192 numSides: 1326 numLines: 1326 ebwtTotLen: 84864 ebwtTotSz: 84864 color: 0 reverse: 1 Total time for backward call to driver() for mirror index: 00:00:00
Building a SMALL index
No index, query, or output file specified!
Bowtie 2 version 2.2.3 by Ben Langmead (langmea@cs.jhu.edu, www.cs.jhu.edu/~langmea)
Usage:
  bowtie2 [options]* -x <bt2-idx> {-1 <m1> -2 <m2> | -U <r>} [-S <sam>]

  <bt2-idx>  Index filename prefix (minus trailing .X.bt2).
             NOTE: Bowtie 1 and Bowtie 2 indexes are not compatible.
  <m1>       Files with #1 mates, paired with files in <m2>.
             Could be gzip'ed (extension: .gz) or bzip2'ed (extension: .bz2).
  <m2>       Files with #2 mates, paired with files in <m1>.
             Could be gzip'ed (extension: .gz) or bzip2'ed (extension: .bz2).
  <r>        Files with unpaired reads.
             Could be gzip'ed (extension: .gz) or bzip2'ed (extension: .bz2).
  <sam>      File for SAM output (default: stdout)

  <m1>, <m2>, <r> can be comma-separated lists (no whitespace) and can be
  specified many times. E.g. '-U file1.fq,file2.fq -U file3.fq'.

Options (defaults in parentheses):

 Input:
  -q                 query input files are FASTQ .fq/.fastq (default)
  --qseq             query input files are in Illumina's qseq format
  -f                 query input files are (multi-)FASTA .fa/.mfa
  -r                 query input files are raw one-sequence-per-line
  -c                 <m1>, <m2>, <r> are sequences themselves, not files
  -s/--skip <int>    skip the first <int> reads/pairs in the input (none)
  -u/--upto <int>    stop after first <int> reads/pairs (no limit)
  -5/--trim5 <int>   trim <int> bases from 5'/left end of reads (0)
  -3/--trim3 <int>   trim <int> bases from 3'/right end of reads (0)
  --phred33          qualities are Phred+33 (default)
  --phred64          qualities are Phred+64
  --int-quals        qualities encoded as space-delimited integers

 Presets:                 Same as:
  For --end-to-end:
   --very-fast            -D 5 -R 1 -N 0 -L 22 -i S,0,2.50
   --fast                 -D 10 -R 2 -N 0 -L 22 -i S,0,2.50
   --sensitive            -D 15 -R 2 -N 0 -L 22 -i S,1,1.15 (default)
   --very-sensitive       -D 20 -R 3 -N 0 -L 20 -i S,1,0.50

  For --local:
   --very-fast-local      -D 5 -R 1 -N 0 -L 25 -i S,1,2.00
   --fast-local           -D 10 -R 2 -N 0 -L 22 -i S,1,1.75
   --sensitive-local      -D 15 -R 2 -N 0 -L 20 -i S,1,0.75 (default)
   --very-sensitive-local -D 20 -R 3 -N 0 -L 20 -i S,1,0.50

 Alignment:
  -N <int>           max # mismatches in seed alignment; can be 0 or 1 (0)
  -L <int>           length of seed substrings; must be >3, <32 (22)
  -i <func>          interval between seed substrings w/r/t read len (S,1,1.15)
  --n-ceil <func>    func for max # non-A/C/G/Ts permitted in aln (L,0,0.15)
  --dpad <int>       include <int> extra ref chars on sides of DP table (15)
  --gbar <int>       disallow gaps within <int> nucs of read extremes (4)
  --ignore-quals     treat all quality values as 30 on Phred scale (off)
  --nofw             do not align forward (original) version of read (off)
  --norc             do not align reverse-complement version of read (off)
  --no-1mm-upfront   do not allow 1 mismatch alignments before attempting to
                     scan for the optimal seeded alignments
  --end-to-end       entire read must align; no clipping (on)
   OR
  --local            local alignment; ends might be soft clipped (off)

 Scoring:
  --ma <int>         match bonus (0 for --end-to-end, 2 for --local)
  --mp <int>         max penalty for mismatch; lower qual = lower penalty (6)
  --np <int>         penalty for non-A/C/G/Ts in read/ref (1)
  --rdg <int>,<int>  read gap open, extend penalties (5,3)
  --rfg <int>,<int>  reference gap open, extend penalties (5,3)
  --score-min <func> min acceptable alignment score w/r/t read length
                     (G,20,8 for local, L,-0.6,-0.6 for end-to-end)

 Reporting:
  (default)          look for multiple alignments, report best, with MAPQ
   OR
  -k <int>           report up to <int> alns per read; MAPQ not meaningful
   OR
  -a/--all           report all alignments; very slow, MAPQ not meaningful

 Effort:
  -D <int>           give up extending after <int> failed extends in a row (15)
  -R <int>           for reads w/ repetitive seeds, try <int> sets of seeds (2)

 Paired-end:
  -I/--minins <int>  minimum fragment length (0)
  -X/--maxins <int>  maximum fragment length (500)
  --fr/--rf/--ff     -1, -2 mates align fw/rev, rev/fw, fw/fw (--fr)
  --no-mixed         suppress unpaired alignments for paired reads
  --no-discordant    suppress discordant alignments for paired reads
  --no-dovetail      not concordant when mates extend past each other
  --no-contain       not concordant when one mate alignment contains other
  --no-overlap       not concordant when mates overlap at all

 Output:
  -t/--time          print wall-clock time taken by search phases
  --un <path>        write unpaired reads that didn't align to <path>
  --al <path>        write unpaired reads that aligned at least once to <path>
  --un-conc <path>   write pairs that didn't align concordantly to <path>
  --al-conc <path>   write pairs that aligned concordantly at least once to <path>
  (Note: for --un, --al, --un-conc, or --al-conc, add '-gz' to the option name, e.g.
  --un-gz <path>, to gzip compress output, or add '-bz2' to bzip2 compress output.)
  --quiet            print nothing to stderr except serious errors
  --met-file <path>  send metrics to file at <path> (off)
  --met-stderr       send metrics to stderr (off)
  --met <int>        report internal counters & metrics every <int> secs (1)
  --no-head          supppress header lines, i.e. lines starting with @
  --no-sq            supppress @SQ header lines
  --rg-id <text>     set read group id, reflected in @RG line and RG:Z: opt field
  --rg <text>        add <text> ("lab:value") to @RG line of SAM header.
                     Note: @RG line only printed when --rg-id is set.
  --omit-sec-seq     put '*' in SEQ and QUAL fields for secondary alignments.

 Performance:
  -p/--threads <int> number of alignment threads to launch (1)
  --reorder          force SAM output order to match order of input reads
  --mm               use memory-mapped I/O for index; many 'bowtie's can share

 Other:
  --qc-filter        filter out reads that are bad according to QSEQ filter
  --seed <int>       seed for random number generator (0)
  --non-deterministic seed rand. gen. arbitrarily instead of using read attributes
  --version          print version information and quit
  -h/--help          print this usage message
(ERR): bowtie2-align exited with value 1

sejmodha commented 10 years ago

Hi Sergey,

Following on from the MAPREADS question, both the older and the newer version of bowtie give me the following error. Log from MAPREADS.log: Settings: Output files: "/home/modh01s/Gavin/3855/3855_metamos_64_bowtie/Assemble/out/idba-ud.64.IDX.*.ebwt" Line rate: 6 (line is 64 bytes) Lines per side: 1 (side is 64 bytes) Offset rate: 2 (one in 4) FTable chars: 10 Strings: unpacked Max bucket size: default Max bucket size, sqrt multiplier: default Max bucket size, len divisor: 4 Difference-cover sample period: 1024 Endianness: little Actual local endianness: little Sanity checking: disabled Assertions: disabled Random seed: 0 Sizeofs: void*:8, int:4, long:8, size_t:8 Input files DNA, FASTA: /home/modh01s/Gavin/3855/3855_metamos_64_bowtie/Assemble/out/idba-ud.64.asm.contig Reading reference sizes Time reading reference sizes: 00:00:00 Calculating joined length Writing header Reserving space for joined string Joining reference sequences Time to join reference sequences: 00:00:00 bmax according to bmaxDivN setting: 63605 Using parameters --bmax 47704 --dcv 1024 Doing ahead-of-time memory usage test Passed!
Constructing with these parameters: --bmax 47704 --dcv 1024 Constructing suffix-array element generator Building DifferenceCoverSample Building sPrime Building sPrimeOrder V-Sorting samples V-Sorting samples time: 00:00:00 Allocating rank array Ranking v-sort output Ranking v-sort output time: 00:00:00 Invoking Larsson-Sadakane on ranks Invoking Larsson-Sadakane on ranks time: 00:00:00 Sanity-checking and returning Building samples Reserving space for 12 sample suffixes Generating random suffixes QSorting 12 sample offsets, eliminating duplicates QSorting sample offsets, eliminating duplicates time: 00:00:00 Multikey QSorting 12 samples (Using difference cover) Multikey QSorting samples time: 00:00:00 Calculating bucket sizes Binary sorting into buckets 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% Binary sorting into buckets time: 00:00:00 Splitting and merging Splitting and merging time: 00:00:00 Split 1, merged 7; iterating... Binary sorting into buckets 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% Binary sorting into buckets time: 00:00:00 Splitting and merging Splitting and merging time: 00:00:00 Avg bucket size: 36344.9 (target: 47703) Converting suffix-array elements to index image Allocating ftab, absorbFtab Entering Ebwt loop Getting block 1 of 7 Reserving size (47704) for bucket Calculating Z arrays Calculating Z arrays time: 00:00:00 Entering block accumulator loop: 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% Block accumulator loop time: 00:00:00 Sorting block of length 39784 (Using difference cover) Sorting block time: 00:00:00 Returning block of 39785 Getting block 2 of 7 Reserving size (47704) for bucket Calculating Z arrays Calculating Z arrays time: 00:00:00 Entering block accumulator loop: 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% Block accumulator loop time: 00:00:00 Sorting block of length 40413 (Using difference cover) Sorting block time: 00:00:00 Returning block of 40414 Getting block 3 of 7 Reserving size (47704) for bucket Calculating Z arrays Calculating 
Z arrays time: 00:00:00 Entering block accumulator loop: 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% Block accumulator loop time: 00:00:00 Sorting block of length 38633 (Using difference cover) Sorting block time: 00:00:00 Returning block of 38634 Getting block 4 of 7 Reserving size (47704) for bucket Calculating Z arrays Calculating Z arrays time: 00:00:00 Entering block accumulator loop: 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% Block accumulator loop time: 00:00:00 Sorting block of length 20490 (Using difference cover) Sorting block time: 00:00:00 Returning block of 20491 Getting block 5 of 7 Reserving size (47704) for bucket Calculating Z arrays Calculating Z arrays time: 00:00:00 Entering block accumulator loop: 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% Block accumulator loop time: 00:00:00 Sorting block of length 35359 (Using difference cover) Sorting block time: 00:00:00 Returning block of 35360 Getting block 6 of 7 Reserving size (47704) for bucket Calculating Z arrays Calculating Z arrays time: 00:00:00 Entering block accumulator loop: 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% Block accumulator loop time: 00:00:00 Sorting block of length 42921 (Using difference cover) Sorting block time: 00:00:00 Returning block of 42922 Getting block 7 of 7 Reserving size (47704) for bucket Calculating Z arrays Calculating Z arrays time: 00:00:00 Entering block accumulator loop: 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% Block accumulator loop time: 00:00:00 Sorting block of length 36814 (Using difference cover) Sorting block time: 00:00:00 Returning block of 36815 Exited Ebwt loop fchr[A]: 0 fchr[C]: 55464 fchr[G]: 127676 fchr[T]: 198672 fchr[$]: 254420 Exiting Ebwt::buildToDisk() Returning from initFromVector Wrote 4268853 bytes to primary EBWT file: /home/modh01s/Gavin/3855/3855_metamos_64_bowtie/Assemble/out/idba-ud.64.IDX.1.ebwt Wrote 254428 bytes to secondary EBWT file: /home/modh01s/Gavin/3855/3855_metamos_64_bowtie/Assemble/out/idba-ud.64.IDX.2.ebwt Re-opening _in1 and 
_in2 as input streams Returning from Ebwt constructor Headers: len: 254420 bwtLen: 254421 sz: 63605 bwtSz: 63606 lineRate: 6 linesPerSide: 1 offRate: 2 offMask: 0xfffffffc isaRate: -1 isaMask: 0xffffffff ftabChars: 10 eftabLen: 20 eftabSz: 80 ftabLen: 1048577 ftabSz: 4194308 offsLen: 63606 offsSz: 254424 isaLen: 0 isaSz: 0 lineSz: 64 sideSz: 64 sideBwtSz: 56 sideBwtLen: 224 numSidePairs: 568 numSides: 1136 numLines: 1136 ebwtTotLen: 72704 ebwtTotSz: 72704 reverse: 0 Total time for call to driver() for forward index: 00:00:01 Reading reference sizes Time reading reference sizes: 00:00:00 Calculating joined length Writing header Reserving space for joined string Joining reference sequences Time to join reference sequences: 00:00:00 bmax according to bmaxDivN setting: 63605 Using parameters --bmax 47704 --dcv 1024 Doing ahead-of-time memory usage test Passed! Constructing with these parameters: --bmax 47704 --dcv 1024 Constructing suffix-array element generator Building DifferenceCoverSample Building sPrime Building sPrimeOrder V-Sorting samples V-Sorting samples time: 00:00:00 Allocating rank array Ranking v-sort output Ranking v-sort output time: 00:00:00 Invoking Larsson-Sadakane on ranks Invoking Larsson-Sadakane on ranks time: 00:00:00 Sanity-checking and returning Building samples Reserving space for 12 sample suffixes Generating random suffixes QSorting 12 sample offsets, eliminating duplicates QSorting sample offsets, eliminating duplicates time: 00:00:00 Multikey QSorting 12 samples (Using difference cover) Multikey QSorting samples time: 00:00:00 Calculating bucket sizes Binary sorting into buckets 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% Binary sorting into buckets time: 00:00:00 Splitting and merging Splitting and merging time: 00:00:00 Split 1, merged 6; iterating... 
Binary sorting into buckets 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% Binary sorting into buckets time: 00:00:01 Splitting and merging Splitting and merging time: 00:00:00 Split 1, merged 1; iterating... Binary sorting into buckets 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% Binary sorting into buckets time: 00:00:00 Splitting and merging Splitting and merging time: 00:00:00 Avg bucket size: 31801.6 (target: 47703) Converting suffix-array elements to index image Allocating ftab, absorbFtab Entering Ebwt loop Getting block 1 of 8 Reserving size (47704) for bucket Calculating Z arrays Calculating Z arrays time: 00:00:00 Entering block accumulator loop: 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% Block accumulator loop time: 00:00:00 Sorting block of length 16276 (Using difference cover) Sorting block time: 00:00:00 Returning block of 16277 Getting block 2 of 8 Reserving size (47704) for bucket Calculating Z arrays Calculating Z arrays time: 00:00:00 Entering block accumulator loop: 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% Block accumulator loop time: 00:00:00 Sorting block of length 46460 (Using difference cover) Sorting block time: 00:00:00 Returning block of 46461 Getting block 3 of 8 Reserving size (47704) for bucket Calculating Z arrays Calculating Z arrays time: 00:00:00 Entering block accumulator loop: 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% Block accumulator loop time: 00:00:00 Sorting block of length 25231 (Using difference cover) Sorting block time: 00:00:00 Returning block of 25232 Getting block 4 of 8 Reserving size (47704) for bucket Calculating Z arrays Calculating Z arrays time: 00:00:00 Entering block accumulator loop: 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% Block accumulator loop time: 00:00:00 Sorting block of length 26040 (Using difference cover) Sorting block time: 00:00:00 Returning block of 26041 Getting block 5 of 8 Reserving size (47704) for bucket Calculating Z arrays Calculating Z arrays time: 00:00:00 Entering block accumulator loop: 10% 20% 30% 
40% 50% 60% 70% 80% 90% 100% Block accumulator loop time: 00:00:00 Sorting block of length 24338 (Using difference cover) Sorting block time: 00:00:00 Returning block of 24339 Getting block 6 of 8 Reserving size (47704) for bucket Calculating Z arrays Calculating Z arrays time: 00:00:00 Entering block accumulator loop: 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% Block accumulator loop time: 00:00:00 Sorting block of length 40940 (Using difference cover) Sorting block time: 00:00:00 Returning block of 40941 Getting block 7 of 8 Reserving size (47704) for bucket Calculating Z arrays Calculating Z arrays time: 00:00:00 Entering block accumulator loop: 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% Block accumulator loop time: 00:00:00 Sorting block of length 39250 (Using difference cover) Sorting block time: 00:00:00 Returning block of 39251 Getting block 8 of 8 Reserving size (47704) for bucket Calculating Z arrays Calculating Z arrays time: 00:00:00 Entering block accumulator loop: 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% Block accumulator loop time: 00:00:00 Sorting block of length 35878 (Using difference cover) Sorting block time: 00:00:00 Returning block of 35879 Exited Ebwt loop fchr[A]: 0 fchr[C]: 55464 fchr[G]: 127676 fchr[T]: 198672 fchr[$]: 254420 Exiting Ebwt::buildToDisk() Returning from initFromVector Wrote 4268853 bytes to primary EBWT file: /home/modh01s/Gavin/3855/3855_metamos_64_bowtie/Assemble/out/idba-ud.64.IDX.rev.1.ebwt Wrote 254428 bytes to secondary EBWT file: /home/modh01s/Gavin/3855/3855_metamos_64_bowtie/Assemble/out/idba-ud.64.IDX.rev.2.ebwt Re-opening _in1 and _in2 as input streams Returning from Ebwt constructor Headers: len: 254420 bwtLen: 254421 sz: 63605 bwtSz: 63606 lineRate: 6 linesPerSide: 1 offRate: 2 offMask: 0xfffffffc isaRate: -1 isaMask: 0xffffffff ftabChars: 10 eftabLen: 20 eftabSz: 80 ftabLen: 1048577 ftabSz: 4194308 offsLen: 63606 offsSz: 254424 isaLen: 0 isaSz: 0 lineSz: 64 sideSz: 64 sideBwtSz: 56 sideBwtLen: 224 numSidePairs: 568 
numSides: 1136 numLines: 1136 ebwtTotLen: 72704 ebwtTotSz: 72704 reverse: 0 Total time for backward call to driver() for mirror index: 00:00:01
Reads file contained a pattern with more than 1024 quality values. Please truncate reads and quality values and and re-run Bowtie
terminate called after throwing an instance of 'int'
/bin/bash: line 1: 32317 Aborted /home/modh01s/metAMOS-1.5rc3/Utilities/cpp/Linux-x86_64/bowtie -p 6 -l 25 -e 140 --best --strata -m 10 -k 1 --un /home/modh01s/Gavin/3855/3855_metamos_64_bowtie/Assemble/out/idba-ud.64.lib1.unaligned.seq /home/modh01s/Gavin/3855/3855_metamos_64_bowtie/Assemble/out/idba-ud.64.IDX /home/modh01s/Gavin/3855/3855_metamos_64_bowtie/Preprocess/out/lib1.seq > /home/modh01s/Gavin/3855/3855_metamos_64_bowtie/Assemble/out/lib1.bout

jvollme commented 10 years ago

Hi, just as feedback: in my case, updating bowtie2 to the newest version seems to have helped (at least the pipeline no longer aborts at the "mapreads" stage).

So I, at least, can't confirm that the problem lies with the new bowtie2 version.

(I do, however, have another issue: the pipeline now aborts at the scaffolding step and there seem to be issues with the MetaVelvet coverages, but I'll open a new issue for that.)

skoren commented 10 years ago

semojha, I don't think you can use bowtie, as it doesn't support indexing assemblies larger than 4GB. Only bowtie2 versions 2.2 and later support this. It looks like older versions of bowtie2 didn't require the -x option to specify the index, but 2.2.3 does. I verified that 2.2.2 still works, so I'd recommend using that version. Otherwise, you can update the lines in src/mapreads.py that call bowtie2 (lines 211 and 213) to have -x in front of the IDX file.
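
For anyone patching this by hand, here is a rough Python sketch of the two ideas above. It is not the code from src/mapreads.py (which isn't quoted in this thread); the function names, the `BOWTIE1_MAX_BASES` constant, and the example paths are made up for illustration. The threshold of 2^32 characters is an assumption based on the error reported at the top of this issue. The only bowtie2 flags used are `-p`, `-x`, `-U` and `-S`.

```python
# Hypothetical helpers; names and paths are illustrative, not metAMOS code.

# Assumed limit for the old bowtie indexer, taken from the error in this issue.
BOWTIE1_MAX_BASES = 2 ** 32


def total_bases(fasta_lines):
    """Sum the sequence characters in an iterable of FASTA lines (headers skipped)."""
    return sum(len(line.strip()) for line in fasta_lines if not line.startswith(">"))


def needs_bowtie2(fasta_lines):
    """True if the assembly is too large for the assumed bowtie limit."""
    return total_bases(fasta_lines) >= BOWTIE1_MAX_BASES


def bowtie2_command(bowtie2_path, index_prefix, reads_path, sam_path, threads=1):
    """Build a bowtie2 argument list that passes the index explicitly with -x."""
    return [
        bowtie2_path,
        "-p", str(threads),   # number of alignment threads
        "-x", index_prefix,   # index prefix, given explicitly as 2.2.3 requires
        "-U", reads_path,     # unpaired reads file
        "-S", sam_path,       # SAM output file
    ]


if __name__ == "__main__":
    contigs = [">contig1\n", "ACGTAC\n", "GT\n", ">contig2\n", "TTGA\n"]
    print(total_bases(contigs))
    print(" ".join(bowtie2_command("bowtie2", "out/asm.IDX", "lib1.seq", "lib1.sam", 6)))
```

Passing the command as a list (for example to `subprocess.call`) avoids shell quoting problems with long paths. For a real multi-gigabase contig file you would pass an open file handle to `total_bases` so the lines are streamed rather than loaded into memory.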