tanaes opened this issue 6 years ago
Hi, thanks for the report!
Could you tell me a bit more about the target computer(s) this fails on? What is the memory (RAM) and CPU?
I'll look into this! Gabe
On Thu, Apr 19, 2018, 8:45 PM Jon Sanders notifications@github.com wrote:
Hey guys,
We've been trying to track down a problem while adapting SHOGUN to Qiita, the symptom of which was finding this message when running integration tests in Travis:
File "/home/travis/build/qiita-spots/qp-shotgun/miniconda3/envs/qp-shotgun/lib/python3.5/site-packages/pandas/core/groupby.py", line 2934, in _get_grouper
    raise KeyError(gpr)
KeyError: 'summary'
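For context, pandas raises exactly this KeyError when groupby is asked to group on a column that doesn't exist, which is what happens when an upstream step hands it an empty table. A minimal reproduction (the empty DataFrame here is a hypothetical stand-in for SHOGUN's parsed alignment, not SHOGUN's actual code path):

```python
import pandas as pd

# Hypothetical stand-in for SHOGUN's parsed alignment: the table came
# back empty, so a 'summary' column never exists.
profile = pd.DataFrame()

try:
    profile.groupby('summary')
except KeyError as err:
    print('KeyError:', err)  # prints: KeyError: 'summary'
```

So the exception is consistent with the aligner having produced an empty output file rather than with a bug in the pandas call itself.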
@antgonza also was having the same error on his OS X install, but neither I (on Barnacle) nor @semarpetrus (on his Linux box) were encountering it.
Running SHOGUN directly using the following commands yielded a good alignment + downstream files on Barnacle:
aln_out=foo.align
database=/home/jgsanders/git_sw/qp-shotgun/qp_shotgun/shogun/databases/shogun
level=species
aligner=burst
threads=8
profile=profile.tsv
aln_out_fp=foo.align/alignment.burst.b6
redistributed="profile.${level}.tsv"
fun_output=functional

shogun align \
  --aligner ${aligner} \
  --threads ${threads} \
  --database ${database} \
  --input combined.fna \
  --output ${aln_out}

shogun assign_taxonomy \
  --aligner ${aligner} \
  --database ${database} \
  --input ${aln_out_fp} \
  --output ${profile}

shogun redistribute \
  --database ${database} \
  --level ${level} \
  --input ${profile} \
  --output ${redistributed}

fun_level=$level
shogun functional \
  --database ${database} \
  --input ${profile} \
  --output ${fun_output} \
  --level ${fun_level}
where the test database is here https://github.com/antgonza/qp-shotgun/blob/shogun/qp_shotgun/shogun/databases/shogun.tar.bz2 and the input data are here https://www.dropbox.com/s/ocu4c0ft8vhbjwx/combined.fna?dl=0
Running the same align command on an OS X box (using Gabe's supplied burst15 binary) ran for a bit and then produced an empty .b6 output file.
Running BURST directly on the OS X box produced the following output:
burst15 --references qp_shotgun/shogun/databases/shogun/burst/5min.edx \
  --queries combined.fna \
  --output test.b6 \
  --accelerator qp_shotgun/shogun/databases/shogun/burst/5min.acx
This is BURST [v0.99.7LL]
--> Using accelerator file qp_shotgun/shogun/databases/shogun/burst/5min.acx
Using up to AVX-128 with 8 threads.
--> [Accel] Accelerator found. Parsing...
--> [Accel] Total accelerants: 805949 [bytes = 2106932]
--> [Accel] Reading 0 ambiguous entries
EDB database provided. Parsing...
--> EDB: Fingerprints are DISABLED
--> EDB: Parsing compressed headers
--> EDB: Sheared database (shear size = 515)
--> EDB: 970 refs [970 orig], 61 clumps, 1030 maxR
Parsed 400000 queries (0.071752). Calculating minMax...
Found min 150, max 150 (0.000109).
Converting queries... Converted (0.007549)
Copying queries... Copied (0.002561)
Sorting queries... Sorted (0.088294)
Copying indices... Copied (0.001531)
Determining uniqueness... Done (0.007544). Number unique: 397338
Collecting unique sequences... Done (0.001721)
Creating data structures... Done (0.004528) [maxED: 4]
Determining query ambiguity... Determined (0.023589)
Creating bins... Created (0.011927); Unambig: 391663, ambig: 5675, super-ambig: 0 [5675,397338,397338]
Re-sorting... Re-sorted (0.194431)
Calculating divergence... Calculated (0.009815) [10.120026 avg div; 150 max]
Fingerprints not enabled
Setting QBUNCH to 16
Using ACCELERATOR to align 397338 unique queries...
Search Progress: [100.00%]
Search complete. Consolidating results...
Segmentation fault: 11
What do you think?
— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/knights-lab/SHOGUN/issues/18, or mute the thread https://github.com/notifications/unsubscribe-auth/AHrXBvdct9NKb_Ie48fOmdPloFzcherFks5tqS-8gaJpZM4Tcs1z .
In Travis, we get between 4 GB and 7.5 GB. Note that we are using sudo-enabled builds (more info).
Locally, I have a MacBookPro14,3 with 16 GB of RAM.
@tanaes SHOGUN doesn't pick up the failed signal from BURST? Python's subprocess call should log it.
Under default parameters, it gave no output to STDOUT or STDERR, just produced an empty alignment file.
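On the silent-failure point: on POSIX systems a child process killed by a signal surfaces in Python as a negative return code (e.g. -11 for SIGSEGV), so a wrapper can report the crash even when the child leaves STDOUT/STDERR empty. A minimal sketch of that check (the `sh -c 'kill -SEGV $$'` child is an illustrative stand-in for the real burst15 invocation, not SHOGUN's actual call):

```python
import signal
import subprocess

# Stand-in for the aligner call: this child deliberately dies of SIGSEGV,
# the same signal as the crash above (the command is illustrative only).
proc = subprocess.run(['sh', '-c', 'kill -SEGV $$'])

# On POSIX, a child killed by a signal gets a *negative* returncode,
# so the wrapper can surface the crash even when stderr stays empty.
if proc.returncode < 0:
    print('aligner killed by', signal.Signals(-proc.returncode).name)
elif proc.returncode != 0:
    print('aligner exited with status', proc.returncode)
```

A check like this in the SHOGUN wrapper would have turned the empty alignment file into an immediate, descriptive error.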
What command was used to build the database? Also, does the attached linux binary (compiled from the same code used to compile the Mac binary) work on your high-RAM linux systems? Trying to rule out database creation commands as well as differences in code since the older existing linux version.
I ran
burst15 -r 5min.fna -a 5min.acx -o 5min.edx -d DNA -s
Then aligned with
burst15 -r 5min.edx -a 5min.acx -q combined.fna -o test.b6
According to my run with /usr/bin/time -v, this took 12 GB of RAM to run. Insufficient RAM might then explain the Travis failure, but it's unclear what's causing the Mac failure (unless you had over 4 GB consumed by other programs at runtime, leaving less than 12 GB for burst15).
BURST15 will always reserve ~8GB (the size of the index table in the "database15" mode, adjusted for number of threads) plus the size of the database itself (minimum 4GB), so it'll yank 12GB to run (burst12 can run in under 128MB so that's the one recommended for laptops!).
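As a rough sanity check on those numbers (my arithmetic from the figures in this thread, not an official BURST formula): the floor is the ~8 GB index table plus the database itself, with a 4 GB minimum:

```python
# Back-of-the-envelope RAM floor for burst15, from the figures quoted above.
# The constants are my reading of the comment, not BURST documentation.
INDEX_TABLE_GB = 8   # approximate size of the "database15"-mode index table
DB_MINIMUM_GB = 4    # minimum reservation for the database itself

def burst15_ram_floor_gb(db_size_gb):
    """Approximate minimum RAM burst15 will reserve, in GB."""
    return INDEX_TABLE_GB + max(db_size_gb, DB_MINIMUM_GB)

# Even the tiny 5min test database hits the 12 GB floor measured
# with /usr/bin/time -v above:
print(burst15_ram_floor_gb(0.002))  # prints: 12
```

Under this estimate any machine with 8 GB (or a Travis worker with 4-7.5 GB) is below the floor regardless of how small the database is, which fits the observed failures.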
Thanks! I'll let @tanaes answer those specific questions. Just out of curiosity, will 15/12 yield the same results? Either way, what are the differences?
@GabeAl The attached binary does indeed segfault on our high memory linux machine. Here's the output (here, the ./burst15 is the one attached above):
☕ barnacle:qp-shotgun $ ./burst15 \
> --references qp_shotgun/shogun/databases/shogun/burst/5min.edx \
> --queries combined.fna \
> --output test.b6 \
> --accelerator qp_shotgun/shogun/databases/shogun/burst/5min.acx
This is BURST [v0.99.7LL]
--> Using accelerator file qp_shotgun/shogun/databases/shogun/burst/5min.acx
Using up to AVX-128 with 24 threads.
--> [Accel] Accelerator found. Parsing...
--> [Accel] Total accelerants: 805949 [bytes = 2106932]
--> [Accel] Reading 0 ambiguous entries
EDB database provided. Parsing...
--> EDB: Fingerprints are DISABLED
--> EDB: Parsing compressed headers
--> EDB: Sheared database (shear size = 515)
--> EDB: 970 refs [970 orig], 61 clumps, 1030 maxR
Parsed 400000 queries (0.089528). Calculating minMax...
Found min 150, max 150 (0.000125).
Converting queries... Converted (0.007726)
Copying queries... Copied (0.004054)
Sorting queries... Sorted (0.125254)
Copying indices... Copied (0.000616)
Determining uniqueness... Done (0.004894). Number unique: 397338
Collecting unique sequences... Done (0.001327)
Creating data structures... Done (0.006473) [maxED: 4]
Determining query ambiguity... Determined (0.012322)
Creating bins... Created (0.012095); Unambig: 391663, ambig: 5675, super-ambig: 0 [5675,397338,397338]
Re-sorting... Re-sorted (0.322825)
Calculating divergence... Calculated (0.007467) [10.120026 avg div; 150 max]
Fingerprints not enabled
Setting QBUNCH to 16
Using ACCELERATOR to align 397338 unique queries...
Search Progress: [100.00%]
Search complete. Consolidating results...
Segmentation fault (core dumped)
☕ barnacle:qp-shotgun $ ls
burst15 combined.fna LICENSE qp_shotgun README.rst scripts setup.py support_files test test.b6
☕ barnacle:qp-shotgun $ ~/miniconda/envs/oecophylla-shogun/bin/burst15 \
> --references qp_shotgun/shogun/databases/shogun/burst/5min.edx \
> --queries combined.fna \
> --output test.b6 \
> --accelerator qp_shotgun/shogun/databases/shogun/burst/5min.acx
This is BURST [v0.99.7f]
--> Using accelerator file qp_shotgun/shogun/databases/shogun/burst/5min.acx
Using up to AVX-128 with 24 threads.
--> [Accel] Accelerator found. Parsing...
--> [Accel] Total accelerants: 805949 [bytes = 2106932]
--> [Accel] Reading 0 ambiguous entries
EDB database provided. Parsing...
--> EDB: Fingerprints are DISABLED
--> EDB: Parsing compressed headers
--> EDB: Sheared database (shear size = 515)
--> EDB: 970 refs [970 orig], 61 clumps, 1030 maxR
Parsed 400000 queries (0.085349). Calculating minMax...
Found min 150, max 150 (0.000108).
Converting queries... Converted (0.007505)
Copying queries... Copied (0.004179)
Sorting queries... Sorted (0.131057)
Copying indices... Copied (0.006557)
Determining uniqueness... Done (0.006628). Number unique: 397338
Collecting unique sequences... Done (0.005024)
Creating data structures... Done (0.007195) [maxED: 4]
Determining query ambiguity... Determined (0.018151)
Creating bins... Created (0.016560); Unambig: 391663, ambig: 5675, super-ambig: 0 [5675,397338,397338]
Re-sorting... Re-sorted (0.340644)
Calculating divergence... Calculated (0.007354) [10.120026 avg div; 150 max]
Fingerprints not enabled
Setting QBUNCH to 16
Using ACCELERATOR to align 397338 unique queries...
Search Progress: [100.00%]
Search complete. Consolidating results...
CAPITALIST: Processed 329 investments
Alignment time: 42.566155 seconds
What's the difference, again, between burst12 and burst15? Does the database need to be reindexed for one vs the other?
This is indeed interesting. Could you share the commandline that was used to make the burst database? It seems to differ from what I used here: burst15 -r 5min.fna -a 5min.acx -o 5min.edx -d DNA -s
In any case, there may be a combination bug that arises from some mix of DB commandline and the most recent changes to CAPITALIST (and/or tallying reads in general).
A couple of questions to help me home in:
As for the difference between burst12 and burst15: burst12 is primarily intended for amplicon databases. It uses a much more RAM-friendly indexing scheme for small databases. For large (>4 GB) databases, burst15 is recommended for speed.
As such, while the "edx" will work fine between the two versions, the "acx" is specific to one or the other (whichever version was used to make it).
Awesome, thanks for the clarification. I’ll try remaking the database and see how it goes.
Were you able to solve your problem by rebuilding the database?
I ran into a similar issue. I wasn't able to get SHOGUN working with burst, since the latest official release of burst, v0.99.8, didn't even compile on my Linux machine (the source release contains syntax errors!).
So I installed bowtie2 and I ran SHOGUN with --aligner bowtie2. It kept crunching for about 18 minutes (htop was showing that the bowtie2 process was running), then I got the KeyError: 'summary' exception from Python. I don't know if bowtie2 segfaulted though.
The source likely doesn't contain syntax errors, it just requires the Intel compiler and architecture-specific optimization flags because of the assembly instructions included.
It is highly, highly recommended to grab the prebuilt binary for BURST from the Releases section of the repo.
Thanks, Gabe
[edit] D'oh, I found the test files in the very first post! I'm assuming you're using the same ones. I don't have a Mac, but maybe I can spin up a VM to test this.
What's the memory on the machine you're running it on?
Thanks a bunch, Gabe
Also, what were the commands run to produce the database itself? Databases aren't compatible across major BURST releases.
What's the difference, again, between burst12 and burst15? Does the database need to be reindexed for one vs the other?
Yes. DB15 and DB12 have fundamentally different database structures. Major releases of BURST (lettered releases are minor, numbered ones are major) may also have incompatibilities. I think this should be detected if an older database, or a database made with a different DB version of BURST, is used. I believe later versions of BURST (i.e. newer than the 0.97 series) will do this detection automatically, but perhaps SHOGUN should implement this check in the wrapper first, or warn when pointing to a DB it knows it shipped with an earlier version.
DB12 is for low-RAM alignment. It is slower, and primarily intended for amplicons. Burst15 is for higher-RAM alignment and intended for shotgun. This is vaguely similar to the difference between bowtie2-align-s and bowtie2-align-l, which are also non-interchangeable, but the python wrapper "bowtie2" sorts out which should be called with which.
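The wrapper-side check suggested above could be as simple as recording which BURST build wrote each index and refusing to mix them. A hypothetical sketch (the metadata.json file and its field names are invented for illustration; SHOGUN does not ship such a file):

```python
import json
from pathlib import Path

def check_db_compat(db_dir, aligner_version):
    """Fail early if the index was built by an incompatible BURST.

    Assumes a hypothetical metadata.json written at database-build time,
    e.g. {"burst_version": "0.99.7", "flavor": "burst15"}.
    """
    meta_path = Path(db_dir) / 'metadata.json'
    if not meta_path.exists():
        return  # nothing recorded; fall back to BURST's own detection
    meta = json.loads(meta_path.read_text())
    # Numbered releases are major, letters are minor: compare "0.99" parts.
    built_major = '.'.join(meta['burst_version'].split('.')[:2])
    run_major = '.'.join(aligner_version.split('.')[:2])
    if built_major != run_major or meta.get('flavor') != 'burst15':
        raise RuntimeError(
            f"database built with {meta.get('flavor')} "
            f"{meta['burst_version']}, but running burst15 "
            f"{aligner_version}; please rebuild the index")
```

This mirrors the bowtie2 wrapper-script analogy: the user-facing entry point, not the aligner binary, decides whether the index and binary match.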
@GabeAl Hey, no, thank you for getting back to this!
Just to put this in context, I'm familiar with building C code from source. It's not an unsupported assembly extension: the particular syntax error I noticed was a missing closing curly brace here. After I added the closing curly on the next line, the compiler went ahead and complained about a type error here, which is an assignment of a QPod to a value of type QPod *; judging from the surrounding code, it's probably a missing dereference. Then there are the redeclarations of numBins, RefCache and StCache here. I could imagine that the latter is something the Intel compiler accepts. After I removed those, the code compiled just fine using -march=native with GCC 7.1. (I must admit, it might not do what it does under the Intel cc, though.)
I have since tried SHOGUN with the Linux binary downloadable from the same release (which advertises itself as burst15), with no success, unfortunately. Based on what several others suggested above, it might very well be that I simply don't have enough RAM; I'll be able to check this possibility soon, once I have access to a beefier machine. I have 8 GB in my Linux box, which seems to be close but no cigar.
I didn't build the databases myself; I simply downloaded the pre-built ones, as suggested by the very last paragraph of this part of the README.
Cheers, Árpád
Thanks H2CO3!
Oh I see -- the current source indeed looks like it's for a WIP version and updates stopped after that. Later versions (completing the WIP, going into the 0.99.8 series, etc) must have never gotten pushed. I will push my local copy up.
Done. Let me know.
Cheerio, Gabe
Awesome, thanks for that!