splatlab / squeakr

Squeakr: An Exact and Approximate k -mer Counting System
BSD 3-Clause "New" or "Revised" License

segfault *after* kmer counts complete? #19

Open ttriche opened 6 years ago

ttriche commented 6 years ago

So I'm under the impression that

1) mantis needs squeakr-exact to create .ser files,
2) these .ser files can then be merged for querying, and
3) squeakr-count is now multithreaded.

But this doesn't seem to work out so well in practice:

[tim.triche@node069 single]$ THREADS=`cat /proc/cpuinfo | grep proc | wc -l` 
[tim.triche@node069 single]$ echo $THREADS 
80
[tim.triche@node069 single]$ free -h
              total        used        free      shared  buff/cache   available
Mem:           250G         10G        239G         46M        907M        238G
Swap:           11G          0B         11G

OK, looks good. Now let's take a therapy-related AML patient's ancient RNA-seq data and index it:

[tim.triche@node069 single]$ squeakr-count -g -k 31 -s 31 -t $THREADS SRR621698.fastq.gz 
Reading from the fastq file and inserting in the QF
Total Time Elapsed: 184.994003seconds
Calc freq distribution: 
Total Time Elapsed: 8.228049seconds
Maximum freq: 329368
Num distinct elem: 312966013
Total num elems: 2172228383
Segmentation fault

Whoops? Any ideas for debugging and unit testing are appreciated, since I'd like to scale this up for various search types. Thanks for a great tool and your support in getting it to run smoothly :-)

ttriche commented 6 years ago

N.B. The .ser file is actually a little bigger than the gzipped FASTQ, so maybe I can use fewer slots?

rob-p commented 6 years ago

Hi @ttriche,

Just wanted to let you know this is on our radar. We're preparing the camera-ready for mantis, which is due later this week, and should be able to look into this then. Thanks for all of your useful feedback and testing!

ttriche commented 6 years ago

Awesome, thanks much. I figured that might be the reason!

--t

prashantpandey commented 6 years ago

Hi Tim,

I have added a script to the repo that can estimate the log of the number of slots needed as an argument to Squeakr. This script needs as input the output file generated by ntCard.

Now you can always get a good enough estimate to decide the size of the CQF. Could you please try the script, run Squeakr again, and see whether the bug still shows up?
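For illustration only, here is a rough sketch of the kind of arithmetic involved in picking the log of the number of slots. It is not the script from the repo. The function names and the 0.95 maximum load factor are assumptions made for this example, and the estimate uses only the number of distinct k-mers (the "Num distinct elem" figure reported above, which ntCard can also estimate). K-mers with high counts take up additional slots for their counters, so treat the result as a lower bound rather than a guaranteed safe value.

```python
import math


def estimate_log_slots(distinct_kmers: int, max_load: float = 0.95) -> int:
    """Return the smallest s such that 2**s slots hold `distinct_kmers`
    at or below `max_load`.

    Hypothetical helper: `max_load` is an assumed target, and slots
    consumed by counters for repeated k-mers are ignored.
    """
    if distinct_kmers <= 0:
        raise ValueError("distinct_kmers must be positive")
    if not 0.0 < max_load <= 1.0:
        raise ValueError("max_load must be in (0, 1]")
    return max(1, math.ceil(math.log2(distinct_kmers / max_load)))


def load_factor(distinct_kmers: int, log_slots: int) -> float:
    """Fraction of 2**log_slots slots used if each distinct k-mer took one slot."""
    return distinct_kmers / (2 ** log_slots)


if __name__ == "__main__":
    distinct = 312_966_013  # "Num distinct elem" from the run in this issue
    s = estimate_log_slots(distinct)
    print(f"estimated -s value (lower bound): {s}")
    print(f"load at -s 31: {load_factor(distinct, 31):.1%}")
    print(f"load at -s {s}: {load_factor(distinct, s):.1%}")
```

By this rough measure, the run above with -s 31 leaves most of the filter empty, which is consistent with the .ser file being larger than expected.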

Thanks, Prashant

ttriche commented 6 years ago

Will do -- I was using ntCard as suggested for this, so I should already have that handy :-)

thank you! will report back.

--t

rtjohnso commented 6 years ago

Hi Tim,

Just checking in. Can we close this issue?

Best, Rob

ttriche commented 6 years ago

Let me check and I will update on Monday. Sorry for the delay.

--t
