Closed cagaser closed 4 years ago
The BK-trees are becoming extremely deep, so the recursive calls that traverse them overflow the stack. Increasing the stack size should solve this (I see you have already increased the memory size): use -Xss1G, or something larger or smaller depending on your task and available memory. The default stack size is not suitable for most tasks, especially when you are handling extremely large files. Of course, there could be an infinite loop bug, but we will only know that if you try the program with a larger stack size.
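To illustrate why tree depth matters here, below is a minimal sketch, not the tool's actual code. The `DeepTreeDemo` class, its `Node` type, and the method names are all hypothetical. It builds a degenerate tree (a single long chain) and counts its nodes twice: recursively, which uses one JVM stack frame per level, and iteratively with an explicit stack kept on the heap.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class DeepTreeDemo {
    // Hypothetical minimal tree node; not the tool's actual BK-tree class.
    static final class Node {
        final List<Node> children = new ArrayList<>();
    }

    // Builds a degenerate tree: a single chain of `depth` nodes.
    static Node chain(int depth) {
        Node root = new Node();
        Node cur = root;
        for (int i = 1; i < depth; i++) {
            Node next = new Node();
            cur.children.add(next);
            cur = next;
        }
        return root;
    }

    // Recursive traversal: one JVM stack frame per tree level,
    // so the maximum depth is bounded by the thread stack size (-Xss).
    static int countRecursive(Node n) {
        int count = 1;
        for (Node c : n.children) {
            count += countRecursive(c);
        }
        return count;
    }

    // Iterative traversal: pending nodes are stored in a heap-allocated
    // deque, so depth is bounded by heap size rather than stack size.
    static int countIterative(Node root) {
        Deque<Node> stack = new ArrayDeque<>();
        stack.push(root);
        int count = 0;
        while (!stack.isEmpty()) {
            Node n = stack.pop();
            count++;
            for (Node c : n.children) {
                stack.push(c);
            }
        }
        return count;
    }

    public static void main(String[] args) {
        Node deep = chain(1_000_000);
        System.out.println("iterative: " + countIterative(deep));
        try {
            System.out.println("recursive: " + countRecursive(deep));
        } catch (StackOverflowError e) {
            // Expected with a small stack; a larger -Xss raises the limit.
            System.out.println("recursive: StackOverflowError");
        }
    }
}
```

Running this with a small stack and again with a larger value such as -Xss1G should show the recursive count failing in the first case and succeeding in the second, while the iterative count is unaffected by the stack size.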
If memory size becomes an issue, but a (probably much) slower speed is fine, try --data ngram (no tree pointer overhead) or --data bktree (UMIs are not duplicated) instead.
I am curious though, how many reads/unique UMIs do you have? It seems like an extremely large amount. Were other tools able to handle that data?
Hi, thank you for your reply.
I just checked the number of unique UMIs, and I have around 4.3 million. I tried JE-suite and UMI-tools as well, but I ran into the same high memory footprint with both, so I was hoping I could try this tool. With UMI-tools, the run stopped at one location with roughly 4600 reads. I was using 180G of memory back then.
The tool is running perfectly after using --data ngram and -Xss1G.
Thank you very much for your help!
Hey, can you please try it with just -Xss1G and no --data ngram, so it is using the default n-gram BK-trees data structure? I want to see if there is still a stack overflow error, because that may indicate a bug in the code. For my experiments, I have deduplicated 1 million+ UMIs with just the 16 GB of memory on my laptop.
Hey Daniel, I just did, and it works as well without --data ngram.
Alright. Glad it works.
Hello,
I'm running the tool as follows:
But I'm getting the following error: