jennaj opened this issue 2 months ago
deepTools uses a wide range of memory allocations per individual tool, but I have increased it to 14GB on .org for now.
Thanks @natefoo - when will the updated RAM allocation take effect? As soon as I saw this I got excited, reran the same job, and got an OOM failure - the info page has no details on the allocation.
Here's a shared history with the failed job at the top if that helps.
It is, the job had 14GB to work with. Forgive my ignorance, but something as simple as creating bigwig files from sorted bed files should not be using more than a couple of MB of memory. Could you use something like https://github.com/deeptools/pyBigWig?tab=readme-ov-file#adding-entries-to-a-bigwig-file ? I'd also be relatively surprised if deeptools didn't use pyBigWig, since Devon is the author of both.
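For reference, a minimal sketch of the incremental-write pattern that README section describes, assuming pyBigWig is available; the chromosome names, lengths and values below are placeholders for illustration, not anything from the failing job:

```python
# Minimal sketch of the incremental bigwig write pattern from the linked
# pyBigWig README section. Chromosome names, lengths and values are
# placeholders for illustration only.
import pyBigWig

bw = pyBigWig.open("dimer_density.bw", "w")
# the header needs every contig name and length, in the order entries
# will be written
bw.addHeader([("chr1", 1_000_000), ("chr2", 500_000)])
# entries can be appended in position order as they are computed, so the
# full track never has to be held in memory
bw.addEntries(["chr1", "chr1"], [0, 100], ends=[100, 200], values=[1.0, 2.0])
bw.addEntries(["chr2"], [0], ends=[150], values=[0.5])
bw.close()
```

Entries have to be added in sorted order, but nothing beyond the current batch needs to stay in memory.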
Forgive my ignorance, but something as simple as creating bigwig files from sorted bed files should not be using more than a couple of MB of memory.
@mvdbeek: Thanks, and yes, there's a little more to it than that - all the short tandem repeats in 3-4GB of fasta need to be found first. Birds seem to have a lot of them. Will take a look, but it works without any problem as is on EU, with the same data and whatever setup they have there - so at least we know it can be made to work.
We used to be able to see maximum RAM use on the information page, but no longer, unfortunately. There is also a backstory: JBrowse2 refuses to read bigwigs created by pybigtools (not pybigwig - a more recent package), although they worked fine with other UCSC bigwig tools and browsers - so I gave up and replaced that with ucsc-bedgraphtobigwig. But that might not be the problem. There's a new version of the STR finder, so I'll bump that FWIW.
@mvdbeek: Configured not to make a bigwig, the most recent failure in that history also went OOM, so making the bigwig is not the source of the OOM problem - that run only ran pytrf. 14GB and still OOM without a bigwig - yet it runs OK on EU, where the info page says it was allocated 8GB FWIW. There's a (now merged) PR for a pytrf version upgrade - will test when that propagates to .org
tl;dr Looks like EU's 8GB is highly elastic and the OOM killer is sleepy. The pytrf STR finder package uses 32GB or more to generate these VGP dimer tracks. Replacing the bigwig generator code as suggested will probably not help.
@natefoo @jennaj @mvdbeek: Just ran the failing .org job command line WITHOUT producing a bigwig for the TC dimers for a 2.9GB fasta, and watched top - memory use rose steadily and hit 31.5GB toward the end of the run.
python find_str.py --fasta /evol/html/jbrowse/combined_hg002.fa --bed test.bed --specific 'TC' --monomin '100' --dimin '1' --trimin '50' --tetramin '50' --pentamin '50' --hexamin '50'
The resulting bed is 2.5GB which is why the tool offers bigwig output - the beds for dimer density are far too busy - but bigwigs are ~200MB so they load faster.
psrecord gives a lower estimate of RAM usage - not sure why it differs from top - this was with 3 second samples.
Reran the command line to make a bigwig instead of a bed and used 20 second samples, so coarser, but it seemed to peak at slightly less RAM - so the bigwig is not the RAM problem.
If the issue is not the bigwig, I don't see a problem with allocating more memory. Thanks for checking @fubar2!
I'll give it 58GB. Runtime is pretty short anyway, but if you feel like it, https://github.com/fubar2/microsatbed/blob/main/find_str.py#L51 seems like it could be parallelized effectively, and you could write out bed entries to disk instead of collecting them in memory in a list (see the sketch below). Also keep in mind that list comprehensions are eager: all the memory is going to be allocated, and since you assign the results to new variables, the old variables also cannot be collected. A good old for loop would be considerably better.
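Something like this, as a rough sketch of the streaming pattern described above - find_repeats() is a hypothetical stand-in for the per-contig pytrf scan in find_str.py, not its actual API:

```python
# Rough sketch of the streaming pattern suggested above: write each bed
# record as it is produced instead of accumulating lists per contig.
# find_repeats() is a hypothetical stand-in for the per-contig pytrf call
# in find_str.py, not its real API.
def write_bed(contigs, bed_path):
    with open(bed_path, "w") as bed:
        for name, seq in contigs:                # one contig at a time
            for rep in find_repeats(name, seq):  # hypothetical generator
                # only the current record is alive here; nothing is retained
                bed.write(f"{rep.chrom}\t{rep.start}\t{rep.end}\t{rep.motif}\n")
```

The same pattern would work for the bigwig path if the entries are added inside the loop rather than collected first.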
@mvdbeek - thanks - agreed - done - the comprehensions were replaced, and that's what was used for those tests yesterday. PR now in. Yes, it's serial by contig so it could be parallel - but runtime is not a problem and it needs someone with those skills.
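If anyone does pick up the parallel part later, a rough standard-library-only sketch - process_contig() is hypothetical and would wrap the existing per-contig scan, writing a per-contig bed fragment to disk and returning its path:

```python
# Rough standard-library-only sketch of per-contig parallelism.
# process_contig() is hypothetical: it would wrap the existing per-contig
# scan, write a per-contig bed fragment, and return its path.
from concurrent.futures import ProcessPoolExecutor

def scan_contigs(contigs, workers=4):
    # each worker handles one contig at a time, so peak RAM is roughly
    # `workers` contigs' worth of results rather than the whole genome
    with ProcessPoolExecutor(max_workers=workers) as pool:
        fragment_paths = list(pool.map(process_contig, contigs))
    # fragments come back in contig order and can be concatenated afterwards
    return fragment_paths
```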
@mvdbeek: That worked for that job - thanks!
I can make it go OOM with a small adjustment to the parameters. Will retest those other parameters once the new version with less list comprehension bloat is available, but let's see how it goes in VGP workflows once the other tools are working.
Is this one still good at 58 GB?
All good today - need to check the figures to see how much they really used...
http://toolshed.g2.bx.psu.edu/repos/iuc/microsatbed/microsatbed/1.3.2+galaxy1
Ross @fubar2 suggests increasing the allocation to match deepTools, since this tool also makes use of ucsc-bedgraphtobigwig.