macmanes-lab / Oyster_River_Protocol

Official Repository of the Oyster River Protocol for Transcriptome Assembly
Creative Commons Zero v1.0 Universal

The _salmon.XXXX.stderr file size balloons out of control #50

Open jemcquillan opened 1 year ago

jemcquillan commented 1 year ago

When running ORP 2.3.3, the _salmon.XXXX.stderr file grows without bound, reaching terabytes in size and completely filling the space allotted to my HPC partition.

Certain CBins in the read_partitions directory, which should be a few megabytes each, end up using terabytes. The cause appears to be that the "Building BooPHF" step is caught in a seemingly endless loop, constantly printing the elapsed time and estimated finishing time to the _salmon.XXXX.stderr file.
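For anyone who wants to catch this before the partition fills up, here is a minimal sketch that scans a directory tree for oversized stderr files. It is not part of ORP or Salmon. The function name `find_oversized_stderr`, the glob pattern `*_salmon.*.stderr`, and the 1 GiB default threshold are all assumptions chosen for illustration, so adjust them to match the file names on your system.

```python
from pathlib import Path


def find_oversized_stderr(root, limit_bytes=1 << 30, pattern="*_salmon.*.stderr"):
    """Return (path, size_in_bytes) pairs for files under `root` that match
    `pattern` and are larger than `limit_bytes`, largest first.

    The pattern and the threshold are illustrative assumptions, not ORP settings.
    """
    hits = []
    for path in Path(root).rglob(pattern):
        try:
            if not path.is_file():
                continue
            size = path.stat().st_size
        except OSError:
            # The file may have been removed while scanning; skip it.
            continue
        if size > limit_bytes:
            hits.append((path, size))
    # Sort so the worst offenders are listed first.
    return sorted(hits, key=lambda item: item[1], reverse=True)
```

Calling `find_oversized_stderr("read_partitions")` from the assembly working directory returns the matching files above the threshold, which could be checked periodically while the run is in progress.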

I have attached a PDF overview of the issue and the "hacky" fix. This is something to look into. It appears to stem from BBHash, a Salmon dependency - https://github.com/rizkg/BBHash

Specifically, the BooPHF.h header file - https://github.com/rizkg/BBHash/blob/master/BooPHF.h

Thank you for looking into it.

ORP-2.3.3 Ballooning issue..pdf

jemcquillan commented 1 year ago

This may also be outside the ORP's purview and may need to be cross-referenced with the BBHash project.