cerebis / bin3C

Extract metagenome-assembled genomes (MAGs) from metagenomic data using Hi-C.
GNU Affero General Public License v3.0

Memory Error while running in VirtualBox #32

Open cerebis opened 4 years ago

cerebis commented 4 years ago

Dear cerebis, I am having a similar problem. I managed to generate the contact map, but it is 58 Mb, and when I run it in VirtualBox (Ubuntu) I get the same error: MemoryError.

Could you explain how to decrease the size of the contact map? Should this be done during the assembly itself, or by adding a specific flag to the commands?

Thank you very much for your attention.

Originally posted by @davidcalfran in https://github.com/cerebis/bin3C/issues/17#issuecomment-604319263

cerebis commented 4 years ago

@davidcalfran.

It will be hard to advise you from the information you've supplied.

bin3C is efficient in its use of memory, but metagenomic analysis can be resource hungry.

The size of the contact map is dictated by the size of the problem, but you can make the problem smaller by increasing the minimum accepted reference length when making the map (--min-reflen). The default is 1000 bp; you could try doubling it to see whether your problem then fits into available memory.

Just in case you decreased --min-reflen, keep in mind that in my validation work on bin3C, contigs shorter than 1 kbp do not add a great deal to the completeness of MAGs but do increase the required memory. Further, the long tail of small contigs commonly seen in metagenomic assemblies means that lowering this threshold can drastically increase the problem size.
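To gauge how much a different --min-reflen would change the problem size before rebuilding the map, one option is to tally the contig lengths in the assembly. The following is only a minimal sketch, assuming an uncompressed FASTA file. The function names are hypothetical helpers and not part of bin3C, and the inclusive comparison (>=) is an assumption about how the threshold is applied.

```python
def contig_lengths(handle):
    """Yield the sequence length of each record from an iterable of FASTA lines."""
    length = None  # None until the first header line is seen
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if length is not None:
                yield length
            length = 0
        elif length is not None:
            length += len(line)
    if length is not None:
        yield length


def retained_counts(lengths, thresholds=(500, 1000, 2000, 5000)):
    """Map each candidate minimum length to the number of contigs kept.

    Assumption: a contig is kept when its length is >= the threshold.
    """
    lengths = list(lengths)
    return {t: sum(1 for n in lengths if n >= t) for t in thresholds}


def report(path):
    """Print how many contigs survive each candidate threshold."""
    with open(path) as fasta:
        for t, kept in retained_counts(contig_lengths(fasta)).items():
            print(f"min length {t:>5} bp: {kept} contigs retained")
```

Calling report("contigs.fasta") (a placeholder file name) prints one line per candidate threshold. A sharp rise in retained contigs as the threshold drops reflects the long tail of small contigs described above.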

Failing to make efficient use of short contigs (<< 1 kbp) is a known limitation of bin3C, but 1 kbp seems to be a common lower threshold for genome binning software.

[ed] fixed a few statements for clarity and typos