DessimozLab / FastOMA

FastOMA is a scalable software package to infer orthology relationships.
Mozilla Public License 2.0

Report step uses very large amount of memory #23

Closed by ens-sb 3 months ago

ens-sb commented 5 months ago

Hello,

I have run FastOMA on about 2,200 genomes. The HOG inference finished fine (after some tweaks to the resource config), but the fastoma_report step failed because it consumes a very large amount of memory (it did not fit in 1,200 GB!). I just wanted to let you know about this in case it is possible to improve the memory usage.

Regards, Botond

sinamajidian commented 5 months ago

Dear Botond

Thanks for reporting this. I was able to reproduce the error, which is rooted in seaborn's sns.displot, used to visualise the protein length distribution in fastoma_notebook_stat.ipynb. Sorry for the inconvenience. We are fixing this and will update the notebook soon.

Best, Sina
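
For readers who still want a length distribution on a very large run, one possible approach is to reduce the protein lengths to per-bin counts in a single streaming pass, so that memory scales with the number of bins rather than the number of proteins. This is a minimal sketch and is not taken from the FastOMA code; the function name binned_length_counts and the bin_width parameter are hypothetical, and it has not been checked whether this avoids whatever made sns.displot so memory hungry.

```python
from collections import Counter
from typing import Dict, Iterable


def binned_length_counts(lengths: Iterable[int], bin_width: int = 50) -> Dict[int, int]:
    """Count protein lengths per fixed-width bin, keyed by the bin's start value.

    Hypothetical helper: `lengths` can be any iterable (for example a generator
    reading one proteome at a time), so the full list never has to be in memory.
    """
    counts: Counter = Counter()
    for length in lengths:
        # Map each length to the lower edge of its bin, e.g. 120 -> 100 for width 50.
        counts[(length // bin_width) * bin_width] += 1
    # Sort by bin start so the result can be passed directly to a bar plot.
    return dict(sorted(counts.items()))
```

The resulting dictionary is small enough to hand to any plotting library as a bar chart.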

sinamajidian commented 3 months ago

We decided not to include this visualization for big datasets and to limit it to fewer than 100 species, simply by using the condition if len(species_list)<100: in the Jupyter report file. Thanks again.
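
The guard described in the last comment might look roughly like the sketch below. Only the condition on len(species_list) and the threshold of 100 come from the thread; the names MAX_SPECIES_FOR_LENGTH_PLOT, maybe_plot_length_distribution and plot_fn are hypothetical, and the actual notebook may structure this differently.

```python
from typing import Callable, Sequence

# Threshold taken from the comment above; the constant name is hypothetical.
MAX_SPECIES_FOR_LENGTH_PLOT = 100


def maybe_plot_length_distribution(
    species_list: Sequence[str],
    plot_fn: Callable[[], None],
) -> bool:
    """Run plot_fn only for small datasets; return True if it was run.

    `plot_fn` stands in for whatever cell draws the protein length
    distribution (for example a call to sns.displot in the notebook).
    """
    if len(species_list) < MAX_SPECIES_FOR_LENGTH_PLOT:
        plot_fn()
        return True
    # For large runs, skip the expensive plot and say so in the report output.
    print(f"Skipping protein length plot for {len(species_list)} species.")
    return False
```

Passing the plotting step as a callable keeps the size check separate from the plotting library, so the check itself can be exercised without loading any data.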