Open wchow opened 1 year ago
Hey,
Sorry for the delayed response.
Hmm... this is a new one 🤔... I think this is an error caused by a process taking up too much memory and then being killed by the OS.
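If you want to confirm the out-of-memory theory, one way (a sketch assuming a Linux host where you can read the kernel log; permissions may vary) is to look for OOM-killer messages after the crash:

```shell
# Check the kernel log for OOM-killer activity
# (assumes Linux; may need sudo depending on dmesg permissions)
dmesg -T | grep -i -E 'out of memory|oom-killer|killed process'

# On systemd machines the kernel journal works too:
journalctl -k | grep -i 'killed process'
```

If a line names the python process running hafeZ, the OS killed it for memory, which matches the multiprocessing error you saw.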
I'll say that hafeZ isn't well suited to running on metagenomic data, as it was designed for use on individual genomes, so I think it's freaking out because it found too many ROIs.
I'd recommend binning your metagenome into MAGs and then rerunning hafeZ on each MAG, as hafeZ's calculations also expect a single genome. For best results I'd also recommend mapping the reads to each MAG ahead of time and using only the reads that mapped to that MAG as input for hafeZ.
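For what it's worth, the per-MAG read-subsetting step could look something like this (a sketch, not hafeZ's own tooling; it assumes `bwa` and `samtools` are installed, and `mag1.fasta` plus the read file names are placeholders for your own files):

```shell
# Index one MAG and map the full metagenome read set against it
# (mag1.fasta, reads_R1.fastq.gz, reads_R2.fastq.gz are placeholder names)
bwa index mag1.fasta
bwa mem -t 8 mag1.fasta reads_R1.fastq.gz reads_R2.fastq.gz \
  | samtools sort -o mag1.bam -
samtools index mag1.bam

# Keep only reads mapped in a proper pair (-f 2) and write them back
# out as FASTQ, to be used as the read input for hafeZ on this MAG
samtools fastq -f 2 \
  -1 mag1_R1.fastq.gz -2 mag1_R2.fastq.gz \
  -0 /dev/null -s /dev/null \
  mag1.bam
```

Repeating this per MAG keeps each hafeZ run's coverage model restricted to reads that actually belong to that genome.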
'True' metagenome functionality is something I'd like to add in the future, but I won't be adding it for a wee while.
Hope that helps!
Hi @Chrisjrt ,
Thanks for the information, I figured it might be me trying to throw the kitchen sink at the tool. I'll have a bit of a think as well. Out of curiosity, are there any cases where a drop in coverage below the baseline can indicate an insertion event (e.g. if it is a low-frequency/rare event)? Thanks!
Hi,
Cheers on creating hafeZ. I was trying to run it on my own dataset but encountered a Python multiprocessing error during the step "Calculating median Z for each roi".
For context, I'm running it against a metagenome sample of roughly 38k contigs, mapping around 70K paired-end Illumina reads. My runtime command is:
The output error is:
Any idea what the issue could be? Do I need to downsample?
Thanks again for your help.
Will