schorlton closed this issue 2 years ago
I saw "0 homologous sequence need to download:" in your log. This is very likely because no related genomes in your sketch satisfy the 95% ANI threshold against consensus.fasta. Are you sure they are related species? The threshold can be lowered, but we rarely do so except when polishing highly mutated viruses.
--mash_threshold MASH_THRESHOLD
Mash output threshold. [0.95]
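For illustration, here is a minimal sketch of how a threshold like this could be applied to `mash screen` output. It is not Homopolish's actual code. It assumes the threshold is compared against the identity column (the first of the tab-separated columns: identity, shared-hashes, median-multiplicity, p-value, query-ID, query-comment). The function name `filter_screen_hits` and the example rows are hypothetical.

```python
def filter_screen_hits(lines, threshold=0.95):
    """Return (identity, genome_id) pairs whose identity meets the threshold.

    Assumes each line is one tab-separated row of `mash screen` output.
    """
    hits = []
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:
            continue  # skip blank or malformed rows
        identity = float(fields[0])
        if identity >= threshold:
            hits.append((identity, fields[4]))
    return hits


# Hypothetical rows, made up for this example only.
example = [
    "0.99\t950/1000\t1\t0\tgenomeA.fna\tcomment A",
    "0.90\t300/1000\t1\t0\tgenomeB.fna\tcomment B",
]
print(filter_screen_hits(example))
```

With the default 0.95 only the first row survives. If no row passes, the list is empty, which would correspond to the "0 homologous sequence" situation described above.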
No, I am not sure! I think knowing that there is a close genome to your draft assembly for polishing is a very strong assumption. For example, if I want to polish a metagenome or unknown bacterial isolate, then I need to pre-check that my organism(s) are contained in NCBI.
Instead, can I suggest that Homopolish just outputs unpolished contigs if there are no close matches with a warning? I was under the impression that it did this already.
You are right. We should prompt the user and output this way. This should be of higher priority. Will revise it next week.
We updated the code on GitHub. It now outputs the unpolished contigs and warns the user when too few related genomes (<5) are found. Thanks for your helpful suggestions.
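The behaviour described here can be sketched roughly as follows. This is an assumption-laden illustration, not the code that was committed: `polish_or_passthrough`, `polish_fn`, and `MIN_RELATED_GENOMES` are hypothetical names, and the only detail taken from this thread is the cutoff of 5 genomes.

```python
import logging
import shutil

MIN_RELATED_GENOMES = 5  # cutoff taken from the "(<5)" mentioned in this thread


def polish_or_passthrough(draft_path, out_path, related_genomes, polish_fn):
    """Polish the draft, or copy it through unchanged if too few genomes matched.

    polish_fn is a placeholder for the real polishing step; it is called as
    polish_fn(draft_path, out_path, related_genomes).
    Returns True if polishing ran, False if the draft was passed through.
    """
    if len(related_genomes) < MIN_RELATED_GENOMES:
        logging.warning(
            "Only %d related genomes found (need at least %d); "
            "writing unpolished contigs to %s",
            len(related_genomes), MIN_RELATED_GENOMES, out_path,
        )
        shutil.copyfile(draft_path, out_path)  # output is the unpolished draft
        return False
    polish_fn(draft_path, out_path, related_genomes)
    return True
```

The point of the pass-through is that downstream steps always find an output FASTA, and the warning makes it clear that the contigs were not polished.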
Amazing! Thanks for being so quick and receptive. I see you bumped the version to 0.3.3 - could I just get you to tag that and perhaps even trigger the push to bioconda? Thanks again!!
v0.3.3 has been tagged and pushed to bioconda; it should be available soon.
Thank you!!
Thanks again for the great tool. I ran Homopolish 0.3.2 using the command:
python3 homopolish.py polish -a consensus.fasta -m R9.4.pkl -s refseq.msh -o output
Oddly, it doesn't produce any output (i.e. no FASTA file in output) but looks like it succeeded? I'm attaching consensus.fasta.gz, and the mash screen can be found here.
The last couple lines of the log file look like: