Open taltman opened 4 years ago
I would probably not extend the set of HMMs so rapidly and would instead go for a more "iterative" approach:
Spike_torovirin model

Note that 1. above (set of CoV genomes vs Pfam) is already effectively done by https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6295324/, which provides a set of 93 HMMs. We just need to take the newer versions of them, if available (it was based on Pfam release 31, and we're at release 33 these days).
Also, we already have #227 here...
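
For the "take the newer versions" step, here is a rough sketch of how one might check which models changed between two Pfam releases. It assumes both HMM sets are in HMMER3 text format, where each model carries an `ACC` line such as `ACC   PF00001.21`. The function names and the accession/version values in the example are made up for illustration.

```python
import re

# Matches HMMER3 accession lines such as "ACC   PF00001.21"
ACC_RE = re.compile(r"^ACC\s+(PF\d+)\.(\d+)\s*$", re.MULTILINE)


def pfam_versions(hmm_text):
    """Map each Pfam accession (without version) to its integer version."""
    return {acc: int(ver) for acc, ver in ACC_RE.findall(hmm_text)}


def newer_models(old_text, new_text):
    """Return {accession: (old_version, new_version)} for models that were bumped."""
    old = pfam_versions(old_text)
    new = pfam_versions(new_text)
    return {
        acc: (old[acc], new[acc])
        for acc in old
        if acc in new and new[acc] > old[acc]
    }


# Illustrative input only: these accessions and versions are placeholders.
old_set = "NAME  model_a\nACC   PF00001.20\n//\nNAME  model_b\nACC   PF00002.5\n//\n"
new_set = "NAME  model_a\nACC   PF00001.21\n//\nNAME  model_b\nACC   PF00002.5\n//\n"
updated = newer_models(old_set, new_set)
```

Models from the old set that are missing from the new release would need to be checked by hand, since this sketch silently skips them.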
@asl @rcedgar
To help coronaSPAdes identify CoV-associated contigs from the full assembly graph, we need to expand our set of target HMMs.
Versions:
The next version could run the full Pfam HMM library against the following sequences:
- cov3ma
- cov3ma + other Nidovirales genomes

Also, we should figure out whether to run hmmsearch with max sensitivity (--max -E 0.01), or be conservative and use --cut_ga.
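
To make the two options concrete, here is a minimal sketch that builds the hmmsearch command line for either mode. The flags (`--max`, `-E`, `--cut_ga`, `--tblout`) are standard hmmsearch options. The function name, mode labels, and file names are placeholders, and the sequence file is assumed to be a protein FASTA (for example, translated ORFs), since Pfam models are protein HMMs.

```python
def hmmsearch_cmd(hmm_file, seq_file, tblout, mode):
    """Build an hmmsearch argument list for one of the two modes discussed."""
    cmd = ["hmmsearch", "--tblout", tblout]
    if mode == "sensitive":
        # Turn off the heuristic filters and report hits with E-value <= 0.01
        cmd += ["--max", "-E", "0.01"]
    elif mode == "conservative":
        # Use the gathering (GA) thresholds curated in each Pfam model
        cmd += ["--cut_ga"]
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    # hmmsearch expects: [options] <hmmfile> <seqdb>
    return cmd + [hmm_file, seq_file]


# Placeholder file names, for illustration only.
sensitive = hmmsearch_cmd("Pfam-A.hmm", "proteins.faa", "sensitive.tbl", "sensitive")
conservative = hmmsearch_cmd("Pfam-A.hmm", "proteins.faa", "conservative.tbl", "conservative")
```

Either list can be passed to `subprocess.run(cmd, check=True)`. Running both modes on the same input and comparing the two `--tblout` tables would be one way to see how many extra hits the sensitive settings bring in.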