Nextflow DSL2 pipeline to generate a Genome Note, including assembly statistics, quality metrics, and Hi-C contact maps. This workflow is part of the Tree of Life production suite.
Description of feature
We use BUSCO to assess the completeness and fragmentation of an assembly. The main parameter is a lineage name, which needs to be phylogenetically relevant to the species being tested. Our implementations are inconsistent:
- BlobToolKit computes BUSCO against all lineages that are parents of the species, and Rich provides the closest one to Karen.
- genomenote uses eutheria_odb10 rather than the more specific ones (rodents, primates, etc.): https://github.com/sanger-tol/genomenote/blob/1.0.0/bin/get_odb.py#L35-L36
We need to agree on one rule and use it throughout all pipelines.
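
To make the difference between the rules concrete, here is a minimal sketch of two of them: "all parent lineages" and "closest lineage". It is not taken from any of the pipelines. The lookup table, the function names and the simplified ancestor list are assumptions made for illustration, and the table covers only a small subset of the odb10 datasets.

```python
# Hypothetical lookup: taxon name -> BUSCO lineage dataset.
# Illustrative subset only; a real pipeline would need the full dataset list.
BUSCO_LINEAGES = {
    "Eukaryota": "eukaryota_odb10",
    "Metazoa": "metazoa_odb10",
    "Vertebrata": "vertebrata_odb10",
    "Mammalia": "mammalia_odb10",
    "Eutheria": "eutheria_odb10",
    "Glires": "glires_odb10",
    "Primates": "primates_odb10",
}


def matching_lineages(ancestors):
    """Rule 1: every BUSCO lineage that is a parent of the species.

    `ancestors` is ordered from the root to the species, so the result
    is ordered from the most general to the most specific lineage.
    """
    return [BUSCO_LINEAGES[taxon] for taxon in ancestors if taxon in BUSCO_LINEAGES]


def closest_lineage(ancestors):
    """Rule 2: only the most specific matching lineage, or None if none match."""
    matches = matching_lineages(ancestors)
    return matches[-1] if matches else None


# Simplified (assumed) ancestor list for a mouse, root first.
MOUSE = ["Eukaryota", "Metazoa", "Vertebrata", "Mammalia", "Eutheria", "Glires", "Rodentia", "Mus"]

if __name__ == "__main__":
    print(matching_lineages(MOUSE))
    print(closest_lineage(MOUSE))
```

With this table, the first rule runs BUSCO six times for a mouse and the second runs it once with glires_odb10. A fixed choice such as eutheria_odb10 would be a third rule that sits between the two in specificity.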