sanger-tol / genomenote

Nextflow DSL2 pipeline to generate a Genome Note, including assembly statistics, quality metrics, and Hi-C contact maps. This workflow is part of the Tree of Life production suite.
https://pipelines.tol.sanger.ac.uk/genomenote
MIT License

Add BUSCO to the annotation_statistics subworkflow #141

Closed BethYates closed 4 days ago

BethYates commented 2 months ago

Add BUSCO to the annotation_statistics subworkflow. To do this, you can use the existing nf-core busco module (https://nf-co.re/modules/busco/) and run BUSCO in proteins mode, which takes protein FASTA files as input. The subworkflow currently only has a GFF file of the protein annotations, so the protein FASTA files need to be generated first: we can produce them from that GFF file and the genomic FASTA file using https://nf-co.re/modules/gffread/.
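For illustration, the two steps above roughly correspond to the command lines below. This is only a sketch in Python that builds the commands without running them; it is not how the nf-core modules are wired into the subworkflow. The helper names, file names and lineage are hypothetical placeholders. The flags used are gffread's `-g` (genome FASTA) and `-y` (write translated protein FASTA), and BUSCO's `-i`, `-m proteins`, `-l`, `-o` and `-c`.

```python
import shlex


def gffread_protein_cmd(gff, genome_fasta, proteins_out):
    """Build a gffread command that translates the CDS features of a GFF
    into a protein FASTA file, using the genomic FASTA for the sequence."""
    return ["gffread", str(gff), "-g", str(genome_fasta), "-y", str(proteins_out)]


def busco_proteins_cmd(proteins_fasta, lineage, out_name, cpus=1):
    """Build a BUSCO command that runs in proteins mode on a protein FASTA."""
    return [
        "busco",
        "-i", str(proteins_fasta),  # input protein sequences
        "-m", "proteins",           # proteins mode, as requested in this issue
        "-l", lineage,              # lineage dataset (placeholder value below)
        "-o", out_name,             # name of the BUSCO output directory
        "-c", str(cpus),            # number of CPUs
    ]


if __name__ == "__main__":
    # Hypothetical file names and lineage, for illustration only.
    print(shlex.join(gffread_protein_cmd("annotation.gff", "genome.fa", "proteins.fa")))
    print(shlex.join(busco_proteins_cmd("proteins.fa", "example_odb10", "busco_proteins", cpus=4)))
```

In the pipeline itself these commands would be run by the gffread and busco modules, with the GFF and genomic FASTA supplied through the subworkflow's input channels.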

The BUSCO score should be added to the final file output by the subworkflow.
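One possible way to extract the score, assuming the one-line summary that BUSCO writes in its short summary text file (of the form `C:..%[S:..%,D:..%],F:..%,M:..%,n:..`), is sketched below. The function name and the returned keys are hypothetical; the format of the subworkflow's final output file is not specified in this issue, so writing the values into it is left out.

```python
import re

# Pattern for the one-line BUSCO summary; the exact layout is an assumption
# and may differ between BUSCO versions.
SUMMARY_RE = re.compile(
    r"C:(?P<complete>[\d.]+)%"
    r"\[S:(?P<single_copy>[\d.]+)%,D:(?P<duplicated>[\d.]+)%\],"
    r"F:(?P<fragmented>[\d.]+)%,M:(?P<missing>[\d.]+)%,n:(?P<total>\d+)"
)


def parse_busco_summary(text):
    """Return the BUSCO percentages and the number of BUSCOs searched,
    taken from the first summary line found in `text`."""
    match = SUMMARY_RE.search(text)
    if match is None:
        raise ValueError("no BUSCO summary line found")
    scores = {k: float(v) for k, v in match.groupdict().items() if k != "total"}
    scores["total"] = int(match.group("total"))
    return scores


if __name__ == "__main__":
    # Made-up example line, not a real result.
    print(parse_busco_summary("\tC:98.5%[S:97.0%,D:1.5%],F:0.5%,M:1.0%,n:255"))
```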

muffato commented 4 days ago

Closed by #142