nf-core / genomeqc

Compare the quality of multiple genomes, along with their annotations.
https://nf-co.re/genomeqc
MIT License

Using `FASTA_GXF_BUSCO_PLOT` subworkflow #77

Open GallVp opened 2 weeks ago

GallVp commented 2 weeks ago

Hello Team!

I built a reusable subworkflow for running BUSCO on a genome alone, or on a genome and its annotation. I have used it in two other pipelines. It currently lives in my own modules repo: https://github.com/GallVp/nxf-components/blob/7188d37139f8dbdfd73cbfb5b0e3f811d1993392/subworkflows/gallvp/fasta_gxf_busco_plot/main.nf#L7

I think we should share this across pipelines. I can submit it to nf-core/modules and add it here. Is there interest in this, and does it need any changes to suit the needs of this pipeline?

chriswyatt1 commented 1 week ago

That looks really useful, will check it out next week.

FernandoDuarteF commented 1 week ago

I think we should definitely incorporate this into our pipeline.

I was also thinking we should make genome.nf and genome_and_annotation.nf into workflows, as right now they are subworkflows.

FernandoDuarteF commented 5 days ago

We should also have a look at the BUSCO subworkflow from the MAG pipeline. See #85.

GallVp commented 5 days ago

> We should also have a look at the BUSCO subworkflow from the MAG pipeline. See #85.

My implementation does have a `val_busco_lineages_path` input. If it is provided, BUSCO downloads the lineage data there once and reuses it on later runs. If all the data is already present, BUSCO uses it instead of downloading it again.
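As a rough illustration of that download-once-then-reuse behaviour, here is a minimal Python sketch. It is not BUSCO's or the subworkflow's actual code: `ensure_lineage` and the `download` callback are hypothetical names, and `eudicots_odb10` in the usage note is just an example lineage.

```python
from pathlib import Path
from typing import Callable


def ensure_lineage(
    lineages_dir: Path,
    lineage: str,
    download: Callable[[str, Path], None],
) -> Path:
    """Return the local directory for `lineage`, downloading only if it is absent."""
    target = lineages_dir / lineage
    if not target.is_dir():
        # First use: create the shared directory and fetch the dataset into it.
        lineages_dir.mkdir(parents=True, exist_ok=True)
        download(lineage, target)  # hypothetical downloader supplied by the caller
    # Later calls find the directory and skip the download entirely.
    return target
```

Calling `ensure_lineage(path, "eudicots_odb10", download)` twice with the same `path` would invoke `download` only on the first call.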

For cloud environments, this input needs to be extended so that a tarball can be passed, similar to what the MAG pipeline does.
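One way to picture accepting either form is the following Python sketch. It is an assumption about how this could look, not how the MAG pipeline or the subworkflow does it: `stage_lineages` and the `busco_downloads` directory name are hypothetical.

```python
import tarfile
from pathlib import Path


def stage_lineages(source: Path, workdir: Path) -> Path:
    """Return a directory of lineage data, unpacking `source` first if it is a tarball."""
    if source.is_dir():
        # Already a directory: use it in place.
        return source
    if not tarfile.is_tarfile(source):
        raise ValueError(f"{source} is neither a directory nor a tar archive")
    dest = workdir / "busco_downloads"  # hypothetical staging location
    dest.mkdir(parents=True, exist_ok=True)
    # Assumes the archive comes from a trusted source.
    with tarfile.open(source) as archive:
        archive.extractall(dest)
    return dest
```

A single archive is easier to stage from object storage than a directory tree of many small files, which is the main reason to support it in cloud runs.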