Output report discussion

mahesh-panchal commented 10 months ago

Aim

We need a report that summarises the workflow output.

What should be in the report?

Need versioning.
We should have a way to highlight what's changed since the last report.
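
One way to picture "what's changed since the last report" is a plain key-by-key comparison of the summary statistics from two report versions. The sketch below is only an illustration under assumptions: the function name `diff_stats`, the metric names, and all values are hypothetical placeholders, not taken from the workflow.

```python
def diff_stats(previous: dict, current: dict) -> dict:
    """Return {metric: (old, new)} for every metric that was added, removed, or changed."""
    changes = {}
    for key in sorted(previous.keys() | current.keys()):
        old, new = previous.get(key), current.get(key)
        if old != new:
            changes[key] = (old, new)
    return changes


# Hypothetical placeholder values, for illustration only.
previous = {"contig_n50": 1000, "busco_complete_pct": 90.0, "assembler_version": "1.0"}
current = {"contig_n50": 1500, "busco_complete_pct": 90.0, "assembler_version": "1.1", "qv": 50.0}

for metric, (old, new) in diff_stats(previous, current).items():
    print(f"{metric}: {old} -> {new}")
```

Metrics missing from one side show up with `None`, so newly added or dropped analyses are highlighted as well as changed values.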

Can this all be in MultiQC, or do we need a Quarto report or something else?

Decisions

The report creation is automated by the workflow.

mahesh-panchal commented 10 months ago

Repo here: https://github.com/NBISweden/assembly-project-template

mahesh-panchal commented 10 months ago

One suggestion so far is to have a MultiQC report that contains only the statistics for the most up-to-date analysis (which may not be the latest run), and a Quarto website that covers all relevant reruns and exploratory decisions.

mahesh-panchal commented 10 months ago

Quarto can also be used to publish pages to Confluence Cloud, i.e. we could have a space on the wiki that holds the reports for all the organisms.

mahesh-panchal commented 9 months ago

MultiQC supports:

BUSCO
Kraken
QUAST
STAR

Hmm. MultiQC has only a few supported modules that are relevant to us, but I'm positive it is used to summarise results in some Galaxy assembly workflows.

There are some open requests for Merqury, gfastats, and the like, but these are not listed among the supported tools.

gbdias commented 9 months ago

Thoughts on a final report (for delivery to users).

* It is useful to have the report generate a methods paragraph for assembly and curation.
* It could be something like the Sanger DToL reports, but with the software versions included in the text instead of in a separate table.
* The result statistics collected need only focus on the EBP criteria.
* This report is for finished projects, so its generation would need to be triggered after decisions have been made on which version of each analysis will be delivered. In this sense it would be its own subworkflow, not executed by default.
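
A methods paragraph with the software versions embedded in the text could be assembled from a list of the steps that were selected for delivery. This is a minimal sketch under assumptions: the `Step` type, the wording of the sentences, and the `x.y.z` versions are hypothetical placeholders; in practice the versions would come from whatever the workflow records for each tool.

```python
from typing import NamedTuple


class Step(NamedTuple):
    description: str  # what was done, phrased as the start of a sentence
    tool: str         # software used for the step
    version: str      # version string recorded for that software


def methods_paragraph(steps: list[Step]) -> str:
    """Join one sentence per step, with the tool version inline."""
    return " ".join(
        f"{s.description} using {s.tool} (version {s.version})." for s in steps
    )


# Placeholder steps and versions, for illustration only.
steps = [
    Step("Assembly completeness was assessed", "BUSCO", "x.y.z"),
    Step("Contiguity statistics were computed", "QUAST", "x.y.z"),
]
print(methods_paragraph(steps))
```

Keeping the steps as data means the same list could feed both the methods text and a versions table, if one is still wanted.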

mahesh-panchal commented 4 months ago

Thoughts on a final report (for delivery to users).

* It is useful to have the report generate a methods paragraph for assembly and curation.

* It could be something like the Sanger DToL reports, but with the software versions included in the text instead of in a separate table.
  ![Screenshot 2023-09-21 at 12 12 48](https://private-user-images.githubusercontent.com/7614153/269577058-0aac145b-472e-4318-bb54-61d9d4611ea5.png)

* The result statistics collected need only focus on the EBP criteria.
  ![Screenshot 2023-09-21 at 12 22 37](https://private-user-images.githubusercontent.com/7614153/269578977-e1ad7089-8998-4a22-b0e5-5d96117c3ef9.png)

* This report is for finished projects, so its generation would need to be triggered after decisions have been made on which version of each analysis will be delivered. In this sense it would be its own subworkflow, not executed by default.

The screenshot images appear to be dead links now.

mahesh-panchal commented 4 months ago

I figured out that we can also make sections for MultiQC in Quarto, so they can be loaded into the MultiQC report. There are a few more details to take care of, but then we get a single MultiQC report. Proof-of-concept: https://github.com/mahesh-panchal/nextflow-fastqc-quarto
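
To make the idea concrete, here is a rough sketch of producing a file that MultiQC could load as an extra section. It relies on my understanding of MultiQC's custom content feature, where files named `*_mqc.html` are picked up and an HTML comment at the top of the file carries the section settings. It is not necessarily how the proof-of-concept does it; the helper `write_mqc_section`, the section id, and the HTML body are hypothetical, and the body stands in for whatever Quarto would render.

```python
import tempfile
from pathlib import Path


def write_mqc_section(out_dir, section_id, title, description, html_body) -> Path:
    """Write an HTML fragment with a custom-content config header in an HTML comment."""
    header = (
        "<!--\n"
        f"id: '{section_id}'\n"
        f"section_name: '{title}'\n"
        f"description: '{description}'\n"
        "-->\n"
    )
    # The "_mqc.html" suffix is what custom content discovery is assumed to look for.
    path = Path(out_dir) / f"{section_id}_mqc.html"
    path.write_text(header + html_body + "\n", encoding="utf-8")
    return path


with tempfile.TemporaryDirectory() as tmp:
    section = write_mqc_section(
        tmp,
        "assembly_notes",                      # hypothetical section id
        "Assembly notes",
        "Notes rendered outside MultiQC",
        "<p>Placeholder for rendered Quarto output.</p>",
    )
    print(section.read_text(encoding="utf-8"))
```

Running MultiQC over the directory containing such a file should then include the section alongside the supported modules, assuming the custom content behaviour described above.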

NBISweden / Earth-Biogenome-Project-pilot

Output report discussion #43
