broadinstitute / long-read-pipelines

Long read production pipelines
https://broadinstitute.github.io/long-read-pipelines/
BSD 3-Clause "New" or "Revised" License

Task for routine trio-assembly quality assessment #53

Open SHuang-Broad opened 4 years ago

SHuang-Broad commented 4 years ago

Now that we have shown that trio-assembly costs can be lowered to under $10, it makes sense to have a (sub-)pipeline for quality assessment of routine trio assemblies.

Currently, I'm experimenting with BUSCO and U50. Other suggestions/ideas welcome.
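
For reference, here is a minimal sketch of the N50-style calculation that U50 builds on. As I understand the metric, U50 applies this same logic to the lengths of unique, non-overlapping, reference-aligned contig segments rather than to raw contig lengths. The function name `nx_length` and the input list are hypothetical, and deriving the unique segments (alignment and overlap masking) is assumed to happen upstream.

```python
from typing import Sequence


def nx_length(lengths: Sequence[int], fraction: float = 0.5) -> int:
    """Return the N50-style length for a collection of segment lengths.

    This is the length L of the shortest segment such that segments of
    length >= L together cover at least `fraction` of the total length.
    For U50, `lengths` is assumed to hold unique, non-overlapping,
    reference-aligned segment lengths computed elsewhere (hypothetical input).
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")

    ordered = sorted(lengths, reverse=True)  # longest segments first
    threshold = fraction * sum(ordered)

    running = 0
    for length in ordered:
        running += length
        if running >= threshold:
            return length
    return ordered[-1]  # only reachable through floating-point rounding


# Hypothetical segment lengths, for illustration only.
unique_segment_lengths = [100, 80, 60, 40, 20]
print(nx_length(unique_segment_lengths))
```

Calling the same function on raw contig lengths gives the ordinary N50, so the two numbers can be compared side by side for each assembly.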

SHuang-Broad commented 4 years ago

QUAST, a standard assembly evaluation tool, has several evaluation stages with different compute demands.

For example, the contig analyzer step has a huge memory burden: peak memory usage has been observed to reach 180 GB for a 7-assembly assessment job.

Therefore, it makes sense to see whether the different stages can be separated out and exposed individually through an API/CLI.

SHuang-Broad commented 4 years ago

Scrolling through the QUAST mis-assembly reports, I'm beginning to wonder: should there be "best practices" for de novo diploid assemblies?