nf-core / pathogensurveillance

Surveillance of pathogens using population genomics and sequencing
https://nf-co.re/pathogensurveillance
MIT License

Large and complex test data set #92

Open zachary-foster opened 1 month ago

zachary-foster commented 1 month ago

We need a test data set that exercises as many features of the pipeline as possible and imitates data from a diagnostic clinic setting. It could include the following:

Representative pathogenic species for each of the following taxa:

Between 1 and 100 samples for each species, sequenced with both long and short reads. Most species can have 1-5 samples, but it would be nice to have one species with roughly 100 samples.

Columns with fake data that imitate a diagnostic clinic setting:

There should be a report for each client as well as one report containing all samples.
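One way to get per-client reports plus a combined report is to put every sample in two report groups: its client's group and a shared group. The following is a minimal sketch of that idea, not the pipeline's actual input format. The column names (`species`, `client`, `report_group_ids`), the semicolon separator, and all sample, species, and client values are assumptions made up for illustration.

```python
import csv
import io

# Hypothetical samples: (sample_id, species, client). All values are placeholders.
SAMPLES = [
    ("s1", "species_a", "client_1"),
    ("s2", "species_a", "client_2"),
    ("s3", "species_b", "client_1"),
]


def build_rows(samples, all_group="all_samples"):
    """Assign each sample to its client's report group and to one shared group."""
    rows = []
    for sample_id, species, client in samples:
        rows.append({
            "sample_id": sample_id,
            "species": species,
            "client": client,
            # Assumed format: semicolon-separated report group IDs.
            "report_group_ids": f"{client};{all_group}",
        })
    return rows


def to_csv(rows):
    """Serialize the rows as CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


def samples_per_report(rows):
    """Map each report group ID to the sample IDs it would contain."""
    groups = {}
    for row in rows:
        for group in row["report_group_ids"].split(";"):
            groups.setdefault(group, []).append(row["sample_id"])
    return groups


if __name__ == "__main__":
    rows = build_rows(SAMPLES)
    print(to_csv(rows))
    print(samples_per_report(rows))
```

With this layout, adding a client only adds one more group ID, and the combined report always covers every row.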