Arcadia-Science / seqqc

A Nextflow pipeline to identify quality control issues with new sequencing data.
MIT License

probably not a good idea to contain SRR accessions in the database #14

Closed (taylorreiter closed 1 year ago)

taylorreiter commented 1 year ago

While the code to do this is really cool, it's highly likely that the SRR accessions are themselves contaminated (e.g. with kit contamination!), so we probably shouldn't add them to the database. I bet we would frequently match against them, and it would be hard to interpret what those matches actually mean.

We could potentially filter to just the really highly abundant stuff, but even then we might pick up a bunch of adapters or PhiX or something.
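To make the abundance-filter concern concrete, here is a minimal toy sketch. It is not seqqc's actual code: the k-mer counting approach, the function names `kmers` and `abundant_kmers`, the threshold, and the reads and adapter sequence are all assumptions chosen for illustration. It shows that when every read carries the same technical sequence, that sequence is exactly what survives a "keep only the abundant stuff" filter.

```python
from collections import Counter


def kmers(seq, k):
    """Yield every substring of length k from seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def abundant_kmers(reads, k=5, min_count=3):
    """Return the k-mers seen at least min_count times across all reads."""
    counts = Counter(km for read in reads for km in kmers(read, k))
    return {km for km, n in counts.items() if n >= min_count}


# Toy data (hypothetical): three unrelated inserts, each followed by the
# same illustrative adapter-like sequence.
ADAPTER = "AGATCGGAAG"
reads = [
    "ACGTTGCAAC" + ADAPTER,
    "TTGACCGTAA" + ADAPTER,
    "GGCATTCGAT" + ADAPTER,
]

kept = abundant_kmers(reads, k=5, min_count=3)
adapter_kmers = set(kmers(ADAPTER, 5))

# Every adapter k-mer passes the abundance filter.
print(adapter_kmers <= kept)
```

In this toy case the biological inserts are each seen only once, so the filter discards them and keeps the shared technical sequence, which is the opposite of what a contamination database would want.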

taylorreiter commented 1 year ago

Closed by linked merge.