greenelab / 2022-microberna

A pipeline to generate a compendium of bacterial and archaeal RNA-seq data
BSD 3-Clause "New" or "Revised" License

switch from experiment accession to run accession for rnaseq samples #7

Closed taylorreiter closed 2 years ago

taylorreiter commented 2 years ago

I used experiment accession numbers to label files because that's what the Pa compendium uses, but now that I'm adding more samples to the Pa compendium, it's a bit of a pain. I think the run accession is more standard, so the file labels should be switched over before running this on 60k samples.

taylorreiter commented 2 years ago

Run accession is also the only one that doesn't inappropriately collapse different libraries (I found examples where sample accession and experiment accession both do). The trade-off is that a single library may have multiple run accessions, especially for early libraries uploaded to the SRA. Shrug... a user could always post-process these to sum counts across the runs that belong to one library. Not a perfect solution, but I think it's good enough.
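
For anyone who wants to do that post-processing, here is a minimal sketch of summing counts across runs from the same library. It is not part of the pipeline. The function name, the accessions, the library labels, and the counts are all hypothetical, and it assumes the user supplies their own run-to-library mapping (since sample and experiment accessions can't be trusted to define a library, per the above).

```python
from collections import defaultdict


def sum_counts_by_library(run_counts, run_to_library):
    """Sum per-gene counts across all runs assigned to the same library.

    run_counts: {run_accession: {gene: count}}
    run_to_library: {run_accession: library_label}, curated by the user
    """
    library_counts = defaultdict(lambda: defaultdict(int))
    for run, gene_counts in run_counts.items():
        # Runs missing from the mapping are kept under their own accession.
        library = run_to_library.get(run, run)
        for gene, count in gene_counts.items():
            library_counts[library][gene] += count
    # Convert nested defaultdicts to plain dicts for easier downstream use.
    return {lib: dict(genes) for lib, genes in library_counts.items()}


# Hypothetical accessions and counts, for illustration only.
run_counts = {
    "SRR0000001": {"geneA": 10, "geneB": 0},
    "SRR0000002": {"geneA": 5, "geneB": 3},
    "SRR0000003": {"geneA": 7, "geneB": 1},
}
# Hypothetical mapping: the first two runs come from one library.
run_to_library = {
    "SRR0000001": "lib_1",
    "SRR0000002": "lib_1",
    "SRR0000003": "lib_2",
}

merged = sum_counts_by_library(run_counts, run_to_library)
```

Summing only makes sense for raw counts; normalized values (e.g. TPM) would need to be recomputed from the summed counts rather than added.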