sammyjava closed 1 year ago
@adf-ncgr @cann0010 @sdash-github
Looks OK to me. I could imagine adding back a list of sra_run accessions, but I think making sra_experiment the primary unit in this context (as nf-core/fetchngs also seems to do) makes sense given what NCBI says here: https://www.ncbi.nlm.nih.gov/sra/docs/submitmeta/#sra-metadata-experiment
An SRA EXPERIMENT is the main publishable unit in the SRA database.
Using SRX seems to support cases where multiple runs are done to increase the sequencing depth of a single biorep. I think @sdash-github wasn't in the stand-up meeting where we discussed this briefly, so I'm just mentioning it again.
We should also have some way of representing replicate groups in the samples file, since we have at least one atlas (cowpea) that has them, and they will likely be increasingly common as we move to more diverse and less ancient experiments.
A replicate_group column would probably be sufficient for the foreseeable future, with equal (non-null) values between rows indicating that they belong to the same group.
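To make the replicate_group idea concrete, here is a rough sketch of how a reader of the samples file might collect rows into groups. It assumes a tab-separated file; the `sample_id` and `sra_experiment` column names, the SRX values, and the `group_replicates` helper are all placeholders made up for illustration, not the agreed format.

```python
import csv
import io
from collections import defaultdict

# Hypothetical samples file. Only replicate_group comes from the proposal;
# the other column names and the SRX accessions are illustrative placeholders.
SAMPLES_TSV = """sample_id\tsra_experiment\treplicate_group
leaf_1\tSRX0000001\tleaf
leaf_2\tSRX0000002\tleaf
root_1\tSRX0000003\troot
pod_1\tSRX0000004\t
"""


def group_replicates(handle):
    """Return (groups, ungrouped) from a tab-separated samples file.

    Rows sharing an equal, non-empty replicate_group value land in the
    same group; rows with an empty value are treated as having no group.
    """
    groups = defaultdict(list)
    ungrouped = []
    for row in csv.DictReader(handle, delimiter="\t"):
        key = (row.get("replicate_group") or "").strip()
        if key:
            groups[key].append(row["sample_id"])
        else:
            ungrouped.append(row["sample_id"])
    return dict(groups), ungrouped


groups, ungrouped = group_replicates(io.StringIO(SAMPLES_TSV))
print(groups)
print(ungrouped)
```

Treating an empty cell as "no group" keeps existing samples files valid without changes, which seems to match the "equal (non-null) values" wording above.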
This isn't a huge change, but I propose having the samples file contain the following columns: