legumeinfo / datastore-specifications

Specifications for directory naming, file naming, file contents in the LIS datastore

Proposal: simplified samples file for expression collections #27

Closed sammyjava closed 1 year ago

sammyjava commented 1 year ago

This isn't a huge change, but I propose having the samples file contain the following columns:

  1. identifier e.g. SAMN02226091 (anything unique, could also be a GEO sample, for example)
  2. name e.g. Leaf_Young
  3. description e.g. Fully expanded 2nd trifoliate leaf tissue from plants provided with fertilizer
  4. treatment e.g. Normal growing conditions
  5. tissue e.g. leaf
  6. development_stage e.g. V2 - second trifoliate
  7. species e.g. Phaseolus vulgaris
  8. genotype e.g. G19833
  9. replicate_group e.g. 1A; leave blank if there are no replicates
  10. biosample e.g. SAMN02226068
  11. sra_experiment e.g. SRX695793
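
To make the proposed layout concrete, here is a minimal Python sketch of reading such a file. It assumes things this proposal does not state: that the file is tab-delimited, has no header row, and may contain comment lines starting with `#`. The names `SAMPLE_COLUMNS`, `parse_samples`, and `EXAMPLE_ROW` are hypothetical, and the example row simply reuses the example values listed above.

```python
import csv
import io

# Column order as proposed in this issue.
SAMPLE_COLUMNS = [
    "identifier", "name", "description", "treatment", "tissue",
    "development_stage", "species", "genotype", "replicate_group",
    "biosample", "sra_experiment",
]

# One row built from the example values in the proposal (replicate_group left blank).
EXAMPLE_ROW = "\t".join([
    "SAMN02226091", "Leaf_Young",
    "Fully expanded 2nd trifoliate leaf tissue from plants provided with fertilizer",
    "Normal growing conditions", "leaf", "V2 - second trifoliate",
    "Phaseolus vulgaris", "G19833", "", "SAMN02226068", "SRX695793",
])


def parse_samples(text):
    """Parse sample rows into dicts keyed by column name.

    Assumptions (not from the proposal): tab-delimited, no header row,
    lines starting with '#' are comments.
    """
    rows = []
    seen = set()
    reader = csv.reader(io.StringIO(text), delimiter="\t", quoting=csv.QUOTE_NONE)
    for lineno, fields in enumerate(reader, start=1):
        if not fields or fields[0].startswith("#"):
            continue  # skip blank lines and comments
        if len(fields) != len(SAMPLE_COLUMNS):
            raise ValueError(
                f"line {lineno}: expected {len(SAMPLE_COLUMNS)} columns, got {len(fields)}"
            )
        row = dict(zip(SAMPLE_COLUMNS, (f.strip() for f in fields)))
        # The identifier must be present and unique within the file.
        if not row["identifier"]:
            raise ValueError(f"line {lineno}: identifier is empty")
        if row["identifier"] in seen:
            raise ValueError(f"line {lineno}: duplicate identifier {row['identifier']}")
        seen.add(row["identifier"])
        rows.append(row)
    return rows
```

Calling `parse_samples(EXAMPLE_ROW)` returns a single dict whose `replicate_group` is the empty string, which matches the "leave blank" convention for samples without replicates.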

sammyjava commented 1 year ago

@adf-ncgr @cann0010 @sdash-github

adf-ncgr commented 1 year ago

Looks OK to me. I could imagine adding back a list of sra_run accessions, but I think making sra_experiment the primary unit in this context (as nf-core/fetchngs also seems to do) makes sense, given what NCBI says here: https://www.ncbi.nlm.nih.gov/sra/docs/submitmeta/#sra-metadata-experiment

An SRA EXPERIMENT is the main publishable unit in the SRA database.

Using SRX seems to support cases where multiple runs are done to increase the sequencing depth of a single biorep. I think @sdash-github wasn't in the stand-up meeting where we discussed this briefly, so I'm just mentioning it again.

adf-ncgr commented 1 year ago

We should also have some way of representing replicate groups in the samples file, since we have at least one atlas (cowpea) that has them, and they will likely become increasingly common as we move to more diverse and less ancient experiments.

adf-ncgr commented 1 year ago

A replicate_group column would probably be sufficient for the foreseeable future, with equal (non-null) values between rows indicating that they belong to the same group.
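
The grouping rule described here (rows with equal, non-null replicate_group values belong together) could be sketched roughly as follows. This assumes each row is already a dict with `identifier` and `replicate_group` keys; the function name `group_replicates` and the identifiers in the usage note are hypothetical.

```python
from collections import defaultdict


def group_replicates(rows):
    """Split sample rows into replicate groups and ungrouped samples.

    Rows sharing the same non-empty replicate_group value are placed in the
    same group; rows with a blank or missing value are treated as unreplicated.
    Returns (groups, ungrouped), where groups maps each replicate_group value
    to a list of sample identifiers.
    """
    groups = defaultdict(list)
    ungrouped = []
    for row in rows:
        key = (row.get("replicate_group") or "").strip()
        if key:
            groups[key].append(row["identifier"])
        else:
            ungrouped.append(row["identifier"])
    return dict(groups), ungrouped
```

For example, three rows with replicate_group values `1A`, `1A`, and blank (identifiers `S1`, `S2`, `S3`) would give `{"1A": ["S1", "S2"]}` as the groups and `["S3"]` as the ungrouped samples.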