GenomicsStandardsConsortium / mixs

Minimum Information about any (X) Sequence (MIxS) specification
https://w3id.org/mixs
Creative Commons Zero v1.0 Universal

Consider relaxing `isol_growth_condt` requirement in MIGSBacteria (etc.) #595

Open jfy133 opened 1 year ago

jfy133 commented 1 year ago

Context:

In ancient DNA (aDNA) studies, a popular area is the recovery of ancient pathogen genomes (e.g. Yersinia pestis), i.e. the genomes of dead organisms where much of the cellular biomass has disintegrated. However, the degraded DNA of the microbial cells can still be preserved (e.g. bound to skeletal mineral), and reconstruction of genome-length consensus sequences by short-read, reference-based mapping is possible (whereas de novo assembly often will not work due to the very short reads).

There is a debate in MInAS about where such a sample/sequence fits into the MIxS schema. My initial reaction was that it would go into MIGSBacteria (as it's a single genome, not a whole metagenome).

Problem

An aDNA genome is recovered from degraded/dead organisms, whereas MIGSBacteria is defined as being from 'cultured bacteria/archaea', and isol_growth_condt is a required term. Neither the definition nor the required term can apply to dead natural organisms (i.e., excluding synthetic genomes etc.).

Possible solutions

  1. Change existing definitions/terms
    • relax the definition to, say, a 'single genomic sequence of' rather than from 'cultured' bacteria/archaea?
      • I could also envision other researchers using short-read mapping approaches similar to those used for aDNA, e.g. when working with strains where there are only small amounts of genomic change that do not require de novo assembly
    • relax isol_growth_condt from required to recommended, so that aDNA researchers can use MIGSBacteria
  2. Create a whole new checklist for non-cultured genomic sequences (that are not de novo assembled, i.e., that do not fit in MIMAG)
    • seems like overkill to me 😬
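
To make the effect of option 1 concrete, here is a minimal sketch of how relaxing isol_growth_condt from required to recommended would change validation of an aDNA record. Everything in it is an assumption made for illustration: the `Checklist` class, the `validate` method, and the example record are hypothetical and are not part of the MIxS schema or its tooling. The only thing the checklists model is the single term discussed above.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Checklist:
    """Toy stand-in for a MIxS checklist (hypothetical, not the real schema)."""

    name: str
    required: set[str] = field(default_factory=set)
    recommended: set[str] = field(default_factory=set)

    def validate(self, record: dict[str, str]) -> tuple[list[str], list[str]]:
        """Return (errors, warnings): missing required and recommended terms."""
        errors = sorted(t for t in self.required if not record.get(t))
        warnings = sorted(t for t in self.recommended if not record.get(t))
        return errors, warnings


# Situation described in this issue: isol_growth_condt is required.
current = Checklist("MIGSBacteria", required={"isol_growth_condt"})

# Proposed relaxation: the same term becomes recommended instead.
relaxed = Checklist("MIGSBacteria", recommended={"isol_growth_condt"})

# Hypothetical aDNA record: nothing was cultured, so there is no
# growth-condition value to report. The field shown is illustrative only.
adna_record = {"samp_name": "hypothetical_sample"}

print(current.validate(adna_record))  # (['isol_growth_condt'], [])
print(relaxed.validate(adna_record))  # ([], ['isol_growth_condt'])
```

Under the relaxed checklist the missing term only produces a warning, so an aDNA genome could be submitted as MIGSBacteria while cultured genomes are still nudged to provide the value.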