poseidon-framework / poseidon-schema

An archaeogenetic genotype data organisation file format

[Review recommendation] Make Poseidon explicitly species-agnostic #80

Open nevrome opened 4 months ago

nevrome commented 4 months ago

This recommendation was raised in the review of the Poseidon paper.

In principle, it seems the framework could be species-agnostic and thus be useful more generally beyond humans (perhaps it would be enough to add just one more "species" metadata field?). It is of course up to the authors to decide how broadly they want to cater.

nevrome commented 4 months ago

As discussed in the past, it's a pretty big commitment to officially support multiple different species. The main issues are, to my understanding:

  1. Some .janno columns (e.g. Ploidy) are specifically designed with human samples in mind and it would be difficult to make them sufficiently general.
  2. Storing samples from different species in one Poseidon package may cause unexpected, nonsensical behaviour, e.g. upon merging.
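To make point 2 concrete, here is a minimal Python sketch of a guard that refuses to merge sample tables from different species. It assumes a hypothetical `Species` column in each .janno-like table (no such column exists in the schema at the time of this discussion), and `merge_samples` is an illustrative helper, not part of any Poseidon tool.

```python
def merge_samples(rows_a, rows_b, species_key="Species"):
    """Concatenate two lists of sample rows (dicts) if they share one species.

    'Species' is a hypothetical column name, assumed for illustration only.
    """
    # Collect every distinct species value across both inputs.
    species = {row.get(species_key) for row in rows_a + rows_b}
    if None in species:
        raise ValueError("at least one sample has no species entry")
    if len(species) > 1:
        raise ValueError(f"refusing to merge mixed species: {sorted(species)}")
    return rows_a + rows_b
```

Whether mixed-species merges should fail outright or only emit a warning is a design decision this sketch does not settle.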

My intuition would be to continue to focus on and only officially support human genomes, but I'm not fully decided.

stschiff commented 4 months ago

I think it makes sense. We don't have to support it in the archives (or rather: force it to be human), but people could still use Poseidon Packages for dogs/wolves/sheep etc. Ploidy is actually already species-agnostic. That's a general biology term and is meaningful for all higher organisms.

So I think it would be a very easy win to simply add a field "species" and specify that it holds the Latin name, so "Homo sapiens" and friends. It's a bit unclear then how to deal with Neanderthals and Denisovans, but perhaps we can simply allow "Neanderthal" and "Denisovan" as exceptions. Or, even easier, "Archaic hominin".
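As a rough illustration of what validating such a field could look like, the following Python sketch accepts either a Latin binomial or one of the exception labels floated above. The exception set, the regular expression, and the function name `is_valid_species` are all assumptions made for this example; nothing here is part of the Poseidon schema.

```python
import re

# Hypothetical exception labels, taken from the suggestions in this comment.
EXCEPTIONS = {"Neanderthal", "Denisovan", "Archaic hominin"}

# Rough shape of a Latin binomial: capitalised genus, lowercase epithet.
BINOMIAL = re.compile(r"[A-Z][a-z]+ [a-z]+")

def is_valid_species(value):
    """Return True if value is an allowed exception or looks like a binomial."""
    return value in EXCEPTIONS or BINOMIAL.fullmatch(value) is not None
```

Note that this pattern would reject trinomial names such as "Canis lupus familiaris", so a real rule would need to decide how subspecies are written.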

nevrome commented 4 months ago

OK - I suggest we go through all columns of the .janno file once more and check for potential ramifications. I also think we should talk to some colleagues actively working with animal data.