Mismatches between janno and ena tables

poseidon-framework / community-archive

The Poseidon Community Archive (PCA)

https://www.poseidon-adna.org/#/archive_overview


Mismatches between janno and ena tables #108

Open TCLamnidis opened 1 year ago

TCLamnidis commented 1 year ago

It is sometimes the case that individuals that appear in the janno of a package do not appear in the ENA table for the package (if, say, some of the data was not properly uploaded to the ENA), or vice versa (e.g. when individuals were excluded from analyses and supplementary tables of a paper, but the sequencing data was still uploaded to the ENA).

This will pose a challenge for automatic processing of packages in the future.

stschiff commented 1 year ago

Yes, we explicitly allowed that for now, knowing the imperfections of the ENA data. I agree we eventually need to be stricter on this. So let's leave this issue open as a reminder.

93Boy commented 1 year ago

Poseidon IDs and ENA entries have a many-to-many relationship. Sometimes there are multiple ENA entries for a single Poseidon ID (genotype data, mt data, etc.); in other cases, some samples are not available in the ENA at all.
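
To make the mismatch cases discussed in this thread concrete, here is a minimal Python sketch of how they could be detected for one package. It is not part of any Poseidon tooling. It assumes both tables are tab-separated, that the janno table has one ID per row in a `Poseidon_ID` column, and that the ENA table lists one or more IDs per row in a `poseidon_IDs` column separated by `;`. The function names, the sample IDs, and the run labels are hypothetical placeholders.

```python
import csv
import io


def read_tsv(text):
    """Parse tab-separated text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def compare_ids(janno_rows, ena_rows,
                janno_col="Poseidon_ID", ena_col="poseidon_IDs", sep=";"):
    """Report IDs missing on either side and IDs with several ENA rows.

    The column names and the separator are assumptions; adjust them to
    the actual layout of the files being checked.
    """
    janno_ids = {row[janno_col].strip() for row in janno_rows}

    # Map each ID to the ENA rows that mention it. One ENA row may name
    # several IDs, and one ID may appear in several ENA rows.
    ena_entries = {}
    for row in ena_rows:
        for pid in row[ena_col].split(sep):
            pid = pid.strip()
            if pid:
                ena_entries.setdefault(pid, []).append(row)
    ena_ids = set(ena_entries)

    return {
        "only_in_janno": sorted(janno_ids - ena_ids),
        "only_in_ena": sorted(ena_ids - janno_ids),
        "multiple_ena_entries": sorted(
            pid for pid in janno_ids & ena_ids if len(ena_entries[pid]) > 1
        ),
    }


# Hypothetical example data, not taken from any real package.
JANNO = "Poseidon_ID\tGenetic_Sex\nInd001\tF\nInd002\tM\nInd003\tF\n"
ENA = (
    "poseidon_IDs\trun_accession\n"
    "Ind001\tRUN_A\n"
    "Ind001\tRUN_B\n"
    "Ind002;Ind004\tRUN_C\n"
)

report = compare_ids(read_tsv(JANNO), read_tsv(ENA))
print(report)
```

With this example data, `Ind003` is reported as present only in the janno table, `Ind004` as present only in the ENA table, and `Ind001` as having multiple ENA entries. A report like this could let automatic processing warn about known mismatches instead of failing on them, which fits the current decision to tolerate them for now.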