SPAAM-community / AncientMetagenomeDir

Repository containing lists of all published ancient metagenomic (and related) samples and libraries
http://www.spaam-community.org/AncientMetagenomeDir/
Creative Commons Attribution 4.0 International

Multiple samples per library cause issue with AMDirT viewer #1133

Open ZoePochon opened 10 months ago

ZoePochon commented 10 months ago

Hi there! I selected all the ancient leprosy samples and downloaded the different tables and I noticed that Neukamm2020 was absent. I went back to AMDirT viewer and when you click on "Neukamm2020" (for the leprosy hit at least but maybe for the other one too) and click on the validate selection button, nothing happens. This might be a bug in AncientMetagenomeDir rather than AMDirT, I'm not sure.

jfy133 commented 10 months ago

Hm interesting... how did you load the viewer? Via CLI or the online version?

ZoePochon commented 10 months ago

Via the command line.

jfy133 commented 10 months ago

I can replicate, but there is nothing in the console... not sure what's happening

jfy133 commented 10 months ago

@maxibor (ignore previous now-deleted ping), do you know how we could debug this? Both the DOI and PRJ files resolve on the respective websites... (the only thing I could imagine being an issue)

maxibor commented 9 months ago

@jfy133 @ZoePochon I can't replicate, all is fine on my side. Maybe it was a temporary ENA API disruption?

jfy133 commented 8 months ago

@maxibor and I just tested again and he was looking at the wrong table, investigating again

maxibor commented 8 months ago

OK, finally pinpointed the true source of the issue. It turns out that some libraries were uploaded to ENA in a very unorthodox way: normally, there should be only one sample accession ID (ERS...) per library (and possibly multiple libraries per sample), but here a single library is linked to several sample accessions. We'll have to think about how to handle this, as it does not respect the ENA data model.

Neukamm2020 2020    10.1186/s12915-020-00839-8  Abusir1630  ENA PRJEB33848  ERS3635976,ERS3636087,ERS3636088,ERS3636089 Abusir1630b double  Phusion Hot Start High-Fidelity DNA none    Illumina HiSeq 2500 PAIRED  WGS 1675402 ERR4374948  fastq_all   ftp.sra.ebi.ac.uk/vol1/fastq/ERR437/008/ERR4374948/ERR4374948_1.fastq.gz;ftp.sra.ebi.ac.uk/vol1/fastq/ERR437/008/ERR4374948/ERR4374948_2.fastq.gz   926be06d766f617d530dd86c2c923a98;ef3b1912569503c4bec4393c467af101   107051806;105518845
Neukamm2020 2020    10.1186/s12915-020-00839-8  Abusir1630  ENA PRJEB33848  ERS3635976,ERS3636087,ERS3636088,ERS3636089 Abusir1630UDG1  double  Phusion Hot Start High-Fidelity DNA full-udg    Illumina HiSeq 4000 SINGLE  WGS 22570691    ERR4375144  fastq_all   ftp.sra.ebi.ac.uk/vol1/fastq/ERR437/004/ERR4375144/ERR4375144.fastq.gz  bc31466e2ba83f27902b3e517bf739b3    785130427
Neukamm2020 2020    10.1186/s12915-020-00839-8  Abusir1630  ENA PRJEB33848  ERS3635976,ERS3636087,ERS3636088,ERS3636089 Abusir1630UDG2  double  Phusion Hot Start High-Fidelity DNA full-udg    Illumina HiSeq 4000 SINGLE  WGS 27168496    ERR4383828  fastq_all   ftp.sra.ebi.ac.uk/vol1/fastq/ERR438/008/ERR4383828/ERR4383828.fastq.gz  41ea60645b38e218d841674673dbfb47    962482754
Neukamm2020 2020    10.1186/s12915-020-00839-8  Abusir1630  ENA PRJEB33848  ERS3635976,ERS3636087,ERS3636088,ERS3636089 Abusir1630UDG3  double  Phusion Hot Start High-Fidelity DNA full-udg    Illumina HiSeq 4000 SINGLE  WGS 25875319    ERR4383829  fastq_all   ftp.sra.ebi.ac.uk/vol1/fastq/ERR438/009/ERR4383829/ERR4383829.fastq.gz  cabe84183a942517ed31bd06f815d7f0    917944806
Neukamm2020 2020    10.1186/s12915-020-00839-8  Abusir1543  ENA PRJEB33848  ERS3636018,ERS3636097,ERS3636099,ERS3636100,ERS3636098,ERS3636101,ERS3636093,ERS3636094,ERS3636095,ERS3636096,ERS3636025    Abusir1543s double  Phusion Hot Start High-Fidelity DNA none    Illumina HiSeq 2500 PAIRED  WGS 2920994 ERR4374011  fastq_all   ftp.sra.ebi.ac.uk/vol1/fastq/ERR437/001/ERR4374011/ERR4374011_1.fastq.gz;ftp.sra.ebi.ac.uk/vol1/fastq/ERR437/001/ERR4374011/ERR4374011_2.fastq.gz   e7ec3c7fb29e40d16710081d2feb8994;933b05c122f2c4899b7a0a470d7533fb   223979315;224820846
Neukamm2020 2020    10.1186/s12915-020-00839-8  Abusir1543  ENA PRJEB33848  ERS3636018,ERS3636097,ERS3636099,ERS3636100,ERS3636098,ERS3636101,ERS3636093,ERS3636094,ERS3636095,ERS3636096,ERS3636025    Abusir1543s_deeper_sequencing_1 double  Phusion Hot Start High-Fidelity DNA none    Illumina HiSeq 4000 PAIRED  WGS 7498748 ERR4386594  fastq_all   ftp.sra.ebi.ac.uk/vol1/fastq/ERR438/004/ERR4386594/ERR4386594_1.fastq.gz;ftp.sra.ebi.ac.uk/vol1/fastq/ERR438/004/ERR4386594/ERR4386594_2.fastq.gz   4c1532173ff5167f5f11722a51266c7e;882cf017a4d2ecba66596d28f5b098a8   470376358;513240544
Neukamm2020 2020    10.1186/s12915-020-00839-8  Abusir1543  ENA PRJEB33848  ERS3636018,ERS3636097,ERS3636099,ERS3636100,ERS3636098,ERS3636101,ERS3636093,ERS3636094,ERS3636095,ERS3636096,ERS3636025    Abusir1543s_deeper_sequencing_3 double  Phusion Hot Start High-Fidelity DNA none    Illumina HiSeq 4000 PAIRED  WGS 2094918 ERR4388102  fastq_all   ftp.sra.ebi.ac.uk/vol1/fastq/ERR438/002/ERR4388102/ERR4388102_1.fastq.gz;ftp.sra.ebi.ac.uk/vol1/fastq/ERR438/002/ERR4388102/ERR4388102_2.fastq.gz   b3a7f685a18bcc3e7ab40d0511400adc;7bb00d39f98a28f5a3fb619426a3696f   74381752;81080489
Neukamm2020 2020    10.1186/s12915-020-00839-8  Abusir1543  ENA PRJEB33848  ERS3636018,ERS3636097,ERS3636099,ERS3636100,ERS3636098,ERS3636101,ERS3636093,ERS3636094,ERS3636095,ERS3636096,ERS3636025    Abusir1543s_deeper_sequencing_4 double  Phusion Hot Start High-Fidelity DNA none    Illumina HiSeq 4000 PAIRED  WGS 2043769 ERR4388107  fastq_all   ftp.sra.ebi.ac.uk/vol1/fastq/ERR438/007/ERR4388107/ERR4388107_1.fastq.gz;ftp.sra.ebi.ac.uk/vol1/fastq/ERR438/007/ERR4388107/ERR4388107_2.fastq.gz   588ffe4bca41e2c72cca79959ca06ffd;266ffb4a7ce4c2d4f31343e7654ded18   73500339;86160084
Neukamm2020 2020    10.1186/s12915-020-00839-8  Abusir1543  ENA PRJEB33848  ERS3636018,ERS3636097,ERS3636099,ERS3636100,ERS3636098,ERS3636101,ERS3636093,ERS3636094,ERS3636095,ERS3636096,ERS3636025    Abusir1543s_deeper_sequencing_2 double  Phusion Hot Start High-Fidelity DNA none    Illumina HiSeq 4000 PAIRED  WGS 68309897    ERR4388231  fastq_all   ftp.sra.ebi.ac.uk/vol1/fastq/ERR438/001/ERR4388231/ERR4388231_1.fastq.gz;ftp.sra.ebi.ac.uk/vol1/fastq/ERR438/001/ERR4388231/ERR4388231_2.fastq.gz   acece4818ab416324da0cf63402b0a12;3c0efafafecda2de2e1ea5966546b5f5   2507959022;2979964842
Neukamm2020 2020    10.1186/s12915-020-00839-8  Abusir1543  ENA PRJEB33848  ERS3636018,ERS3636097,ERS3636099,ERS3636100,ERS3636098,ERS3636101,ERS3636093,ERS3636094,ERS3636095,ERS3636096,ERS3636025    Abusir1543s_deeper_sequencing_5 double  Phusion Hot Start High-Fidelity DNA none    Illumina HiSeq 4000 PAIRED  WGS 85981168    ERR4388236  fastq_all   ftp.sra.ebi.ac.uk/vol1/fastq/ERR438/006/ERR4388236/ERR4388236_1.fastq.gz;ftp.sra.ebi.ac.uk/vol1/fastq/ERR438/006/ERR4388236/ERR4388236_2.fastq.gz   66d091524a34f078c98572d36f533f02;0306df7f63723b81354b84658d7ca86b   3787231452;3997582586
Neukamm2020 2020    10.1186/s12915-020-00839-8  Abusir1543  ENA PRJEB33848  ERS3636018,ERS3636097,ERS3636099,ERS3636100,ERS3636098,ERS3636101,ERS3636093,ERS3636094,ERS3636095,ERS3636096,ERS3636025    Abusir1543b_deeper_sequencing_1 double  Phusion Hot Start High-Fidelity DNA none    Illumina HiSeq 4000 PAIRED  WGS 1151423 ERR4384927  fastq_all   ftp.sra.ebi.ac.uk/vol1/fastq/ERR438/007/ERR4384927/ERR4384927_1.fastq.gz;ftp.sra.ebi.ac.uk/vol1/fastq/ERR438/007/ERR4384927/ERR4384927_2.fastq.gz   c74c1c0c0a2ed8599bd6d461a2b7c633;f94e0d0abbf5d0cdfa025950450ece20   75049432;86890431
Neukamm2020 2020    10.1186/s12915-020-00839-8  Abusir1543  ENA PRJEB33848  ERS3636018,ERS3636097,ERS3636099,ERS3636100,ERS3636098,ERS3636101,ERS3636093,ERS3636094,ERS3636095,ERS3636096,ERS3636025    Abusir1543b_deeper_sequencing_2 double  Phusion Hot Start High-Fidelity DNA none    Illumina HiSeq 4000 PAIRED  WGS 3669179 ERR4385800  fastq_all   ftp.sra.ebi.ac.uk/vol1/fastq/ERR438/000/ERR4385800/ERR4385800_1.fastq.gz;ftp.sra.ebi.ac.uk/vol1/fastq/ERR438/000/ERR4385800/ERR4385800_2.fastq.gz   102a24929948a9c8a36fd97bc413a1db;ab3352a7943ac95861386137174be48b   232629449;265766157
Neukamm2020 2020    10.1186/s12915-020-00839-8  Abusir1543  ENA PRJEB33848  ERS3636018,ERS3636097,ERS3636099,ERS3636100,ERS3636098,ERS3636101,ERS3636093,ERS3636094,ERS3636095,ERS3636096,ERS3636025    Abusir1543b_deeper_sequencing_3 double  Phusion Hot Start High-Fidelity DNA none    Illumina HiSeq 4000 PAIRED  WGS 12607106    ERR4385809  fastq_all   ftp.sra.ebi.ac.uk/vol1/fastq/ERR438/009/ERR4385809/ERR4385809_1.fastq.gz;ftp.sra.ebi.ac.uk/vol1/fastq/ERR438/009/ERR4385809/ERR4385809_2.fastq.gz   6f41a2eb226ae091f5423fdc98391b04;96d6ba2c095cba635fdfff04a5963bc3   794112413;899207544
Neukamm2020 2020    10.1186/s12915-020-00839-8  Abusir1543  ENA PRJEB33848  ERS3636018,ERS3636097,ERS3636099,ERS3636100,ERS3636098,ERS3636101,ERS3636093,ERS3636094,ERS3636095,ERS3636096,ERS3636025    Abusir1543b_deeper_sequencing_4 double  Phusion Hot Start High-Fidelity DNA none    Illumina HiSeq 4000 PAIRED  WGS 72657115    ERR4386569  fastq_all   ftp.sra.ebi.ac.uk/vol1/fastq/ERR438/009/ERR4386569/ERR4386569_1.fastq.gz;ftp.sra.ebi.ac.uk/vol1/fastq/ERR438/009/ERR4386569/ERR4386569_2.fastq.gz   e7f72ba102c6ad019068109dc3cb6d19;b0ed5a76e625b70479a02af4e35a6042   2811157094;3327301183
Neukamm2020 2020    10.1186/s12915-020-00839-8  Abusir1543  ENA PRJEB33848  ERS3636018,ERS3636097,ERS3636099,ERS3636100,ERS3636098,ERS3636101,ERS3636093,ERS3636094,ERS3636095,ERS3636096,ERS3636025    Abusir1543b double  Phusion Hot Start High-Fidelity DNA none    Illumina HiSeq 2500 PAIRED  WGS 3702377 ERR4374008  fastq_all   ftp.sra.ebi.ac.uk/vol1/fastq/ERR437/008/ERR4374008/ERR4374008_1.fastq.gz;ftp.sra.ebi.ac.uk/vol1/fastq/ERR437/008/ERR4374008/ERR4374008_2.fastq.gz   6c48b43d0690b87dc343c185eb95d85a;38f5f85d17cb6dc960e3fb88f3856b11   302008916;300798534

ancientsinglegenome-hostassociated_libraries.tsv#L1103-L1119
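
To find other affected entries, one option would be to scan the libraries table for rows whose sample accession field contains more than one comma-separated ID. The following is only a minimal sketch, not AMDirT code: the function name `find_multi_sample_libraries` is made up for illustration, the column names `library_name` and `archive_sample_accession` are assumed to match the table header, and the second row of the toy data is a hypothetical placeholder.

```python
import csv
import io


def find_multi_sample_libraries(
    tsv_text,
    accession_col="archive_sample_accession",  # assumed column name
    library_col="library_name",                # assumed column name
):
    """Return (library_name, accessions) for rows listing more than one sample accession."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    flagged = []
    for row in reader:
        # Split the comma-separated accession field and drop empty entries.
        accessions = [a.strip() for a in row[accession_col].split(",") if a.strip()]
        if len(accessions) > 1:
            flagged.append((row[library_col], accessions))
    return flagged


# Toy excerpt: the first row mirrors the Neukamm2020 entry quoted above,
# the second row is a hypothetical well-formed library.
example = (
    "library_name\tarchive_sample_accession\n"
    "Abusir1630b\tERS3635976,ERS3636087,ERS3636088,ERS3636089\n"
    "HypotheticalLib1\tERS0000001\n"
)

for library, accessions in find_multi_sample_libraries(example):
    print(library, len(accessions))
```

On the toy excerpt this flags only `Abusir1630b`, with four accessions. To run it against the real table, the file contents could be read into a string and passed in the same way, after checking the actual header names.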

jfy133 commented 8 months ago

Further info in this case: there are multiple samples in one library, but each library has a unique sample ID :roll_eyes: