Closed by fplazaonate 1 year ago
Hi @fplaza, I'm not following. Could you please elaborate? Cheers
Hi @mberacochea ,
The exact same genomes (i.e., duplicates) are present several times (e.g., MGYG000002160, MGYG00003925, MGYG000180883).
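For illustration, here is a minimal sketch of one way exact duplicates like these could be detected. It is not necessarily how the attached list was produced. It assumes each genome is available as FASTA text and treats two genomes as identical when they contain the same set of contig sequences, ignoring headers, line wrapping, case, and contig order. The function names and the genome IDs in the example are hypothetical.

```python
import hashlib
from collections import defaultdict


def genome_digest(fasta_text):
    """Return a SHA-256 digest of the sequence content of a FASTA string.

    Headers, line wrapping, and case are ignored, and contigs are sorted
    so that contig order does not affect the digest.
    """
    contigs = []
    current = []
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A header closes the previous contig, if any.
            if current:
                contigs.append("".join(current))
                current = []
        else:
            current.append(line.upper())
    if current:
        contigs.append("".join(current))

    digest = hashlib.sha256()
    for contig in sorted(contigs):
        digest.update(contig.encode("ascii"))
        digest.update(b"\n")  # separator so contig boundaries matter
    return digest.hexdigest()


def find_duplicates(genomes):
    """Group genome IDs (hypothetical dict: ID -> FASTA text) by identical content."""
    groups = defaultdict(list)
    for genome_id, fasta_text in genomes.items():
        groups[genome_digest(fasta_text)].append(genome_id)
    return [sorted(ids) for ids in groups.values() if len(ids) > 1]


# Toy example with made-up IDs and sequences (not real UHGG data).
example = {
    "genome_A": ">c1\nACGT\nACGT\n>c2\nTTTT\n",
    "genome_B": ">x\nTTTT\n>y\nACGTACGT\n",
    "genome_C": ">c1\nACGTACGA\n",
}
print(find_duplicates(example))  # [['genome_A', 'genome_B']]
```

A digest comparison like this only catches exact sequence matches. Near-identical genomes would need a similarity-based method instead.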
Hi Florian,
Thank you for pointing these out and sharing the list with us. These are in fact duplicates. We checked the source information and can see that some genomes are present twice because the original studies from which the genomes were obtained reused some of the samples. This is not unexpected in a study of this size, but we will look into removing the duplicates in future updates.
Thanks again! Tanya
Hi,
I have noticed that some genomes in UHGG v2 are exactly the same.
Here is the list:
duplicated_genomes_metadata.txt
It would be great to fix this in future versions.
Best, Florian