stephenturner / kgp

1000 Genomes Project Metadata R Package
https://stephenturner.github.io/kgp/

Need to correct the latitude and longitude of several populations in metadata/igsr_populations.tsv #8

Closed carolhuaxia closed 1 year ago

carolhuaxia commented 1 year ago

It is very kind of the author to gather such a useful resource from 1KGP, HGDP and SGDP. I found some mistakes in the latitude and longitude coordinates of several 1KGP populations, such as CEU, STU and ITU. The mistakes are obvious in the map on the homepage, and there may be others. It would be great if they were corrected. The locations on the map are inconsistent with the map in the 1KGP3 paper (A global reference for human genetic variation. Nature 526, 68–74.). Now I see that the coordinates here are for the sampling locations.

stephenturner commented 1 year ago

@carolhuaxia -- could you let me know which populations you think are in error? This data is directly from the IGSR population page:

https://www.internationalgenome.org/data-portal/population



Population code | Population elastic ID | Population name | Population description | Population latitude | Population longitude | Superpopulation code
-- | -- | -- | -- | -- | -- | --
FIN | FIN | Finnish | Finnish in Finland | 60.17 | 24.93 | EUR
CHS | CHS | Southern Han Chinese | Han Chinese South | 23.13333 | 113.266667 | EAS
KHV | KHV | Kinh Vietnamese | Kinh in Ho Chi Minh City, Vietnam | 10.78 | 106.68 | EAS
BEB | BEB | Bengali | Bengali in Bangladesh | 23.7 | 90.35 | SAS
PUR | PUR | Puerto Rican | Puerto Rican in Puerto Rico | 18.4 | -66.1 | AMR
ACB | ACB | African Caribbean | African Caribbean in Barbados | 13.1 | -59.62 | AFR
ASW | ASW | African Ancestry SW | African Ancestry in Southwest US | 35.483 | -97.53333 | AFR
YRI | YRI | Yoruba | Yoruba in Ibadan, Nigeria | 7.4 | 3.92 | AFR
GWD | GWD | Gambian Mandinka | Gambian in Western Division, The Gambia - Mandinka | 13.454876 | -16.579032 | AFR
JPT | JPT | Japanese | Japanese in Tokyo, Japan | 35.68 | 139.68 | EAS
MSL | MSL | Mende | Mende in Sierra Leone | 8.48 | -13.23 | AFR
CEU | CEU | CEPH | Utah residents (CEPH) with Northern and Western European ancestry | 40.767 | -111.8904 | EUR
ESN | ESN | Esan | Esan in Nigeria | 9.06666 | 7.483333 | AFR
CHB | CHB | Han Chinese | Han Chinese in Beijing, China | 39.916666 | 116.383333 | EAS
CLM | CLM | Colombian | Colombian in Medellin, Colombia | 4.58333 | -74.066666 | AMR
CDX | CDX | Dai Chinese | Chinese Dai in Xishuangbanna, China | 22 | 100.78 | EAS
PEL | PEL | Peruvian | Peruvian in Lima, Peru | -12.04 | -77.03 | AMR
PJL | PJL | Punjabi | Punjabi in Lahore, Pakistan | 31.554606 | 74.357158 | SAS
IBS | IBS | Iberian | Iberian populations in Spain | 40.38 | -3.72 | EUR
TSI | TSI | Toscani | Toscani in Italy | 42.1 | 12 | EUR
MXL | MXL | Mexican Ancestry | Mexican Ancestry in Los Angeles, California | 34.0544 | -118.2439 | AMR
LWK | LWK | Luhya | Luhya in Webuye, Kenya | -1.27 | 36.61 | AFR
GIH | GIH | Gujarati | Gujarati Indians in Houston, TX | 29.7589 | -95.3677 | SAS
STU | STU | Tamil | Sri Lankan Tamil in the UK | 52.486243 | -1.890401 | SAS
ITU | ITU | Telugu | Indian Telugu in the UK | 52.486243 | -1.890401 | SAS
GBR | GBR | British | British in England and Scotland | 52.486243 | -1.890401 | EUR
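
One quick way to see that these coordinates describe where samples were collected, not where a population's ancestry lies, is to look for populations that share an identical coordinate pair. The Python sketch below is only an illustration: it hard-codes a few rows copied from the table above and does not read the package's data, and the helper name `shared_coordinates` is hypothetical.

```python
from collections import defaultdict

# A handful of rows copied from the IGSR table above:
# (population code, description, latitude, longitude)
ROWS = [
    ("CEU", "Utah residents (CEPH) with Northern and Western European ancestry", 40.767, -111.8904),
    ("GIH", "Gujarati Indians in Houston, TX", 29.7589, -95.3677),
    ("STU", "Sri Lankan Tamil in the UK", 52.486243, -1.890401),
    ("ITU", "Indian Telugu in the UK", 52.486243, -1.890401),
    ("GBR", "British in England and Scotland", 52.486243, -1.890401),
    ("PJL", "Punjabi in Lahore, Pakistan", 31.554606, 74.357158),
]


def shared_coordinates(rows):
    """Return {(lat, lon): [codes]} for coordinates used by more than one population."""
    groups = defaultdict(list)
    for code, _description, lat, lon in rows:
        groups[(lat, lon)].append(code)
    return {coord: sorted(codes) for coord, codes in groups.items() if len(codes) > 1}


shared = shared_coordinates(ROWS)
for (lat, lon), codes in shared.items():
    print(f"{lat}, {lon}: {', '.join(codes)}")
```

In this subset, STU, ITU and GBR all carry the same UK coordinate, which fits a sampling site and not three separate ancestral locations.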

carolhuaxia commented 1 year ago

Thank you for your reply. At first, I thought the locations of the populations CEU, GIH, STU and ITU were incorrect, because they did not match their positions in the 1KGP 2015 paper. I have marked them on the map below. Map from the 1KGP paper: [image]

Map from your script: [image]

But now I understand the disagreement: the locations in your script are the sampling locations, not the populations' ancestral locations.