Open eks-ebi opened 3 years ago
I have recently been trying to extract studies from the GWAS Catalog website, and I think there may be a bug in the downloaded CSV file. Here are the details:
Summary: Different ancestries are not separated by a delimiter in the downloaded CSV file.
Submit Date: 14-09-2021
Reporter: YUE JI
Platform: GWAS catalog
Operating System: macOS Big Sur (version 11.5.2)
Browser: Google Chrome
URL: https://www.ebi.ac.uk/gwas/efotraits/EFO_0003770
I am interested in studies of diabetic retinopathy (EFO_0003770), so I searched for EFO_0003770 in the GWAS Catalog and downloaded the CSV file for one study (GCST007292) from the studies table.
In the downloaded CSV file, multiple ancestries in the "Discovery sample number and ancestry" column are not separated by any delimiter. The same is true of the "Replication sample number and ancestry" column.
I also tried downloading all studies for this trait, and the same problem occurs in studies GCST007289 and GCST003042.
I also downloaded all studies for another trait, EFO_0004574 (https://www.ebi.ac.uk/gwas/efotraits/EFO_0004574). There, too, neither of these two columns has a separator to distinguish multiple ancestries.
"1852 African American or Afro-Caribbean3094 European" appears in the "Discovery sample number and ancestry" column, with no delimiter between the two ancestries.
"984 South Asian2710 Hispanic or Latin American15918 European771 South East Asian13703 East Asian" appears in the "Replication sample number and ancestry" column, again with no delimiter.
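Until the download is fixed, the fused values can probably be split on the user's side. The following is a minimal Python sketch of such a workaround, not the Catalog's own fix. It assumes each cell is a run of "<count> <ancestry label>" pairs with nothing in between, that counts are plain digits, and that ancestry labels never contain digits. The function name `split_ancestries` is hypothetical.

```python
import re

# Assumption: a cell looks like "<count> <label><count> <label>...",
# where counts are plain digits and labels contain no digits.
_PAIR = re.compile(r"(\d+)\s+(\D+?)(?=\d|$)")


def split_ancestries(cell: str) -> list[tuple[int, str]]:
    """Split a fused ancestry cell into (sample count, ancestry label) pairs."""
    return [
        (int(count), label.strip())
        for count, label in _PAIR.findall(cell.strip())
    ]


if __name__ == "__main__":
    fused = "1852 African American or Afro-Caribbean3094 European"
    pairs = split_ancestries(fused)
    # Re-join with an explicit delimiter so the ancestries are distinguishable.
    print("; ".join(f"{count} {label}" for count, label in pairs))
```

The lazy `\D+?` together with the lookahead `(?=\d|$)` ends each label right before the next count or at the end of the cell. If the export ever contains counts with thousands separators or labels with digits, the pattern would need adjusting.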
If a solution is found, we should inform Yue: yueji@ebi.ac.uk
This is a bug in the construction of the CSV file generated from the Studies table in the GWAS Catalog UI.
Reported by Yue Ji from Aoife's group, so I will copy Yue's email below: