I was trying to load gwas_catalog_trait-mappings_r2024-07-27.tsv into a MySQL table and the uniqueness constraint kept failing.
I tracked the issue down to a duplicate entry for Complement C5 levels (http://www.ebi.ac.uk/efo/EFO_0020278): one row has a <FEFF> character (U+FEFF, a zero-width character also used as the byte order mark) inside the word 'Complement' and the other one does not:
Complement C5 levels complement C5 measurement http://www.ebi.ac.uk/efo/EFO_0020278 Other measurement http://www.ebi.ac.uk/efo/EFO_0001444
C<feff>omplement C5 levels complement C5 measurement http://www.ebi.ac.uk/efo/EFO_0020278 Other measurement http://www.ebi.ac.uk/efo/EFO_0001444
Upon inspection, <FEFF> is present in other values in the file as well, e.g.:
Total cholesterol in IDL meal response (<feff>OrNLSr)
Triglycerides levels in very large VLDL meal response (<feff>OrNLSr)
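
In case it helps anyone else hitting this, here is a minimal sketch of how the file could be checked and cleaned before loading. It assumes the file is UTF-8 and that every U+FEFF in it is unwanted. The function names (find_bom_lines, strip_bom, dedupe, clean_file) are my own and not part of any GWAS Catalog tooling.

```python
BOM = "\ufeff"  # U+FEFF: zero-width no-break space / byte order mark


def find_bom_lines(lines):
    """Return (line_number, line) pairs for every line containing U+FEFF."""
    return [(n, line) for n, line in enumerate(lines, start=1) if BOM in line]


def strip_bom(lines):
    """Return the lines with every U+FEFF character removed."""
    return [line.replace(BOM, "") for line in lines]


def dedupe(lines):
    """Drop exact duplicate lines, keeping the first occurrence and the order."""
    seen = set()
    unique = []
    for line in lines:
        if line not in seen:
            seen.add(line)
            unique.append(line)
    return unique


def clean_file(src_path, dst_path):
    """Write a cleaned, de-duplicated copy of src_path to dst_path.

    Assumes UTF-8 input (an assumption, not confirmed for this file).
    Returns the number of input lines that contained U+FEFF.
    """
    # newline="" keeps the original line endings untouched.
    with open(src_path, encoding="utf-8", newline="") as src:
        lines = src.readlines()
    affected = len(find_bom_lines(lines))
    with open(dst_path, "w", encoding="utf-8", newline="") as dst:
        dst.writelines(dedupe(strip_bom(lines)))
    return affected
```

Something like clean_file("gwas_catalog_trait-mappings_r2024-07-27.tsv", "cleaned.tsv") would then produce a copy to load. Note that once the character is stripped, the two Complement C5 levels rows become identical, which is why the sketch also drops exact duplicates.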