This is most probably an issue upstream with the data submission.
Example:
https://www.targetvalidation.org/evidence/ENSG00000133703/EFO_0000096
Title of the abstract: Circulating tumour DNA sequence analysis as an alternative to multiple myeloma bone marrow aspirates.
Sentence not highlighted:
"We report here a hybrid-capture-based Liquid Biopsy Sequencing (LB-Seq) method used to sequence all protein-coding exons of KRAS, NRAS, BRAF, EGFR and PIK3CA in 64 cfDNA specimens from 53 myeloma patients to >20,000 × median coverage."
Please note the '×' returned by the Open Targets API. In the original EPMC article this is a multiplication sign (Unicode ×, U+00D7).
The most likely explanation is the way EPMC extracts the information to produce the evidence strings: the text may not be Unicode-encoded at that stage. The other explanation would be that the data are correctly encoded in the original submission but the data pipeline doesn't handle them properly.
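To illustrate why a differently encoded '×' would stop the sentence from being highlighted, here is a minimal Python sketch. It assumes highlighting relies on finding the evidence sentence verbatim in the abstract text; that is a guess, not something confirmed from the web app or pipeline code. The helper names `normalise` and `can_highlight` are hypothetical, and the two "broken" strings are only examples of what the API might return, not captured output.

```python
import html
import unicodedata


def normalise(text: str) -> str:
    """Decode HTML entities, then apply Unicode NFC normalisation."""
    return unicodedata.normalize("NFC", html.unescape(text))


def can_highlight(sentence: str, abstract: str) -> bool:
    """Return True if the sentence is found in the abstract after normalisation."""
    return normalise(sentence) in normalise(abstract)


# Shortened abstract text containing the real multiplication sign (U+00D7).
abstract = "... from 53 myeloma patients to >20,000 \u00d7 median coverage."

# Hypothetical case 1: the sign arrives as an HTML character reference.
as_entity = "53 myeloma patients to >20,000 &#xD7; median coverage."

# Hypothetical case 2: UTF-8 bytes were decoded with a single-byte codec.
as_mojibake = (
    "53 myeloma patients to >20,000 \u00d7 median coverage."
    .encode("utf-8")
    .decode("cp1252")
)

print(as_entity in abstract)                 # naive substring match fails
print(can_highlight(as_entity, abstract))    # entity decoding repairs it
print(can_highlight(as_mojibake, abstract))  # mis-decoding is not repaired

# The mis-decoded text can only be recovered by reversing the wrong decode.
repaired = as_mojibake.encode("cp1252").decode("utf-8")
print(can_highlight(repaired, abstract))
```

If the first case applies, decoding entities before matching would be enough. If the second applies, the fix has to happen wherever the bytes are first decoded, whether upstream or in the pipeline.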
I've already updated the pipeline to handle UTF-8 characters better; see opentargets/data_pipeline#322. We should check whether this is still a problem with the 18.10 release.