Open katewarner opened 1 month ago
This is because the MongoDB instance containing the records/rows from human_glycogenes_glycoenzonto.csv is old; we forgot to update it after we fixed the duplication issue. This will go away when we push 2.7.1 to prd. Please keep this ticket assigned to you and check it again when we update data.tst.glygen.org.
Please check your script for processing the human_protein_glycogenes_glycoenzonto.csv dataset.
Jeet noticed that there are duplicated rows in the human_protein_glycogenes_glycoenzonto.csv dataset (https://data.glygen.org/GLY_000922), e.g. Q96EU7-1, Q2PZI1-1, U3KPV4-1, etc. I checked the downloaded file you use to create the dataset (/data/projects/glygen/downloads/glycogenes/current/human_glycogenes_glycoenzonto.csv) and couldn't find any duplicated rows, so I think there may be an issue with the processing script for this dataset.
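For anyone who wants to repeat the duplicate check on either the downloaded file or the processed dataset, here is a minimal sketch using only the Python standard library. The function name `find_duplicate_rows`, the column names, and the sample values are hypothetical and only for illustration; this is not the actual processing script, and it treats two rows as duplicates only when every field matches exactly.

```python
import csv
import io
from collections import Counter


def find_duplicate_rows(csv_file):
    """Return {row_tuple: count} for every data row that occurs more than once.

    csv_file is any open text file object; the first line is treated as a header.
    """
    reader = csv.reader(csv_file)
    next(reader, None)  # skip the header row, if there is one
    counts = Counter(tuple(row) for row in reader)
    return {row: n for row, n in counts.items() if n > 1}


# Hypothetical sample data: column names and second-column values are made up.
sample = io.StringIO(
    "accession,example_field\n"
    "Q96EU7-1,value_a\n"
    "Q2PZI1-1,value_b\n"
    "Q96EU7-1,value_a\n"
)

for row, n in find_duplicate_rows(sample).items():
    print(n, row)
```

To run it against a real file, open the file with `open(path, newline="")` and pass the file object to `find_duplicate_rows`. An empty result for the download but a non-empty result for the processed CSV would point at the processing step rather than the source data.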