Sage-Bionetworks / Genie

Validation and processing of GENIE files
https://genie.synapse.org/
MIT License

[GEN-846] & [GEN-845] Ignore case and allow underscores in cross-validate #536

Closed rxu17 closed 1 year ago

rxu17 commented 1 year ago

Purpose: Ignore case and allow underscores when cross-validating SEQ_ASSAY_ID values in clinical files against the SEQ_ASSAY_ID values in assay_information files and against bed file names.

This adds a new function, standardize_string_for_validation, to validate.py. The function is called in every affected clinical-to-assay_information and clinical-to-bed cross-validation function in clinical.py.

Example: SAGE_tEst-1 should compare as equal to Sage-test-1, since the two differ only in case and in underscore versus hyphen.
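
For reviewers who want to see the intended behavior concretely, here is a minimal sketch of the normalization. It is not the code from this PR: the function name comes from the description above, but the parameters (`ignore_case`, `allow_underscore`), the choice to upper-case, and the mapping of underscores to hyphens are assumptions inferred from the example. `find_unmatched_ids` is a hypothetical helper added only to illustrate a cross-validation check.

```python
def standardize_string_for_validation(value, ignore_case=True, allow_underscore=True):
    """Return a normalized copy of `value` for comparison purposes only.

    Assumed behavior (not confirmed by the PR text): case is folded by
    upper-casing, and underscores are mapped to hyphens.
    """
    # Leave non-string values (for example NaN from a data frame) untouched.
    if not isinstance(value, str):
        return value
    if ignore_case:
        value = value.upper()
    if allow_underscore:
        value = value.replace("_", "-")
    return value


def find_unmatched_ids(clinical_ids, assay_ids):
    """Hypothetical helper: clinical IDs with no match among the assay IDs."""
    known = {standardize_string_for_validation(a) for a in assay_ids}
    return [
        c for c in clinical_ids
        if standardize_string_for_validation(c) not in known
    ]


# The two spellings from the example normalize to the same string.
same = (
    standardize_string_for_validation("SAGE_tEst-1")
    == standardize_string_for_validation("Sage-test-1")
)
unmatched = find_unmatched_ids(["SAGE_tEst-1", "OTHER-2"], ["Sage-test-1"])
```

Normalizing both sides before comparing leaves the original values unchanged, so any error message can still report the SEQ_ASSAY_ID exactly as it appears in the submitted file.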

This also completes JIRA ticket: GEN-845