AMP-SCZ / utility

Storehouse for all utility scripts
Apache License 2.0

Value replacement in RPMS needs revision #58

Open tashrifbillah opened 1 year ago

tashrifbillah commented 1 year ago

https://github.com/AMP-SCZ/utility/blob/5746df53869819cb5ca5783dd209237a1a94ca25/replace_RPMS_values.py#L71

This scheme does not work when a single cell contains a comma.

Example: the PSYCHS FU P1P8 form.
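
A minimal illustration of the failure mode, assuming the linked line splits each row on `,` (that is an assumption about the script, and the row below is made up rather than taken from an RPMS export). It contrasts a plain split with the standard-library `csv` reader, which respects quoting:

```python
import csv
import io

# Hypothetical row: the second cell holds a comma inside quotes.
line = '1,"mild, intermittent",3\n'

# A plain split treats the quoted comma as a delimiter and yields 4 pieces.
naive = line.rstrip("\n").split(",")

# csv.reader honors the quotes and yields the 3 intended cells.
parsed = next(csv.reader(io.StringIO(line)))

print(len(naive), naive)
print(len(parsed), parsed)
```

With the plain split, every cell after the quoted one shifts by one position, so a replacement aimed at a column index lands on the wrong value.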

tashrifbillah commented 1 year ago

Partially fixed via https://github.com/AMP-SCZ/utility/commit/004569338027c63e666f57015df99fc5cb554768, but it still cannot replace values positioned after cells that contain \n (an embedded newline). Two such examples are:

PrescientStudy_Prescient_sofas_screening_14.01.2023.csv: "chrsofas_missing"
PrescientStudy_Prescient_sofas_followup_14.01.2023.csv: "chrsofas_missing_fu"

More work is needed. Going through a pandas DataFrame is a reliable option, but it will convert integers to floats uncontrollably.
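
One possible direction, sketched here under assumptions rather than as the fix adopted in this repo: the standard-library `csv` module parses quoted cells containing commas or newlines and keeps every cell as text, so integers are written back exactly as read. The helper name `replace_value`, the sample rows, and the values `-3` and `-9` are hypothetical; only the column names come from this thread.

```python
import csv
import io


def replace_value(text, column, old, new):
    """Replace `old` with `new` in one column, keeping every cell as text."""
    # newline="" leaves newlines untouched so the csv module can handle
    # line breaks that sit inside quoted cells.
    reader = csv.DictReader(io.StringIO(text, newline=""))
    out = io.StringIO(newline="")
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames,
                            lineterminator="\n")
    writer.writeheader()
    for row in reader:
        # Cells are plain strings, so "1" is never turned into "1.0".
        if row[column] == old:
            row[column] = new
        writer.writerow(row)
    return out.getvalue()


# Hypothetical data: the middle cell has both a newline and a comma.
sample = (
    'LastModifiedDate,comment,chrsofas_missing\n'
    '14.01.2023,"first line\nsecond line, with comma",-3\n'
    '14.01.2023,ok,1\n'
)

result = replace_value(sample, "chrsofas_missing", "-3", "-9")
print(result)
```

If pandas is preferred, reading with `dtype=str` and `keep_default_na=False` should have a similar effect, since the float conversion typically comes from integer columns with empty cells being upcast; that has not been checked against the RPMS exports.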

tashrifbillah commented 1 year ago

It also fails for lines starting with LastModifiedDate, for cells positioned after cells that contain a comma.