Closed by pradeeban 3 years ago
I have made a Python script for this. The repo is at https://github.com/Nishchal-007/CheckingNull and it also contains a dummy CSV file for testing purposes.
Let me know if there's anything else to be done.
A few pointers:
empi, accession
32323, 32323323232
23323,
323232, 332323
, 3232332323
In this case, lines 2 and 4 should be skipped, but the rest of the extraction should go on.
The entire CSV should not be treated as empty or invalid and ignored. Rather, the execution should succeed despite the lines with missing values.
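To illustrate the expected behavior, here is a minimal sketch, not the Niffler implementation. It assumes the sample above has one missing value on data rows 2 and 4, and the function name `valid_rows` is made up for this example.

```python
import csv
import io

# Sample data mirroring the example above (assumed layout):
# data rows 2 and 4 each lack one value.
SAMPLE = """empi,accession
32323,32323323232
23323,
323232,332323
,3232332323
"""

def valid_rows(lines):
    """Yield rows whose fields are all non-empty; skip incomplete rows."""
    reader = csv.reader(lines)
    next(reader, None)  # skip the header row
    for row in reader:
        fields = [field.strip() for field in row]
        if fields and all(fields):
            yield fields
        # Incomplete rows are skipped and iteration moves on to the next row.

rows = list(valid_rows(io.StringIO(SAMPLE)))
print(rows)
```

Only the two complete rows come out, and the incomplete ones do not stop the loop.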
Also, I recommend cloning the repository and submitting pull requests, rather than creating private repositories with stand-alone code. The code changes will modify https://github.com/Emory-HITI/Niffler/blob/dev/modules/cold-extraction/ColdDataRetriever.py (please use the dev branch).
Okay, will do that!
I have forked and made changes to the cold-extraction module and added the checkCSV.py file. Can you please check it out?
That pull request modifies 67 files. Please resubmit a new pull request without modifying other files, also addressing other comments in the pull request.
Yeah, I checked the points in the pull request. My approach to the current issue is that if there are any missing values in the file, I drop those rows right at the beginning. Is this approach okay, or should I change it?
Regarding the file changes, I did "add all", which is why all the files were affected. No file was actually modified internally.
Ideally, you should drop them on the go; otherwise, reading the whole file up front will introduce an initial delay.
Also, there is no reason to create a separate Python class. You can modify modules/cold-extraction/ColdDataRetriever.py.
You can use the "csv_file" variable there, as it already points to the CSV file.
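As a rough sketch of what "on the go" could look like (an illustration only, not the code in ColdDataRetriever.py): a generator filters rows while the file is being read, so no separate cleaning pass is needed before extraction starts. The name `iter_complete_rows` is hypothetical, and an in-memory `io.StringIO` stands in for the file that `csv_file` points to.

```python
import csv
import io

def iter_complete_rows(handle):
    """Lazily yield complete rows from an open CSV handle, one at a time."""
    for row in csv.reader(handle):
        if row and all(field.strip() for field in row):
            yield row
        # Rows with an empty field are dropped as they are encountered.

# Stand-in for something like open(csv_file, newline=""); the real path
# would come from the module's existing csv_file variable.
handle = io.StringIO("1,100\n2,\n3,300\n")

rows = iter_complete_rows(handle)
first = next(rows)  # usable without a prior pass over the whole file
rest = list(rows)
```

The caller can start working on `first` right away, while the remaining rows are validated only when they are reached.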
Okay
Also, I'm applying for GSoC 2021.
Made the necessary changes in the ColdDataRetriever.py file
Please pay attention to the comments.
https://github.com/Emory-HITI/Niffler/pull/115/files changes files unnecessarily. Please avoid changing files that have no relevance to the pull request. Also, I believe you are modifying only the ColdDataRetriever.py and other changes (such as a sample CSV file) are not necessary.
Only changed the ColdDataRetriever.py file
Added some important comments in your pull request. Please attend to them and submit an updated request when you have tested the changes.
A more specific comment to elaborate on this issue:
Please note that you should check for missing entries only in the mandatory fields.
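One possible way to restrict the check to mandatory fields is sketched below. The `MANDATORY` mapping, the extraction-type keys, and the `note` column are all hypothetical names chosen for illustration; the actual field names and extraction types in Niffler may differ.

```python
import csv
import io

# Hypothetical mapping from extraction type to its mandatory columns.
MANDATORY = {
    "empi_accession": ("empi", "accession"),
    "empi_only": ("empi",),
}

def rows_with_mandatory(handle, extraction_type):
    """Yield records whose mandatory columns are all filled in."""
    required = MANDATORY[extraction_type]
    for record in csv.DictReader(handle, skipinitialspace=True):
        # Optional columns may be blank; only the required ones are checked.
        if all((record.get(name) or "").strip() for name in required):
            yield record

# The "note" column is optional in this made-up sample.
DATA = "empi,accession,note\n1,100,\n2,,follow-up\n3,300,x\n"

kept = [r["empi"] for r in rows_with_mandatory(io.StringIO(DATA), "empi_accession")]
empi_only = [r["empi"] for r in rows_with_mandatory(io.StringIO(DATA), "empi_only")]
```

With this sample, the row lacking an accession is skipped only when the accession is mandatory for the chosen extraction type, and a blank optional column never causes a skip.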
Hey, I made the required changes to the ColdDataRetriever.py file
Can you please check it out?
Looks good now. Merged to dev. Thanks.
For example, missing accession in an EMPI-Accession extraction.