Closed rjiang9 closed 2 months ago
Hi Ray,
clinical_ETL_code can convert all dates to intervals relative to the reference date, which should be the donor's earliest date_of_diagnosis
(as you mentioned, this is set in the manifest). It converts that earliest date of diagnosis to 0
and calculates all other dates relative to it, so date_of_birth becomes a negative interval that represents the donor's age at first diagnosis.
In this test data, the date_of_death field shows that you can instead submit an integer directly, representing an interval in days or months depending on the donor's date_resolution
value. We provided this option because some users did not have access to the raw dates and needed to submit the intervals directly.
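To make the arithmetic concrete, here is a minimal Python sketch of the kind of calculation described above. It is not the clinical_etl implementation: the helper name `interval_from_reference`, the example dates, and the way months are counted (whole calendar months, ignoring the day of the month) are all assumptions for illustration only.

```python
from datetime import date


def interval_from_reference(event: date, reference: date, resolution: str = "day") -> int:
    """Hypothetical helper (not part of clinical_etl).

    Returns the signed interval from `reference` to `event`,
    in days or in whole calendar months.
    """
    if resolution == "day":
        return (event - reference).days
    if resolution == "month":
        # Assumption: count calendar months and ignore the day of the month.
        return (event.year - reference.year) * 12 + (event.month - reference.month)
    raise ValueError(f"unsupported resolution: {resolution!r}")


# Made-up dates: the reference is the donor's earliest date_of_diagnosis.
reference = date(2020, 3, 15)

print(interval_from_reference(reference, reference))                  # 0
print(interval_from_reference(date(2021, 3, 15), reference))          # 365
print(interval_from_reference(date(1960, 3, 15), reference, "month")) # -720
```

The reference date itself maps to 0, later dates are positive, and earlier dates such as date_of_birth come out negative.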
So it is up to the user whether clinical_etl calculates the intervals or they submit the intervals as integers directly. Compare these two lines in the test mapping csv: https://github.com/CanDIG/clinical_ETL_code/blob/c5322991b18ad5e29972615a8a52dccc60cad681/tests/test2mohv2.csv#L13-L14
If you have the raw dates, it would be simplest to use the date_interval() method for all the date fields in your csv mapping template so that clinical_etl calculates the intervals for you.
Hope this helps and let me know if you need any further explanation.
Thank you so much for the detailed explanations, Marion. It is very helpful. I appreciate it.
Hi Marion,
Another question to bug you with: when preparing the data files (splitting the exported REDCap data file into separate files), the dates are required to be intervals relative to the reference_date set in the manifest.yml file. My question is:
Do I need to process each date field myself, or will they be taken care of by ETL_code when running CSVConvert?
Thanks @mshadbolt, Ray