ebmdatalab / clinicaltrials-act-converter


New QC Tracking #2

Open NickCEBM opened 5 years ago

NickCEBM commented 5 years ago

This adjusts our pipeline to include all the details about the QC process directly in the CSV that runs the website. This will allow us to run the entire website off of the CSV and not rely on externally scraping QC info.

There are also a few minor changes and corrections throughout for both future use and convenience:

  1. All completion dates are now explicitly in the CSV rather than just the ones we use. These are used on occasion and previously had to be retrieved from the intermediate steps of the old SQL query. Now they are just included here for convenience.
  2. Created some new variables for easier use both on the website and in the code. These include covered, plus variables that, when converted to bools, should add up to the numerator and denominator used on the website.
  3. Fixed a bug in the defaulted_date section
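
To illustrate the second point, here is a minimal sketch of how the website side might derive the numerator and denominator by summing boolean columns. It assumes `inc_in_num` feeds the numerator and `results_due` feeds the denominator; the denominator choice, the `nct_id` column, the 1/0 cell encoding, and the helper names are all assumptions for illustration, not the actual pipeline code.

```python
import csv
import io

# Assumption: truthy cells are written as one of these strings.
TRUE_VALUES = {"1", "true", "t", "yes"}


def to_bool(value):
    """Interpret a CSV cell as a boolean; blank or unrecognised cells count as False."""
    return str(value).strip().lower() in TRUE_VALUES


def reporting_counts(csv_file, numerator_col="inc_in_num", denominator_col="results_due"):
    """Sum two boolean columns to get the (numerator, denominator) pair."""
    numerator = denominator = 0
    for row in csv.DictReader(csv_file):
        numerator += to_bool(row.get(numerator_col, ""))
        denominator += to_bool(row.get(denominator_col, ""))
    return numerator, denominator


# Made-up rows, for illustration only.
sample = io.StringIO(
    "nct_id,results_due,inc_in_num\n"
    "NCT00000001,1,1\n"
    "NCT00000002,1,0\n"
    "NCT00000003,0,0\n"
)
counts = reporting_counts(sample)
```

With these sample rows, one due trial counts toward the numerator and two toward the denominator.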

To use the new CSV fields in the QC pipeline:

  1. We no longer need to scrape to look for results of due trials. inc_in_num should now calculate the numerator directly.
  2. For trials in QC, we can calculate the days late (if late) from first_submission_qc
  3. cancelled_now tells you if the results are currently cancelled. If results_due is true and cancelled_now is true, the trial should enter the overdue - cancelled status on the website (this is accounted for in the inc_in_num calculation)
  4. In order to stay consistent with the way we currently handle cancelled trials, when a trial is cancelled and then resubmitted, we calculate days overdue as of the next submission after cancellation. new_submission_date gives you this date. However, once a trial moves from having pending results to having full results, then we can revert to just using results_submitted_date as we currently do to calculate days late (if any).
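
The rules in points 2 to 4 can be sketched roughly as below. This is only an illustration of the intended logic, not the implementation in this PR: `has_full_results`, `due_date`, and the function names are hypothetical, and the real code works from the QC data array rather than pre-parsed dates.

```python
from datetime import date
from typing import Optional


def lateness_reference_date(
    has_full_results: bool,
    results_submitted_date: Optional[date],
    first_submission_qc: Optional[date],
    new_submission_date: Optional[date],
) -> Optional[date]:
    """Pick the submission date that days late should be measured from."""
    if has_full_results:
        # Full results posted: revert to the current behaviour.
        return results_submitted_date
    if new_submission_date is not None:
        # Cancelled and then resubmitted: use the next submission after cancellation.
        return new_submission_date
    # Still in QC with no cancellation: use the first QC submission.
    return first_submission_qc


def days_late(due_date: date, reference: Optional[date]) -> Optional[int]:
    """Days past the due date, floored at 0; None if nothing was submitted."""
    if reference is None:
        return None
    return max((reference - due_date).days, 0)


def is_overdue_cancelled(results_due: bool, cancelled_now: bool) -> bool:
    """True when the trial should show the overdue - cancelled status."""
    return results_due and cancelled_now
```

For example, a trial due on 2020-01-01 that was cancelled and resubmitted on 2020-01-11 would be measured from `new_submission_date` while its results are still pending, and from `results_submitted_date` once full results are posted.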

NickCEBM commented 5 years ago

Just want to point out that this should ideally be pretty thoroughly vetted before implementing live. Lots of moving parts here and working with the QC data array is not the most straightforward thing.