OuhscBbmc / StatisticalComputing

OUHSC's SCUG (Statistical Computing Users Group)
MIT License

strategy for notification when a live dataset is changed #7

Closed wibeasley closed 6 years ago

wibeasley commented 7 years ago

To @OuhscBbmc/dhswaiverpushpull, @mand9472, and others: @Maleeha recently asked "is there a possibility that a notification is sent to those who are responsible for the variables whenever they change on REDCap?". For context, we have been making some necessary changes to the REDCap database where people are entering live data.

I don't have a good solution for this recurring scenario, and I would appreciate suggestions for how we might do it better. I'm affected both as a producer (eg, the main person responsible for the DHS Waiver ferry) and as a consumer (eg, OSDH's ETO). Here are the strategies that we currently (try to) employ:

  1. Make changes to the data source in large but infrequent batches. I think a slow trickle of changes is harder (for the ~7 analysts on our project) to monitor.

  2. Assign a questionnaire/instrument to as few analysts as possible, and encourage each analyst to feel ownership of that questionnaire and its maintenance. That also reduces the number of people affected by a change.

  3. If only a few variables are changed, then grep the variable name and notify the analysts owning the files containing that variable name (eg $ grep -rl "iss_assigned_date" ~/Bbmc/DhsWaiver).

  4. If broad changes occur, then make a general announcement and request that everyone check their reports against the development version of the database/ferry (eg, the issue called "prospectively check your reports against the v1.1 ferry").

    The recent change affected almost all the important variables (eg, the group assignment), so I assigned it to almost everyone. I didn't assume anyone's reports would be immune.


  5. This transition from the old database to the new database is tough, and we avoid doing it when reports need to be run. I know the proper software engineering strategy is to branch the repo, and eventually merge it into the master after everything's tested. Although I do this for proper R packages, I've been reluctant to branch repos that mainly hold analysis code and reports.
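For strategy 3, here is a rough sketch of how the grep step might be scripted when several variables change at once. It is only an illustration: the function name `files_mentioning` and the default file extensions are assumptions of mine, not something the project already has, and it does a plain substring match much like the `grep -rl` call above.

```python
from pathlib import Path


def files_mentioning(root, variables, suffixes=(".r", ".rmd", ".sql")):
    """Map each variable name to the sorted list of files under `root` that contain it.

    `suffixes` is an assumed set of analysis-file extensions (compared in lowercase).
    """
    hits = {variable: [] for variable in variables}
    for path in sorted(Path(root).rglob("*")):
        # Skip directories and files that are not analysis code or reports.
        if not path.is_file() or path.suffix.lower() not in suffixes:
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        for variable in variables:
            if variable in text:  # plain substring match, no regex
                hits[variable].append(str(path))
    return hits
```

Calling `files_mentioning(Path.home() / "Bbmc" / "DhsWaiver", ["iss_assigned_date"])` would return a dictionary keyed by variable name. Someone would still need to map each file to the analyst who owns it before notifying anyone.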

spun off of #177 (cc: @andkov & @aggie-dba)

Maleeha commented 7 years ago

Thank you @wibeasley.