OpenFn / unicef-cambodia

UNICEF Cambodia - Primero Interoperability
https://openfn.github.io/unicef-cambodia/

Automated migration of historical Oscar cases to Primero #72

Closed aleksa-krolls closed 2 years ago

aleksa-krolls commented 3 years ago

Background, context, and business value

For the go-live, Cambodia would like to sync historical OSCaR cases to Primero using our existing jobs & mappings.

From Primero team: We ran a quick experiment on a test environment yesterday to gauge how quickly OpenFn can post cases to the Primero API this weekend when it creates the OSCaR historical data.

From Oscar team: Please divide the historical case load into 255 separate HTTP requests, so ~14,000 cases / 255 requests = ~55 cases per request.

The specific request, in as few words as possible

  1. Create a version of the job f2-j1-getOscarCases.js to send GET requests to '/api/v1/organizations/clients' for every date between 2016-02-25 and today, filtering by since_date. Please chunk these per Kiry's requirements to send 55 cases per request (see above).
  2. Oscar should respond with the cases registered in that date range.
  3. This should trigger f2-j2 to upload the cases to Primero. See above for Primero's comments on bulk uploads.
  4. We then need to run Flow 1 to sync the Primero IDs back to Oscar, but it's not critical that this happens immediately after Flow 2 is complete.

Consider that some dates may return NO cases.
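
To make the batching arithmetic concrete, here is a minimal sketch of splitting a list of cases into batches of 55. The `chunk` helper and the placeholder `cases` array are hypothetical and are not taken from the existing jobs; it only illustrates the ~14,000 / 55 split and the empty-response case.

```javascript
// Hypothetical helper (not part of the existing jobs): split an array into
// batches of at most `size` items.
function chunk(items, size) {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError('size must be a positive integer');
  }
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Placeholder records standing in for the ~14,000 historical cases.
const cases = Array.from({ length: 14000 }, (_, i) => ({ id: i + 1 }));
const batches = chunk(cases, 55);

console.log(batches.length); // 255 batches: 254 full ones plus a final batch of 30

// A date that returns no cases produces no batches, so nothing would be sent.
console.log(chunk([], 55).length); // 0
```

Each batch could then be sent as its own request, which keeps the per-request size at or below 55 regardless of how many cases a given date returns.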

I think this should be a separate historicalSync job we can run at a later date if needed. Given the volumes what is your recommended approach for this sync?

state.json

NOTE: We will not run on prod until the weekend. So in the meantime, please use these test environment logins from LP... Oscar Staging API User Primero Alpha Cambodia

adaptor

language-http language-primero

expression.js

For this migration, first we'll run Flow 2.

Flow 2: Oscar --> Primero

  1. https://github.com/OpenFn/unicef-cambodia/blob/master/jobs/f2-j1-getOscarCases.js --> modify date cursor to fetch historical cases
  2. https://github.com/OpenFn/unicef-cambodia/blob/master/jobs/f2-j2-upsertCasesToPrimero.js --> Here we need to consider Primero load

Flow 1: Primero --> Oscar

  1. https://github.com/OpenFn/unicef-cambodia/blob/master/jobs/f1-j1-getPrimeroCases.js
  2. https://github.com/OpenFn/unicef-cambodia/blob/master/jobs/f1-j2-casesToOscar.js --> Chunk this per Kiry's requirements

output.json

All 14k historical Oscar cases should be synced to Primero using the same mappings as the regular flow. We expect that these historical cases will not have services.

aleksa-krolls commented 3 years ago

@taylordowns2000 Let's talk through this when I'm back from my exam?

aleksa-krolls commented 3 years ago

@taylordowns2000 This is ready to build. I've updated the issue description. Keep me posted with questions - will have lots of time tomorrow to help test.

taylordowns2000 commented 3 years ago

Apologies that this didn't come up earlier. A few outstanding questions:

  1. Is OSCaR asking us to limit the number of cases that they respond with in each request? I'm not sure I know how to do that. We could ask for all the cases in a given day, but I'm not sure I know how to limit them to only provide N cases per request.
  2. When those cases are added to Primero, will the updated_at timestamp on those cases be now or some time in the past? If now, will they automatically get picked up (now in very large quantities) by the hourly Primero-to-OSCaR sync?

For the first Q, I'm confused because it doesn't look like OSCaR allows you to either (a) limit the number of clients in the response or (b) provide a date range for your request. I might be missing something (eish, I usually am) but the only way I see to do this, per their documentation, is to make a request with a since_date of 2000-01-01 and see how many cases come back.

aleksa-krolls commented 3 years ago

@taylordowns2000 As discussed...

Flow 2: Oscar --> Primero

  1. f2-j1: GET from Oscar where since_date: 2016-02-25 00:00:00.000000 and end_date: 2016-07-25 00:00:00.000000 (this is a 5-month range; send one GET request for every 5-month window until the present day).
  2. Returns 14k cases (ensure NO cases where organization_name: demo)
  3. f2-j2: Upsert in Primero
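
The Flow 2 steps above can be sketched as follows. This is only an illustration under stated assumptions: `formatOscarDate`, `buildWindows`, and `excludeDemo` are hypothetical helpers, dates are treated as UTC, and adjacent windows share a boundary timestamp (whether Oscar treats `end_date` as inclusive is not stated in this thread). The example uses a fixed end date so the output is predictable; a real run would pass `new Date()`.

```javascript
// Hypothetical helper: format a Date as "YYYY-MM-DD HH:mm:ss.000000" (UTC assumed).
function formatOscarDate(date) {
  return date.toISOString().slice(0, 19).replace('T', ' ') + '.000000';
}

// Hypothetical helper: build consecutive windows of `months` months between
// `start` and `end`. The last window is clipped to `end`.
// Note: a start day of 29-31 could overflow into the next month; day 25 is safe.
function buildWindows(start, end, months = 5) {
  const windows = [];
  let cursor = new Date(start.getTime());
  while (cursor < end) {
    const next = new Date(cursor.getTime());
    next.setUTCMonth(next.getUTCMonth() + months);
    const windowEnd = next < end ? next : end;
    windows.push({
      since_date: formatOscarDate(cursor),
      end_date: formatOscarDate(windowEnd),
    });
    cursor = next;
  }
  return windows;
}

// Hypothetical helper: drop cases belonging to the demo organization.
function excludeDemo(cases) {
  return cases.filter(c => c.organization_name !== 'demo');
}

const windows = buildWindows(
  new Date(Date.UTC(2016, 1, 25)), // 2016-02-25
  new Date(Date.UTC(2017, 0, 1))   // fixed end date, for illustration only
);
console.log(windows.length); // 3
console.log(windows[0]);
// { since_date: '2016-02-25 00:00:00.000000', end_date: '2016-07-25 00:00:00.000000' }

// Placeholder response data; the organization names other than "demo" are made up.
const kept = excludeDemo([
  { id: 1, organization_name: 'demo' },
  { id: 2, organization_name: 'org-a' },
]);
console.log(kept.length); // 1
```

Because windows share boundaries, a case stamped exactly on a boundary might be returned twice; since f2-j2 upserts, a repeat should update rather than duplicate, though that is worth confirming on the test environment.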

Modified Flow 1: Primero --> Oscar...

  1. f1-j1-getLinks: GET Primero where there is oscar_number: https://github.com/OpenFn/unicef-cambodia/blob/master/jobs/f1-j1-getPrimeroCases.js#L52-L74
  2. Returns 14k, create batches of 55 records
  3. f1-j2-updateLinks: Send each batch to update_links in Oscar (to sync back Primero IDs): https://github.com/OpenFn/unicef-cambodia/blob/master/jobs/f1-j2-casesToOscar.js#L266-L301

aleksa-krolls commented 3 years ago

@taylordowns2000 Updated guidance from Kiry on step 1 where we GET from Oscar (see my comment above). Can you please test this GET step #1 so that we can confirm we'd be set to move forward with this migration process?

Oscar API fetch updates... I updated the endpoint and I will update the doc later. GET /api/v1/organizations/clients now receives two parameters: since_date and end_date. For example: /api/v1/organizations/clients?since_date=2019-05-28 12:00:00.000000&end_date=2020-05-28 12:00:00.000000 Please use the date format like so: 2019-05-28 12:00:00.000000, so it can paginate properly. You can now give it a range between one month and one year. I recommend keeping the date range under 5 months so that the data is fast to fetch.
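
Since the timestamps contain spaces and colons, the query string likely needs URL encoding when the request is built in code. A minimal sketch, assuming a placeholder host (`https://oscar.example.org`) and a hypothetical `buildClientsUrl` helper; only the path and the two parameter names come from Kiry's message, and whether Oscar expects `%20` or another encoding for the space has not been confirmed here.

```javascript
// Timestamp format requested by Oscar: "YYYY-MM-DD HH:mm:ss.ffffff".
const OSCAR_DATE = /^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{6}$/;

// Hypothetical helper: build the clients URL for one date window.
function buildClientsUrl(baseUrl, sinceDate, endDate) {
  for (const value of [sinceDate, endDate]) {
    if (!OSCAR_DATE.test(value)) {
      throw new Error(`Unexpected date format: ${value}`);
    }
  }
  const query =
    `since_date=${encodeURIComponent(sinceDate)}` +
    `&end_date=${encodeURIComponent(endDate)}`;
  return `${baseUrl}/api/v1/organizations/clients?${query}`;
}

// The host below is a placeholder, not the real Oscar staging URL.
const url = buildClientsUrl(
  'https://oscar.example.org',
  '2019-05-28 12:00:00.000000',
  '2020-05-28 12:00:00.000000'
);
console.log(url);
// https://oscar.example.org/api/v1/organizations/clients?since_date=2019-05-28%2012%3A00%3A00.000000&end_date=2020-05-28%2012%3A00%3A00.000000
```

Validating the format up front means a malformed date fails before the request is sent, rather than surfacing as a pagination problem on the Oscar side.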

aleksa-krolls commented 3 years ago

@taylordowns2000 Kiry says the GET with the date range is working again.

aleksa-krolls commented 2 years ago

Abandoned.