Closed aleksa-krolls closed 1 month ago
@aleksa-krolls I've started spec-ing this issue, but it's not done yet. I still need a bit more information from you. We can sync on this on Monday.
FYI @mtuchi
@aleksa-krolls this is ready for your review.
EOD Update - #47

- When mapping `f01_mphss_baseline`, the source data is missing the `DHIS2 Option Code` and `Answers` columns, so these data are currently not mapped in the `metadata.json` file.
- We need a GitHub access token that has read-write permission on the `OpenFn/openfn-lime-pilot` repo. See docs here.

EOD Update: I have managed to get the file updated in GitHub using a classic token, and I have updated the first step to fetch files from openfnorg.sharepoint.com. I need to update the first job to fetch both files from SharePoint before handing this over for code review.
hey @mtuchi,

> When mapping `f01_mphss_baseline`, the source data is missing the `DHIS2 Option Code` and `Answers` columns. So these data are currently not mapped.
On this comment: for `f01_mphss_baseline` we don't need to use the `DHIS2 Option Code` and `Answers` columns; we just need to map `External ID` with `DHIS2 DE UID`. This step is outlined in the issue spec as follows:
For state.f01_mphss_baseline_xls: Create a JSON with key-value pair where the key is the value under column E with header External ID and the label is column BA labeled DHIS2 DE UID. We want to exclude all rows under DHIS2 DE UID that are NA
This is what the output of the mapping should look like:
```js
{
  CXS4qAJH2qD: 'encounter-date', // encounterDate
  dfdv3SkeXKe: 'a6c5188c-29f0-4d3d-8cf5-7852998df86f',
  hWMBCCA2yy1: 'abede172-ba87-4ebe-8054-3afadb181ea3',
  TWuCY5r2wx7: 'ccc4f06c-b76a-440d-9b7e-c48ba2c4a0ab',
  QHrIUMhjZlO: 'd516de07-979b-411c-b7e4-bd09cf7d9d91',
  H1fMCaOzr8F: '3e97c2d0-15c1-4cfd-884f-7a4721079217',
  yCwuZ0htrlH: '5f6e245c-83fc-421b-8d46-061ac773ae71'
}
```
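As a sketch, the exclusion and key/value selection described above could look like the following in plain JavaScript. The helper name `buildF01Mapping` and the row shape (one object per row, keyed by column header) are assumptions; the column names come from the spec:

```javascript
// Sketch: build { externalId: dhis2DeUid } from parsed sheet rows.
// Assumes each row is an object keyed by its column header; that row
// shape is an assumption, not something the issue confirms.
function buildF01Mapping(rows) {
  const mapping = {};
  for (const row of rows) {
    const externalId = row['External ID']; // column E
    const deUid = row['DHIS2 DE UID'];     // column BA
    // Per the spec, exclude rows whose DHIS2 DE UID is NA
    if (externalId && deUid && deUid !== 'NA') {
      mapping[externalId] = deUid;
    }
  }
  return mapping;
}
```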
@aleksa-krolls this is ready for testing. The PR was approved and I have updated the workflow on the app. cc @AishaHassen
Background, context, and business value
The current implementation of the MSF workflows (WF1 and WF2) has some of the mappings hardcoded in the job code. The source for these hardcoded mappings is the Metadata file that we get from MSF. This Metadata file is going to be updated periodically by the MSF team. To avoid having to edit the hardcoded mappings in our job code, we need a way to read the Metadata file via an API, extract the data we are interested in, format it in a usable format, and save it to a JSON file that WF1 and WF2 will reference.
The specific request, in as few words as possible
Create a workflow with three steps:
Step 1: Get the Metadata file from SharePoint

Stub the two sheets on state as `state.option_sets_xls_data` and `state.f01_mphss_baseline_xls` respectively.

Note: When we do get access to the MSF SharePoint, we will dynamically fetch the JSON objects that we are stubbing in this step. As a proof of concept we are only stubbing two JSON objects in this job, but more than two sheets will be fetched once we fetch them dynamically.
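A sketch of what the Step 1 stub might look like. This is plain JavaScript; in an OpenFn job this body would sit inside an `fn(state => ...)` block, and the empty arrays are placeholders for sample rows:

```javascript
// Sketch of the Step 1 stub: attach the two sample sheets to state.
// In an OpenFn job this function body would live inside fn(state => ...).
// The empty arrays are placeholders for stubbed sample rows.
function stubMetadataSheets(state) {
  return {
    ...state,
    option_sets_xls_data: [],   // rows from the option sets sheet
    f01_mphss_baseline_xls: [], // rows from the F01 MPHSS baseline sheet
  };
}
```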
Step 2: Map the Metadata file to option-set JSON format

Use the data fetched in Step 1 to create the following JSON objects:
- `state.option_sets_xls_data`: employ the same mapping logic as was done in WF3.
- `state.f01_mphss_baseline_xls`: create a JSON with key-value pairs where the key is the value under column E with header `External ID` and the label is column BA labeled `DHIS2 DE UID`. We want to exclude all rows under `DHIS2 DE UID` that are `NA`.
Then create one main JSON object, `state.metadata_mapping_json`, by merging the above JSON objects.

Step 3: Save the option-set JSON to GitHub

Take the `state.metadata_mapping_json` object and make a request to upsert the file in the `metadata` folder on GitHub. Save `state.metadata_mapping_json` and name it `metadata_mapping.json`.
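One way the Step 3 upsert could work is via GitHub's "create or update file contents" endpoint (`PUT /repos/{owner}/{repo}/contents/{path}`). The request-body builder below is a sketch: the function name and commit message are made up, and passing the existing file's `sha` is what turns a create into an update:

```javascript
// Sketch: build the body for GitHub's create-or-update-contents call.
// The job would PUT this body to:
//   https://api.github.com/repos/OpenFn/openfn-lime-pilot/contents/metadata/metadata_mapping.json
// with an Authorization header carrying the read-write token from the thread.
function buildUpsertBody(metadataMapping, existingSha) {
  const body = {
    message: 'chore: update metadata_mapping.json', // commit message (illustrative)
    // GitHub requires the file content to be base64-encoded
    content: Buffer.from(JSON.stringify(metadataMapping, null, 2)).toString('base64'),
  };
  // Include the current file's sha to update; omit it to create the file
  if (existingSha) body.sha = existingSha;
  return body;
}
```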
@aleksa-krolls This is the upsert implementation I'm thinking of, let me know your thoughts.
Expected data volumes
The size of the Metadata file that we are accessing on SharePoint is 2.2 MB.
Toggl
Name of the Toggl project to log work