mrc-ide / naomi

Naomi model for subnational HIV estimation
https://mrc-ide.github.io/naomi
Other
9 stars 9 forks source link

Format vmmc output #424

Closed jeffeaton closed 8 months ago

jeffeaton commented 8 months ago

This PR formats the DMMPT2 output and appends it to the PEPFAR Datapack CSV if

It builds on #418. I have made a separate PR in case you want to merge that one immediately before reviewing this, but they can be combined and merged together.

This code does not check that the DMMPT2 file area_id are consistent with the Naomi fit. What do we want to test there, and how do we want it to fail (error vs. warning)?

Options for what to check:

  1. All DMPPT2 area_id are contained in the Naomi fit area_id
  2. All Naomi fit area_id are contained in the DMPPT2 output (we don't want this option)
  3. All Naomi fit area_id at the PSNU level are contained in the DMMPT2 output file

I think we want option 3.

I'm not sure how it should fail:

One option would be to write the Naomi zip file, but omit the PEPFAR CSV if there is an issue with the DMMPT2 file. (Still could be slightly annoying, but less so than error writing the Naomi zip altogether).

vimc-robot commented 8 months ago

Thanks. Corresponding hintr PR at https://github.com/mrc-ide/hintr/pull/489

jeffeaton commented 8 months ago

This code does not check that the DMMPT2 file area_id are consistent with the Naomi fit. What do we want to test there, and how do we want it to fail (error vs. warning)?

Related to this in hintr I am validating that the regions in the uploaded VMMC file. If any region appears in the VMMC file which is not in the shape file, then the VMMC file will raise a validation error.

Great — I think this is reasonable. And then user has the option to "remove" the VMMC file and proceed with fitting (and VMMC file will be passed as NULL)?

I'm not sure how it should fail:

Writing the zip file but omitting the PEPFAR CSV sounds good to me. We could raise a warning too so that it is clear to users why it is missing.

You could still create the datapack CSV but without the DMMPT2 indicators, thought that might be more confusing about why they are missing.

Let's go with the "create data pack CSV without DMPPT2 indicators + warning" option. This shouldn't really happen if the VMMC area_ids are being validated on upload.

If we split the datapack and output zip into a separate downloads down the line it could avoid this pain. We could have a proper error if datapack fails to generate but the output zip would generate fine.

Yes this is the ideal eventual resolution.

jeffeaton commented 8 months ago

I'm not sure how it should fail:

Writing the zip file but omitting the PEPFAR CSV sounds good to me. We could raise a warning too so that it is clear to users why it is missing. You could still create the datapack CSV but without the DMMPT2 indicators, thought that might be more confusing about why they are missing.

Let's go with the "create data pack CSV without DMPPT2 indicators + warning" option. This shouldn't really happen if the VMMC area_ids are being validated on upload.

Actually, on further thought — there are a couple of countries (Kenya, Ethiopia) where the DMPPT2 results are only for some districts, not the whole country. This check might create an issue for those (not sure how the DMPPT2 output file handles these).

For now, let's just go with your upload check (DMMPT2 area_id are all contained in Naomi file) and leave it at that.

jeffeaton commented 8 months ago

Thanks 2efe848 should fix the broken test. This is ready for re-review.