malariagen / malariagen-data-python

Analyse MalariaGEN data from Python
https://malariagen.github.io/malariagen-data-python/latest/
MIT License
13 stars 23 forks source link

Access cohort group data #436

Closed ahernank closed 6 months ago

ahernank commented 10 months ago

Here we create a new class to host the cohort group metadata functions. The reasoning is to keep this apart from the usual sample_metadata (and the cohorts-related functions there) as those are all/can be accessed on a sample level; whereas these new functions are only accessible on a cohort level.

I have also updated the default cohort analysis versions (and test files, params affected) as I think it is a good idea to keep a single source of cohort_analysis as of now, rather than allow the user to change the analysis on these functions, which could cause confusion with the rest of the functions.

Here, we only tackle the csv metadata files. I attempted to also include the geojson files but this is causing some issues related to large data files and the geoJSON parsers required to access the file properly; so might be best to re-think/tackle those in another PR.

codecov[bot] commented 10 months ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

:exclamation: No coverage uploaded for pull request base (master@0978276). Click here to learn what that means.

Additional details and impacted files ```diff @@ Coverage Diff @@ ## master #436 +/- ## ========================================= Coverage ? 98.82% ========================================= Files ? 35 Lines ? 3504 Branches ? 0 ========================================= Hits ? 3463 Misses ? 41 Partials ? 0 ```

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

alimanfoo commented 10 months ago

Hi @ahernank, codecov is complaining because one line of the code you've added in this PR isn't covered by any tests. Wouldn't hurt to test explicitly providing a cohorts analysis version. Probably easiest way would be to modify either the ag3_sim_api or af1_sim_api function to include a cohorts_analysis parameter.

ahernank commented 6 months ago

To solve the issues related to cohorts with quarters not having cohort data and ensure coverage is correct, I’ve added cohort data tests just using the Ag3 sim fixture. I’ve used cohorts without quarters for the Af1 simulator. This way we can avoid a new simulator.

alimanfoo commented 6 months ago

Also remembering would be good to add the new cohorts() method to the API docs.