populationgenomics / metamist

Sample level metadata system
MIT License
1 stars 1 forks source link

CohortCriteria is bananas! #746

Closed vivbak closed 4 months ago

vivbak commented 4 months ago

The CohortCriteria model, and indeed other models in metamist, seemingly allow you to pass whatever params you like to it.

This seems problematic -- one for discussion w/ the team?

MattWellie commented 4 months ago

Fav issue title to date 🍌

jmarshall commented 4 months ago

Here's the example that led to this:

-- a/scripts/create_custom_cohort.py
++ b/scripts/create_custom_cohort.py
@@ -23,12 +23,13 @@ def main
     capi = CohortApi()
     cohort_criteria = CohortCriteria(
         projects=projects or [],
         sg_ids_internal=sg_ids_internal or [],
         excluded_sgs_internal=excluded_sg_ids or [],
         sg_technology=sg_technologies or [],
         sg_platform=sg_platforms or [],
         sg_type=sg_types or [],
         sample_type=sample_types or [],
+        fruit=['banana'],
     )

Here the CohortCriteria object acquired a fruit field even though no such field appears in its definition.

And previously we accidentally had sample_types=sample_types or [], whereby the CohortCriteria silently accepted the misnamed field and its intended sample_type field was always left unset/empty.

jmarshall commented 4 months ago

Turns out this is a configuration setting in pydantic for which the default is to ignore unknown fields — which is a bit surprising as we seemed to be getting the allow behaviour, as the fruit was preserved.

PR #752 sets forbid mode and works through the consequences.

jmarshall commented 4 months ago

That PR is merged now, so our model classes will now insist on parameters being real field names rather than typos.

(…or most of them will — see the late-breaking comment on the PR…)