tskit-dev / tsinfer

Infer a tree sequence from genetic variation data.
GNU General Public License v3.0
56 stars 13 forks source link

Allow the same scheme name, if description matches #930

Closed hyanwong closed 3 months ago

hyanwong commented 3 months ago

Fixes #929

codecov[bot] commented 3 months ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 87.04%. Comparing base (1d45c0c) to head (0273cf8).

Additional details and impacted files ```diff @@ Coverage Diff @@ ## main #930 +/- ## ======================================= Coverage 87.04% 87.04% ======================================= Files 5 5 Lines 1767 1767 Branches 310 310 ======================================= Hits 1538 1538 Misses 140 140 Partials 89 89 ``` | [Flag](https://app.codecov.io/gh/tskit-dev/tsinfer/pull/930/flags?src=pr&el=flags&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=tskit-dev) | Coverage Δ | | |---|---|---| | [C](https://app.codecov.io/gh/tskit-dev/tsinfer/pull/930/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=tskit-dev) | `87.04% <ø> (ø)` | | Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=tskit-dev#carryforward-flags-in-the-pull-request-comment) to find out more.

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

jeromekelleher commented 3 months ago

I would rather we just skipped this and used the permissive json schema instead. Keeping schemas up to date is more trouble than it's worth.

hyanwong commented 3 months ago

Happy to use permissive_json, but don't we still want to warn if we are stomping on existing data. Having a defined schema is a nice way to do that.

hyanwong commented 3 months ago

Also, it's useful in tsdate because we are using short, non-intuitive names ("mn" and "vr") for the posterior means and variances, so documenting them is pretty useful IMO.

jeromekelleher commented 3 months ago

Sure, whatever is the shortest path here. Note that the schema is only one thing - if you want to truly check that you're not clobbering then you'd have to check the actual metadata items on the way in, because the existing schema may not declare all the fields.

hyanwong commented 3 months ago

Thanks. I think we should allow clobbering, but only if it's clear that the field was originally put in by tsinfer anyway. Checking the description is a reasonable way to do that (but actually, we should check the whole definition field, so I'll adjust this PR.). I think we warn when clobbering if the definition matches, but error out if the definition does not match.

hyanwong commented 3 months ago

I'm happy with throwing a warning in this case.

Great. Just made a minor change. Ready to review and hopefully merge @benjeffery ?