smart-on-fhir / chart-review

Measure agreement between chart reviewers.
https://docs.smarthealthit.org/cumulus/chart-review/
Apache License 2.0
3 stars 1 forks source link

Making `labelstudio-export.json` filename configurable #36

Open Dtphelan1 opened 3 months ago

Dtphelan1 commented 3 months ago

Motivation

In the llm-symptom-study experiments I'm running (and I imagine many experimental setups) I have two sets of relevant labels for our experiment – those used in tuning, and those used in testing our final models. My solution for easily switching between those two chart-review evaluation steps is to have two export files – labelstudio-export.json.tuning and labelstudio-export.json.test – and manually copy over the relevant file into labelstudio-export.json before running chart-review in the project directory.

It's not the end of the world, but it would be nice to be able to specify – either as a runtime command or as an option in the config – the name of the relevent labelstudio-export file, defaulting to labelstudio-export.json. Happy to discuss more, but wanted to write this struggle down as I was having it 😄

https://github.com/smart-on-fhir/chart-review/blob/41fcc7d497e5e101b3f7e7c1f0f05d7fde1bd9c2/chart_review/cohort.py#L28

mikix commented 3 months ago

Interesting, and reasonable.

Currently, we have two different "relocation" features:

So thinking out loud, how would a dynamically named export file best fit in...

You say you have two different sets of labels, but just for clarity when you say labels there, you are talking about two different set of "applied labels" (annotations / reviewed charts) yeah? And you're still using the same label setup for both.

An option in the config for this would slot into the existing --config workflow. But would that be annoying? How many of these exports do you have, and is that the only difference between your runs? (i.e. would you have the exact same config file, just with one line different?) -- That might be fine, just trying to get a sense of scope.

Dtphelan1 commented 3 months ago

You say you have two different sets of labels, but just for clarity when you say labels there, you are talking about two different set of "applied labels" (annotations / reviewed charts) yeah? And you're still using the same label setup for both.

Correct. Being more specific

Right now I provide the --config option to specify which of the two config files I'm using, which is why my intuition was to have some similar CLI option for specifying what the labelstudio-export.json is named. Constraining myself to what's available today, maybe the actual guidance might be to have two subdirectories wherever I'm running this calculation, one for each dataset/labelstudio-export/config, and specify the directory as my runtime CLI option instead?

I can work with that on my end, but from a UX perspective I'm clarifying that when I saw "oh I can specify a directory altogether if I'm running this command globally or something, and oh I can specify the config file as needed", that I thought "surely I can do the same for the labelstudio-export.json file?" Also just thinking out-loud.

mikix commented 3 months ago

Hmm. I'm sympathetic to the idea that "magic" filename locations is not great - so labelstudio-export.json should probably be configurable somewhere.

I might prefer a config file field (like export-file:) for that then? To keep the number of ways your whole config can pivot down to just the --config flag. Especially since in your case, they need different configs anyway. The stuff in the config is closely tied to the contents of the export file, so a config pointing at its own export file makes some sense to my brain.

BUT... for your case, it does sound like there's little enough overlap between everything and you really are more of a "two different --project-dir flags" kind of setup, with whole separate folders.

Dtphelan1 commented 3 months ago

BUT... for your case, it does sound like there's little enough overlap between everything and you really are more of a "two different --project-dir flags" kind of setup, with whole separate folders.

I agree 😄

I might prefer a config file field (like export-file:) for that then. To keep the number of ways your whole config can pivot down to just the --config flag.

I think that sounds reasonable! It is a config-file after all, it makes sense that more granular configuration would live there. I'm not too opinionated on naming, so whatever makes the most sense to you!

Thanks Mike 👍