jhu-bids / TermHub

Web app and CLI tools for working with biomedical terminologies. https://github.com/orgs/jhu-bids/projects/9/views/7
https://bit.ly/termhub
GNU General Public License v3.0

ValueSet-Tools: When uploading to cset name that already exists, add new version ONLY IF DIFFERENT #49

Open Sigfried opened 2 years ago

Sigfried commented 2 years ago

I'm not sure what the current behavior is, but we need our software to do something reasonable if we try to add the same cset twice, if we add a different cset with the same name, or if we add a genuinely new version of an existing cset.

Should probably:

joeflack4 commented 2 years ago

Very good idea.

Sigfried commented 2 years ago

@stephanieshong, you asked me to wait on this work until "Push enclave_wrangler code so Siggie can work" is done. Is there anything I can do to help?

stephanieshong commented 2 years ago

I do not think you need to do this elaborate check prior to uploading, since there is already a UI tool within the enclave to do this analysis. And it would be very difficult to decide, without a clinician looking at the expressionItem, which descendants should be included or not. This is usually a clinician's review step, done with clinical knowledge.

The final metadata check is done from the UI; currently, a user cannot make a draft version a non-draft version.

joeflack4 commented 2 years ago

From my perspective, having a set of quality control checks we can run prior to uploading is not a bad idea in theory. As I understood what Siggie meant, it would be easy to check whether a new version we're uploading is exactly the same as an old version. That wouldn't need a clinician; the script would just compare the codes, or any other fields we think are critical, and make sure they're not exactly the same.

Sigfried commented 2 years ago

Yes. The point of this is just to make it so you can use this tool to add concept sets and not have to worry if you already added it; the software will figure it out for you. If you did already add it, it will tell you that and you're done. If not, you're probably trying to add a new version on purpose -- but, perhaps you didn't even know that a concept set with that name existed, so you can check and make sure you're not clobbering something you shouldn't... etc. So, as Joe said, this is all prior to looking at the clinical terminology issues.
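As a rough illustration of the three cases described above (already added, new version under an existing name, brand-new cset), here is a minimal sketch. Everything in it is an assumption for discussion: the function name `decide_upload_action`, the idea of comparing only concept codes, and the `existing` dictionary standing in for whatever the enclave actually returns.

```python
from typing import Dict, FrozenSet, Iterable, Optional


def decide_upload_action(
    name: str,
    new_codes: Iterable[int],
    existing: Dict[str, FrozenSet[int]],
) -> str:
    """Return 'create', 'skip', or 'new_version' for an upload request.

    `existing` is a hypothetical in-memory stand-in that maps a cset name
    to the concept codes of its latest known version.
    """
    latest: Optional[FrozenSet[int]] = existing.get(name)
    if latest is None:
        # No cset with this name yet: a plain first upload.
        return "create"
    if frozenset(new_codes) == latest:
        # Same name, same codes (order and duplicates ignored): already added.
        return "skip"
    # Same name, different codes: the user should confirm a new version.
    return "new_version"


existing_csets = {"diabetes": frozenset({201826, 443238})}
print(decide_upload_action("diabetes", [443238, 201826], existing_csets))
print(decide_upload_action("diabetes", [201826], existing_csets))
print(decide_upload_action("asthma", [317009], existing_csets))
```

In the `new_version` case the tool would presumably prompt before uploading, so that nobody clobbers a cset they did not know existed.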

stephanieshong commented 2 years ago

The following conditions apply only to the enclave versions. Certain use cases will be important to consider:

  1. If the archive flag is set, then the cs is not displayed in the UI; however, it still exists in the cs dataset in the enclave. So we would need to check for this flag, not just against the cs container name (the key field in the Enclave).
  2. If a draft version already exists, should we validate against the draft version or against the non-draft version? I am not sure whether it is possible to check against a draft version. Currently, you can add additional draft versions to the concept set.
  3. I think the most useful test case would be to validate against a draft version that has already been uploaded, so that we can just change the draft version to non-draft without adding additional concept sets.
  4. Currently, UI confirmation is required before a draft version can become a non-draft version.
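
One way the points above might fit together is sketched below. This is not the enclave's API: the `CsetVersion` record and its fields (`container_name`, `version_id`, `codes`, `is_draft`, `archived`) are hypothetical names invented for illustration. The sketch looks for an existing version with identical codes, includes archived csets in the lookup because they still exist in the dataset, and prefers a non-archived draft so that it could be promoted through the UI instead of uploading again.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Optional


@dataclass(frozen=True)
class CsetVersion:
    # Hypothetical record; the real enclave fields may differ.
    container_name: str
    version_id: int
    codes: FrozenSet[int]
    is_draft: bool
    archived: bool = False


def find_identical_version(
    container_name: str,
    new_codes: Iterable[int],
    versions: List[CsetVersion],
) -> Optional[CsetVersion]:
    """Return an existing version with exactly the same codes, or None.

    Archived versions are searched too, since they are hidden in the UI
    but still present in the dataset; callers should inspect `.archived`.
    """
    wanted = frozenset(new_codes)
    matches = [
        v for v in versions
        if v.container_name == container_name and v.codes == wanted
    ]
    # Prefer non-archived, then drafts, then the lowest version id.
    matches.sort(key=lambda v: (v.archived, not v.is_draft, v.version_id))
    return matches[0] if matches else None


known = [
    CsetVersion("diabetes", 1, frozenset({1, 2}), is_draft=False),
    CsetVersion("diabetes", 2, frozenset({1, 2, 3}), is_draft=True),
    CsetVersion("old-cs", 3, frozenset({9}), is_draft=False, archived=True),
]
match = find_identical_version("diabetes", [3, 2, 1], known)
if match is not None and match.is_draft:
    print(f"Identical draft {match.version_id} exists; promote it via the UI.")
```

Because UI confirmation is still required to turn a draft into a non-draft version, the sketch only reports the match and leaves the promotion step to a person.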