OHDSI / Atlas

ATLAS is an open source software tool for researchers to conduct scientific analyses on standardized observational data
http://atlas-demo.ohdsi.org/
Apache License 2.0

UI cohort logic requires optimization due to performance issue on big number of concept sets #1018

Closed vantonov1 closed 6 years ago

vantonov1 commented 6 years ago

The cohort definition page is slow when a large number of concept sets is included (see the attachment for an example).

JSON.zip

pavgra commented 6 years ago

The issue still persists: for example, text typed into the description field of Additional Qualifying Inclusion Criteria appears there with a huge delay.

chrisknoll commented 6 years ago

Ok, we can reopen this.

I am wondering where we draw the line between 'this is a valid use case' and 'this is an invalid use case'. This JSON object is 12MB. That's a fairly large file, and because the dirty check serializes the cohort def and compares it to an 'initial state' document, it takes a lot of time to run.
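To make the cost concrete, here is a minimal sketch of a whole-document dirty check of the kind described above. It is not the ATLAS implementation: the `CohortDefinition` shape, the `SerializingDirtyCheck` class, and the sample data are hypothetical names chosen for illustration.

```typescript
// Hypothetical, heavily simplified document shape (an assumption for this sketch).
interface CohortDefinition {
  name: string;
  description: string;
  conceptSets: { id: number; name: string; conceptIds: number[] }[];
}

// Hypothetical helper: compares a serialized snapshot against the current state.
class SerializingDirtyCheck<T> {
  private initialState: string;
  private readonly getCurrent: () => T;

  constructor(getCurrent: () => T) {
    this.getCurrent = getCurrent;
    this.initialState = JSON.stringify(getCurrent());
  }

  // Serializes the entire document on every call, so the cost grows with
  // the size of the document rather than with the size of the edit.
  isDirty(): boolean {
    return JSON.stringify(this.getCurrent()) !== this.initialState;
  }

  // Records the current state as the new baseline, e.g. after a save.
  reset(): void {
    this.initialState = JSON.stringify(this.getCurrent());
  }
}

const definition: CohortDefinition = {
  name: "Example",
  description: "",
  conceptSets: [
    {
      id: 0,
      name: "Large set",
      conceptIds: Array.from({ length: 19000 }, (_, i) => i),
    },
  ],
};

const check = new SerializingDirtyCheck(() => definition);
definition.description = "a"; // a single keystroke in the description field
const dirtyAfterEdit = check.isDirty(); // re-serializes all 19,000 ids
```

If a check like this runs on every keystroke, each character typed into the description pays for serializing the whole concept set again.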

While I think I can try an optimization where we break up the different parts of a cohort def into separate 'dirty flag' trackers, at what point do we say the cohort definition is not appropriate? One of the concept sets (HEDIS 2018 Trauma (2.16.840.1.113883.3.464.1004.1254)) has over nineteen thousand concepts in it. Shouldn't this be using the 'maps to' function of concept sets to pull in all those source concepts that are mapped to some standard concepts? I imagine we could cover 90% of those source concepts by just a handful of standard concepts + the maps-to function of concept sets. Should we limit the max concepts in a concept set to 1000? or 2000? The other concept sets don't seem to suffer the performance problem, it's only this 19,000 one that causes the issue in the UI. And I question if each of those 19,000 codes was vetted that they actually belong in the definition.

In any case, I'll try an approach to partition out the dirty checks so that the entire document doesn't need to be dirty checked when it's a minor change. But we all should think about when we are working outside the reasonable bounds of what we should be able to support (should we be able to download the entire vocabulary table into a concept set and use it as the 'any' concept set? I'm kidding, of course).
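One way the partitioning idea could look is sketched below. This is an assumption about the general technique, not the change that was actually made in ATLAS; `SectionTracker`, `PartitionedDirtyCheck`, and the section names are hypothetical.

```typescript
// Hypothetical helper: tracks one section of a document with its own baseline.
class SectionTracker {
  private readonly baseline: string;
  private readonly read: () => unknown;

  constructor(read: () => unknown) {
    this.read = read;
    this.baseline = JSON.stringify(read());
  }

  // Serializes only this section, not the whole document.
  isDirty(): boolean {
    return JSON.stringify(this.read()) !== this.baseline;
  }
}

// Hypothetical helper: one dirty flag per registered section.
class PartitionedDirtyCheck {
  private readonly trackers = new Map<string, SectionTracker>();
  private readonly dirtySections = new Set<string>();

  register(section: string, read: () => unknown): void {
    this.trackers.set(section, new SectionTracker(read));
  }

  // Call when an edit touches a section; only that section is re-checked.
  notifyChanged(section: string): void {
    const tracker = this.trackers.get(section);
    if (tracker === undefined) {
      throw new Error(`Unknown section: ${section}`);
    }
    if (tracker.isDirty()) {
      this.dirtySections.add(section);
    } else {
      this.dirtySections.delete(section);
    }
  }

  // The document is dirty if any section differs from its baseline.
  isDirty(): boolean {
    return this.dirtySections.size > 0;
  }
}

const doc = {
  description: "",
  conceptSets: [
    { id: 0, conceptIds: Array.from({ length: 19000 }, (_, i) => i) },
  ],
};

const partitioned = new PartitionedDirtyCheck();
partitioned.register("description", () => doc.description);
partitioned.register("conceptSets", () => doc.conceptSets);

doc.description = "a";
partitioned.notifyChanged("description"); // serializes one short string only
const dirtyAfterTyping = partitioned.isDirty();

doc.description = "";
partitioned.notifyChanged("description"); // back to the baseline value
const dirtyAfterUndo = partitioned.isDirty();
```

The trade-off in this sketch is that every edit path has to report which section it touched; a change that is never reported leaves its flag stale.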

-Chris

pavgra commented 6 years ago

@chrisknoll, for us this is a valid use case. I've re-opened the task; I'll deal with it and ask for help if needed. Thanks.

pavgra commented 6 years ago

Re-opening due to issues with the new flag approach: image