Closed joeflack4 closed 2 weeks ago
The concept sets with 0 patient counts and 0 record counts were newly created around 3:45 today, so they are less than a day old, if that helps in debugging.
@Sigfried @joeflack4 - I found the issue with the 0 patient counts. The term usage count is generated in the Enclave and displayed in TermHub, so if the term usage count is not updated in the Enclave, TermHub will also display 0 counts for both patients and records. In the Enclave, if the research project associated with the concept set points to the tenant project, the term usage count is not updated at all, and neither is the concept set overlap. Hence the 0 counts in TermHub.
Once this issue is resolved in the Enclave, TermHub will be able to display the patient and record counts correctly. This brings up a bigger issue: the Enclave now supports two different data sources, one for COVID and another for tenant. We will have to decide how the term usage count should reflect the data repository for the two different cohorts on the back end.
Notes that may be helpful: in the Enclave we now have two data repositories (N3C COVID and N3C Clinical (tenant)), so the Concept Set TermUsage tab needs to know which data repository to reference before it can display the patient and record counts. If the user has access to only one, we can default to that one based on the user's permission settings.
1. An operational person like me, who has access to both data sources, will now have to choose which data source to reference before the concept set browser can generate the correct patient/record counts.
The main issue is that the patient count and the record count are not being updated. This is because the termUsage counts in the Concept Set browser in the Enclave are currently Research Project (RP) dependent: if the RP is set to tenant, the usage count returns 0.
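The selection rule described in the notes above could look roughly like this. It is only a sketch to make the proposal concrete, not how the Enclave implements anything: the function name `resolve_repository`, the constants, and the choice of exceptions are all hypothetical.

```python
from typing import Iterable, Optional

# Hypothetical labels for the two repositories named in the comment.
COVID = "N3C COVID"
CLINICAL = "N3C Clinical (tenant)"


def resolve_repository(accessible: Iterable[str], chosen: Optional[str] = None) -> str:
    """Pick the data repository that term usage counts should be read from.

    accessible: repositories the user is permitted to see (assumed input).
    chosen: the repository the user explicitly selected, if any.
    """
    allowed = set(accessible)
    if not allowed:
        raise PermissionError("user has no access to any data repository")
    if chosen is not None:
        # An explicit choice always wins, as long as the user may see it.
        if chosen not in allowed:
            raise PermissionError(f"user has no access to {chosen!r}")
        return chosen
    if len(allowed) == 1:
        # Only one repository is visible, so default to it.
        return next(iter(allowed))
    # Users with access to both must say which one they mean.
    raise ValueError("several repositories are accessible; an explicit choice is required")
```

With this rule, a user who can only see one repository never has to choose, while a user with access to both gets an error (or, in a UI, a prompt) until they pick one.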
@Sigfried Do you think this has been fixed?
@joeflack4, I don't quite understand it. Is it replicable?
@Sigfried I think it is fixed because the GitHub action for the counts is working.
Overview
Stephanie was trying to use this on TermHub dev (#489): https://icy-ground-0416a040f.2.azurestaticapps.net/OMOPConceptSets?codeset_ids=417730759&codeset_ids=423850600&codeset_ids=966671711&codeset_ids=577774492 but experienced some issues.
Screenshots & comments
I did get 0 for patient counts for some csets, but maybe this is true for those?: And this screenshot is from Stephanie: ![0 patients steph](https://github.com/jhu-bids/TermHub/assets/13045020/d31746bd-acf2-46c2-a3a8-590052690fbb) One of those csets has 0 members, so I can understand why the counts would be 0. But the other ones with 0 counts have many members, so is it not surprising that the counts would be 0? Edit: I checked the database and there are no records for those instances where the counts are 0.
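For anyone repeating the database check mentioned in the edit above, here is a rough sketch of separating "no row at all" from "a row whose count is 0". It is not TermHub's actual code: an in-memory SQLite database stands in for the real one, and the column names `codeset_id`, `patient_count` and `record_count` are assumptions about `concept_set_counts_clamped`, not its confirmed schema.

```python
import sqlite3
from typing import List, Sequence


def codesets_missing_counts(conn: sqlite3.Connection, codeset_ids: Sequence[int]) -> List[int]:
    """Return the codeset_ids that have no row at all in the counts table."""
    if not codeset_ids:
        return []
    placeholders = ",".join("?" for _ in codeset_ids)
    rows = conn.execute(
        "SELECT DISTINCT codeset_id FROM concept_set_counts_clamped "
        f"WHERE codeset_id IN ({placeholders})",
        list(codeset_ids),
    ).fetchall()
    present = {row[0] for row in rows}
    # Preserve the caller's order so the result is easy to compare by eye.
    return [cid for cid in codeset_ids if cid not in present]


# Stand-in database with made-up values, for illustration only.
demo = sqlite3.connect(":memory:")
demo.execute(
    "CREATE TABLE concept_set_counts_clamped "
    "(codeset_id INTEGER, patient_count INTEGER, record_count INTEGER)"
)
demo.execute("INSERT INTO concept_set_counts_clamped VALUES (1, 20, 100)")
demo.execute("INSERT INTO concept_set_counts_clamped VALUES (2, 0, 0)")
missing = codesets_missing_counts(demo, [1, 2, 3])
```

In this made-up data, codeset 2 has a genuine zero while codeset 3 has no row. Only the second case would point to the counts table never having been refreshed.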
Solutions
1. Periodic fetches of `concept_set_counts_clamped` (need to fix GH action)
Possible solution details
1. Periodic fetches of `concept_set_counts_clamped` (need to fix GH action)
Currently blocked by GH action failing.
It may be a disk space issue. If so, some ideas:
a. delete the datasets/ files after doing prepped_files, leaving only prepped
b. download and upload, and delete files 1 at a time
c. counts/vocab separate actions
d. download/upload big tables in chunks (e.g. 50% of the parquet)
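Idea (b) could be sketched roughly as below. This only illustrates the pattern and is not the existing refresh code: `transfer_one_at_a_time` is a hypothetical name, and `download` and `upload` are caller-supplied placeholders for whatever actually fetches a dataset and loads it into the database.

```python
import os
from typing import Callable, Iterable, List


def transfer_one_at_a_time(
    names: Iterable[str],
    download: Callable[[str, str], None],  # (dataset name, local path) -> writes the file
    upload: Callable[[str], None],         # (local path) -> loads the file into the DB
    workdir: str,
) -> List[str]:
    """Download, upload, then delete each file before moving on to the next.

    Peak disk usage is roughly the size of the largest single file,
    instead of the combined size of all files.
    """
    done = []
    for name in names:
        path = os.path.join(workdir, name)
        download(name, path)
        try:
            upload(path)
        finally:
            # Free the disk space even if the upload failed.
            if os.path.exists(path):
                os.remove(path)
        done.append(name)
    return done
```

The trade-off is that a failed upload means downloading that file again, since nothing is kept on disk between steps.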
Original comments (completed)
@Sigfried Correct me if my understanding of how we get the counts in the UI is wrong. I have a bunch of questions about this.
i. Is the primary source of this information the `concept_set_counts_clamped` table? If so, isn't that something major that we overlooked? The DB refresh is not set up to fetch that.
ii. Should we be fetching this table's dataset once a day or more often?
iii. While we're at it, are there any other non-vocab datasets that we also cannot get through the Objects API?
iv. Or am I wrong, and is this table not the primary way that we are getting counts?
v. If it's not the primary way, doesn't it still need to be updated?
vi. Have you been updating this table periodically?
vii. If so, how?