openpharma / clinsight

ClinSight - An application for medical monitoring within clinical trials
https://openpharma.github.io/clinsight/
Other
3 stars 0 forks source link

DB Creation and Updating #114

Open jthompson-arcus opened 1 month ago

jthompson-arcus commented 1 month ago

https://github.com/openpharma/clinsight/blob/f32e09d92b0cc7893556ddedf875d4cd3fd3e249/R/run_app.R#L66-L74

Why is the database creation and updating not part of the preprocessing? It seems like since we already require the creation of the study data and meta data that it makes sense to update the database when study data is created, not every time the application is run.

LDSamson commented 1 month ago

You can always update the database manually of course, which is probably smart to do. Do you propose to remove these checks and if so, why? Don't you think these checks add a layer of robustness to the app by verifying that the database is in synch with the study data when starting the application?

jthompson-arcus commented 1 month ago

I do think we should have the check in place. And db_update() does return early with the simple sync time check.

I think my question around it is really two-fold:

  1. As a process flow, it makes more sense to be running this in the pre-processing step as opposed to the run-time step. But maybe that is just a documentation problem because this can function mostly as a check.
  2. The update only runs when app_prod = TRUE and does so silently. This configuration has to be intentionally added because the default is FALSE and is easy to forget.

To a lesser extent I'm worried about initialization time, but this process here isn't really the culprit there.

jthompson-arcus commented 1 month ago

@LDSamson another thought is that the study_data is only used here and in get_appdata() in app_server.R. Relooking at our process can help with load times in the application. Right now the app_data object gets created for each user at runtime. As far as I can tell, we should get performance gains if the application relies on app_data instead of study_data.

LDSamson commented 3 weeks ago

Might be better indeed to rely on app_data instead. historically, the app relied on multiple objects being available (app_data, app_tables, app_vars, metadata), which was too complicated and led to the current situation. I don't think performance gain will be big, but it can help to only rely on app_data.

We can also simplify the in-memory metadata, since I think we are only using metadata$items_expanded and some metadata-derived app_vars in the application.

We should aim for better performance since the app is slow to start up and identify the biggest bottlenecks. Probably biggest performance gains can be achieved by improving the plotly functions, I think designing the figures by using native plotly function instead of using ggplot2 and then converting the figures with plotly::ggplotly().