Biogen-Inc / tidyCDISC

Demo the app here: https://bit.ly/tidyCDISC_app
https://biogen-inc.github.io/tidyCDISC/
GNU Affero General Public License v3.0

"Recipes" should be metadata-driven #152

Open borgmaan opened 1 year ago

borgmaan commented 1 year ago

The logic used to define the relationship between a selection of Standard Analysis Table and a set of associated "stat blocks" (e.g. blockData params) is currently hard-coded in JS. I think it would be wise to extract this information into a metadata structure that can be maintained external to the application code. This could start as a simple JSON file in inst/app/www/ but could be swapped for something more "formal" down the line (e.g. a DB).

Pulling this logic out of the code would offer many immediate and future benefits.

What are others' thoughts on this?

AARON-CLARK commented 1 year ago

I’m away from my desk for a couple days, but before I forget, I wanted to comment with two thumbs up from me 👍🏽 👍🏽

Jeff-Thompson12 commented 1 year ago

@borgmaan Are you suggesting to make the blockData object itself a JSON object or to build the "recipes" themselves from a JSON?

borgmaan commented 1 year ago

> @borgmaan Are you suggesting to make the blockData object itself a JSON object or to build the "recipes" themselves from a JSON?

@Jeff-Thompson12 -- just the recipes as a way to store sets of blockData that come together to make a specific output.

Grabbing an example from your recent PR on the new blockData generation mechanism, the goal would be to encode something like:

bd <- createBlockdata(datalist)
bd <- bd |>
  addBlock("AGEGR1", "FREQ") |>
  addBlock("AGE", "MEAN", "NONE") |>
  addBlock("SEX", "FREQ") |>
  addBlock("ETHNIC", "FREQ") |>
  addBlock("RACE", "FREQ") |>
  addBlock("HEIGHTBL", "MEAN", "NONE") |>
  addBlock("WEIGHTBL", "MEAN", "NONE")
bd

As a metadata structure that can be stored external to the application logic (the // comments are illustrative only; strict JSON does not allow them), e.g.:

{
  // ... other application stuff from the future ...
  "table_recipes": {
    "table_1": {
      "title": "A great standard table",
      "preprocessing": "data_pre_process_fn",
      // ... other table stuff ...
      "analysis_blocks": [
        {"data": "ADSL", "variable": "AGEGR1", "statistic": "FREQ"},
        {"data": "ADSL", "variable": "AGE", "statistic": "MEAN", "args": ["NONE"]},
        {"data": "ADSL", "variable": "SEX", "statistic": "FREQ"},
        {"data": "ADSL", "variable": "ETHNIC", "statistic": "FREQ"},
        {"data": "ADSL", "variable": "RACE", "statistic": "FREQ"},
        {"data": "ADSL", "variable": "HEIGHTBL", "statistic": "MEAN", "args": ["NONE"]},
        {"data": "ADSL", "variable": "WEIGHTBL", "statistic": "MEAN", "args": ["NONE"]}
      ]
    }
  }
}

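A minimal sketch of how such a file could drive table construction, assuming the createBlockdata()/addBlock() API from the PR mentioned above and the jsonlite package (the file path and the loop are illustrative, not existing tidyCDISC code):

```r
# Sketch only: createBlockdata()/addBlock() are assumed from the PR above;
# recipes.json follows the structure sketched in this comment.
library(jsonlite)

recipes <- read_json("inst/app/www/recipes.json")
blocks  <- recipes$table_recipes$table_1$analysis_blocks

bd <- createBlockdata(datalist)
for (b in blocks) {
  # Pass optional args (e.g. "NONE") through when present
  bd <- do.call(addBlock, c(list(bd, b$variable, b$statistic), b$args))
}
bd
```

The loop is the whole point of the proposal: adding a new standard table then means adding an entry to the JSON, with no change to application code.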
AARON-CLARK commented 1 year ago
> • Adding a new table (Create more standard tables #142) simply involves updating the metadata.
> • Personalization becomes an option, as users can save their custom table creations as a new Standard Analysis Table (you are just capturing and restoring metadata).

So obviously step 1 is getting this to work with a POC JSON file. There will be a lot of info to log in the JSON file (even IDEAFilter filters that are applied), especially with our potential row_layout_options. So, do we work on #37 first or in parallel? Then there are a lot of factors introduced when it comes to giving users or developers the ability to edit/update the standard analysis metadata.

For example, we'd need add & delete buttons so the user can edit the JSON file after building their custom analyses. The delete button would have to be conditional on whether their table specs exactly match an entry in the JSON file; otherwise, the button should disappear. However, instead of evaluating that after every change, delete could always be visible but provide error messaging: "Table with those specifications doesn't exist in the metadata".

If users have the ability to add/delete, perhaps we keep the tidyCDISC standard analysis default TLGs in a separate JSON file in case the user ever wanted to perform a "factory reset"? Perhaps the developer deploying the app for their org needs an argument in run_app() that points to the JSON file(s) location?
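One way the "factory reset" fallback could work, as a sketch (the argument and file names are assumptions, not existing tidyCDISC API; system.file() is the standard way to locate files shipped inside a package):

```r
# Hypothetical helper: prefer a deployer-supplied recipe file, otherwise
# fall back to the default recipes shipped with the package.
get_recipe_path <- function(custom = NULL) {
  if (!is.null(custom) && file.exists(custom)) {
    custom
  } else {
    system.file("app/www/default_recipes.json", package = "tidyCDISC")
  }
}
```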

Another question: do we want to allow users to upload an external metadata (JSON) file that can either replace or append to the existing metadata? We'd have to build a bunch of validation checks into the upload widget to make sure everything is in the right place & prevent things from breaking. If we went that route, we'd need upload and export buttons. I think this feature would be useful, especially when a user is just exploring the data on their own terms and doesn't want to impact the recipes for everyone on the study team using the app.
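A sketch of the kind of validation the upload widget might run before accepting an external file, assuming the recipe structure proposed earlier in the thread (the required field names are assumptions):

```r
# Returns NULL if the file looks valid, otherwise a human-readable message
# the widget can surface to the user. Uses jsonlite for parsing.
validate_recipes <- function(path) {
  recipes <- tryCatch(jsonlite::read_json(path), error = function(e) NULL)
  if (is.null(recipes) || is.null(recipes$table_recipes)) {
    return("File is not valid recipe JSON (missing 'table_recipes')")
  }
  required <- c("variable", "statistic")
  for (nm in names(recipes$table_recipes)) {
    for (blk in recipes$table_recipes[[nm]]$analysis_blocks) {
      if (!all(required %in% names(blk))) {
        return(sprintf("Recipe '%s' has a block missing %s",
                       nm, paste(required, collapse = "/")))
      }
    }
  }
  NULL
}
```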

Thoughts? Questions? Criticisms?

Jeff-Thompson12 commented 1 year ago

> So, do we work on #37 first or in parallel?

I think we should strive to do these in parallel. Since #37 is supposed to be an enhancement, we should just be able to integrate those components later.

> The delete button would have to be conditional on whether their table specs exactly match an entry in the JSON file; otherwise, the button should disappear. However, instead of evaluating that after every change, delete could always be visible but provide error messaging: "Table with those specifications doesn't exist in the metadata".

I think we would want this functionality to be driven by the "recipe" dropdown. If a user selects a recipe, they can change the table using the UI and save those changes to the JSON. If a user does not select a recipe, they can create a table using the UI and then save that table to the RECIPE JSON file. But it would now be a part of the dropdown, so any edits would be driven by selecting it from the dropdown.
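The "save to the recipe JSON" flow described here could look something like the following sketch (the helper and its fields are hypothetical, based on the recipe structure proposed above, not existing tidyCDISC code):

```r
# Hypothetical: persist the table currently built in the UI as a new (or
# updated) recipe, keyed by name so the dropdown can pick it up.
save_recipe <- function(path, name, title, blocks) {
  recipes <- if (file.exists(path)) {
    jsonlite::read_json(path)
  } else {
    list(table_recipes = list())
  }
  recipes$table_recipes[[name]] <- list(
    title = title,
    analysis_blocks = blocks
  )
  jsonlite::write_json(recipes, path, auto_unbox = TRUE, pretty = TRUE)
}
```

Because entries are keyed by name, selecting an existing recipe from the dropdown and saving again simply overwrites that entry, which matches the edit flow described above.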

> We'd have to build a bunch of validation checks into the upload widget to make sure everything is in the right place & prevent things from breaking.

I don't think we will be able to avoid writing these checks anyway. We already only show recipes if the required data is present. We would need to start logging which tables are not being shown as options and why.
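The "log which tables are hidden and why" idea could be sketched like this, assuming each analysis block records the dataset it needs (as in the JSON structure proposed earlier; the helper is illustrative):

```r
# For each recipe, report whether it can be offered given the datasets the
# user has loaded, and which datasets are missing if not.
recipe_availability <- function(recipes, loaded_data) {
  lapply(recipes$table_recipes, function(r) {
    needed  <- unique(vapply(r$analysis_blocks,
                             function(b) b$data, character(1)))
    missing <- setdiff(needed, loaded_data)
    list(available = length(missing) == 0, missing_data = missing)
  })
}
```

The result could drive both the dropdown filtering that already exists and a log or tooltip explaining why a recipe is unavailable.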

Jeff-Thompson12 commented 1 year ago

One problem on step 1 that I'm having a little trouble getting my head around is how to handle the JavaScript that is based on values passed from R (e.g., the lab tables). If we want to specify preprocessing, this (in my opinion) creates a significant hurdle in allowing the user to create their own recipes.
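One way to contain that hurdle, sketched under the assumption that recipes reference preprocessing by name (as in the "preprocessing": "data_pre_process_fn" field proposed earlier): keep a registry of whitelisted functions and look them up, so user-authored recipes can only invoke code the app already ships.

```r
# Illustrative registry: only functions listed here can be named in a
# recipe's "preprocessing" field. The entries are placeholders.
preprocess_registry <- list(
  data_pre_process_fn = function(data) data  # placeholder no-op
)

apply_preprocessing <- function(recipe, data) {
  fn <- preprocess_registry[[recipe$preprocessing]]
  if (is.null(fn)) {
    stop("Unknown preprocessing function: ", recipe$preprocessing)
  }
  fn(data)
}
```

This doesn't solve the R-to-JavaScript value passing, but it would at least keep user recipes declarative rather than letting them inject arbitrary code.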