Closed martinjrobins closed 2 months ago
Another example dataset, with separate columns for amount and observation units: https://github.com/pkpdapp-team/pkpdapp-datafiles/blob/main/usecase_monolix/TE_Data.txt
A few more points based on discussions in Basel on 16/02/24:
Here are some example data files from Michael:
Data File_pkpd explorer_02.csv Data File_pkpd explorer_05.csv Data File_pkpd explorer_04.csv Data File_pkpd explorer_03.csv Data File_pkpd explorer_06.csv Data File_pkpd explorer_01.csv
Some additional info: A few general rules/ considerations we discussed before:
Some more data to try Data File_pkpd explorer_multipleYTYPE.csv. This is from simulated data:
Species: Monkey; PK model: 1-compt PK model + bioavailability; PD model: direct effects Emax
Parameters: default except CL = 0.8 mL/h/kg, F = 0.9, C50 = 1,000,000 pmol/L, Emax = 5
C1 mapped to YTYPE 1 and E mapped to YTYPE 2.
Some more todos on this from the meeting 12/4/24.
https://github.com/pkpdapp-team/pkpdapp/pull/388 splits the CSV validation into errors (where columns are required in order to continue) and warnings (where columns are missing but can be inferred from the data or added later). It also revalidates the CSV when headers change, rather than validating once after the initial upload. That should stop you from proceeding until errors have been dealt with, e.g. setting a time unit when one is missing.
If the CSV has no header row, you'll get errors saying that Time and Observation columns are missing, but the only way to clear those would be to upload a new CSV with headers.
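To make the errors/warnings split concrete, here is a rough sketch in Python. It is not the code from the PR: the function name, the result class and the exact column sets are assumptions for illustration only. The idea is that the same check runs again every time the headers change, and "next" stays disabled while any error remains.

```python
from dataclasses import dataclass, field

# Assumed column sets, for illustration only; the app's real lists may differ.
REQUIRED_COLUMNS = {"Time", "Observation"}      # missing -> error, blocks progress
OPTIONAL_COLUMNS = {"ID", "Observation Unit"}   # missing -> warning only


@dataclass
class ValidationResult:
    errors: list = field(default_factory=list)
    warnings: list = field(default_factory=list)

    @property
    def can_proceed(self):
        # The stepper should only advance once every error has been cleared.
        return not self.errors


def validate_headers(headers):
    """Check a list of CSV headers and sort problems into errors and warnings."""
    present = set(headers)
    result = ValidationResult()
    for name in sorted(REQUIRED_COLUMNS - present):
        result.errors.append(f"{name} column is missing.")
    for name in sorted(OPTIONAL_COLUMNS - present):
        result.warnings.append(f"{name} column is missing and can be set later.")
    return result


# Validate once on upload, then again after the user renames a header.
after_upload = validate_headers(["Time", "Conc"])
after_rename = validate_headers(["Time", "Observation"])
```

In this sketch `after_upload.can_proceed` is false because Observation is missing, while `after_rename.can_proceed` is true even though two warnings remain.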
looks excellent, thanks @eatyourgreens!
snag list for data upload:
Tab | Problem description | Step | Item | Proposed Fix |
---|---|---|---|---|
Data | No warning for conc/ obs units | 1 | DV/ conc/ obs | Provide additional text "Concentration units have not been defined in the dataset and need to be defined manually" |
Data | Unclear how ID/ groups are assigned | 1 | ID warning | Provide additional text "A new subject ID is assigned according to time column, if time is provided in ascending order for each individual" |
Data | Unable to select a dosing compartment w/o model selection | 2 | Dosing compartment | Please provide error message if no PK model is selected, e.g., "please select a PK model" |
Data | Unable to select a variable w/o model selection | 3 | Map variable | Please provide error message if no PK model is selected, e.g., "please select a PK model" |
Data | Stratification occurs after this information is first required (step 2) | 4 | Stratification | Would this be better placed as the 2nd step, as dosing compartment may differ for different groups/ cohorts. |
Data | Unclear how to introduce a new group | 4 | Stratification | It works with hitting "enter" but is this how it's supposed to work? Not ideal for ipad, iphone. Suggest introducing a "confirm" button. |
Data | For datasets w/o dosing information, the dosing information taken from Trial Design is visible in Data/Protocols but the IDs need reworking | Final | Datafile | tbd |
Data | Subject ID is not provided in Data/ Observations | Final | Datafile | tbd |
Data | Not possible to export datafile | Final | Datafile | Include Export Datafile functionality |
General | App is quite a bit slower than before | tbd | ||
Data | Replacing a datafile disables "next" button. If you load a datafile and select the time units and then drag and drop a new datafile, the data itself is replaced but you can no longer proceed to the next step | 1 | replace datafile | |
Data | Name of datafile not visible | all steps | it may be useful for the user to be able to document/ display the name of the datafile they uploaded | |
Simulations | Automatic axis scaling works on the simulated data, which may truncate the observations | - | adjust axis limits on the observed and simulated data | |
Trial design | remove group does not work | - | ||
Trial design | if an additional group is created (w/o data), the dosing protocol still appears in the datafile (dosing information) | - | Do we want to keep it this way? Con: when people export datafile for analysis in Monolix/etc. this information is redundant. Pro: It provides information of all scenarios investigated for reproduction in other software. | |
Simulations | By default only simulation 1 will be displayed when reloading the app, all other simulations are treated as temporary | - | Do we want to keep it this way? | |
Models | Once data is loaded, switching between models only works if the amount variables have the same name as the model to which the data were mapped. Works fine for all generic comp models, but does not work when switching to TMDD models or switching from a full TMDD to a QSS TMDD model due to different amount variable names. | - | If such a conflict arises, the user will need to remap the dosing compartment in Model and Data. | |
Data | Once a datafile is accepted, the only way to edit it is to upload and go through all the steps again. | Final | Can we add an "edit" button for users to change units or grouping etc. w/o having to redo all the steps? | |
Data | "Dimensionless" is not available for units selection | 3 | map an empty unit column entry to "dimensionless" | |
Data | descending order of YTYPEs | 3 | YTYPE order | change order of multiple YTYPES to ascending |
Data | Final | Upload New Dataset | provide warning "this action will delete the current dataset" | |
Data | data loader does not recognize if two columns are mapped to "Observation" | 1 | if multiple columns are mapped to "Observation" treat them like different YTYPEs | |
Data | 4 | Secondary grouping | unclear what that does |
I've pushed a few small fixes this morning:
The fix for missing subject IDs might also have fixed the 'Remove Group' button. At least, I'm seeing the Data tab correctly refresh now, after removing a group of subjects.
I think the Python model only recognises one Observation column at the moment, but we could maybe merge multiple observation columns into a single Observation column, with an Observation ID column to group observations by type.
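A minimal sketch of that merge, using plain dictionaries for rows (as you would get from `csv.DictReader`). The function name and the choice to number observation types 1, 2, ... in column order are assumptions, not what the app currently does. It refuses input that already has an Observation ID column, since the two ways of labelling observation types would conflict.

```python
def merge_observation_columns(rows, observation_columns):
    """Reshape wide rows into one row per measurement.

    Each column in observation_columns becomes its own Observation ID
    (1, 2, ... in the order given); empty cells are skipped.
    """
    merged = []
    for row in rows:
        if "Observation ID" in row:
            raise ValueError(
                "Multiple observation columns cannot be combined with an "
                "existing Observation ID column."
            )
        # Columns shared by every measurement in this row (ID, Time, ...).
        shared = {k: v for k, v in row.items() if k not in observation_columns}
        for obs_id, column in enumerate(observation_columns, start=1):
            value = row.get(column, "")
            if value is None or value == "":
                continue  # no measurement of this type at this time point
            merged.append({**shared, "Observation": value, "Observation ID": obs_id})
    return merged


wide = [
    {"ID": "1", "Time": "0", "Conc": "5.0", "Effect": ""},
    {"ID": "1", "Time": "1", "Conc": "4.0", "Effect": "0.3"},
]
long_rows = merge_observation_columns(wide, ["Conc", "Effect"])
```

Here `long_rows` has three rows: two with Observation ID 1 (from `Conc`) and one with Observation ID 2 (from `Effect`).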
In that case, I don’t think the CSV could have an Observation ID column. I think there are two mutually exclusive cases. We currently only support the first:
Exporting a CSV from the Data tab is close to being done. I might be able to finish that tomorrow.
Some more snags from the Roche team which I'll copy here. I've also included the "Fixed" column to indicate what I think has already been fixed (as far as I can see but you might want to check @eatyourgreens ):
Tab | Problem description | Step | Item | Proposed Fix | Fixed | Check | Comment |
---|---|---|---|---|---|---|---|
Data | No warning for conc/ obs units | 1 | DV/ conc/ obs | Provide additional text "Concentration units have not been defined in the dataset and need to be defined manually" | Yes | ||
Data | Unclear how ID/ groups are assigned | 1 | ID warning | Provide additional text "A new subject ID is assigned according to time column, if time is provided in ascending order for each individual" | Yes | ||
Data | Unable to select a dosing compartment w/o model selection | 2 | Dosing compartment | Please provide error message if no PK model is selected, e.g., "please select a PK model" | Yes | data upload disabled until pk model is selected | |
Data | Unable to select a variable w/o model selection | 3 | Map variable | Please provide error message if no PK model is selected, e.g., "please select a PK model" | Yes | ||
Data | Stratification occurs after this information is first required (step 2) | 4 | Stratification | Would this be better placed as the 2nd step, as dosing compartment may differ for different groups/ cohorts. | Yes | ||
Data | Unclear how to introduce a new group | 4 | Stratification | It works with hitting "enter" but is this how it's supposed to work? Not ideal for ipad, iphone. Suggest introducing a "confirm" button. | Yes | ||
Data | For datasets w/o dosing information, the dosing information taken from Trial Design is visible in Data/Protocols but the IDs need reworking | Final | Datafile | tbd | what do you mean the id's need reworking? The id given in the Data/Protocols is the id of the particular protocol, the group # given in the title is associated with the "Group" column in the main dataset | ||
Data | Subject ID is not provided in Data/ Observations | Final | Datafile | tbd | Yes | ||
Data | Not possible to export datafile | Final | Datafile | Include Export Datafile functionality | |||
General | App is quite a bit slower than before | tbd | |||||
Data | Replacing a datafile disables "next" button. If you load a datafile and select the time units and then drag and drop a new datafile, the data itself is replaced but you can no longer proceed to the next step | 1 | replace datafile | Yes | |||
Data | Name of datafile not visible | all steps | it may be useful for the user to be able to document/ display the name of the datafile they uploaded | ||||
Simulations | Automatic axis scaling works on the simulated data, which may truncate the observations | - | adjust axis limits on the observed and simulated data | Yes | |||
Trial design | remove group does not work | - | Yes | ||||
Trial design | if an additional group is created (w/o data), the dosing protocol still appears in the datafile (dosing information) | - | Do we want to keep it this way? Con: when people export datafile for analysis in Monolix/etc. this information is redundant. Pro: It provides information of all scenarios investigated for reproduction in other software. | ||||
Simulations | By default only simulation 1 will be displayed when reloading the app, all other simulations are treated as temporary | - | Do we want to keep it this way? | Yes | all simulations displayed on reload | ||
Models | Once data is loaded, switching between models only works if the amount variables have the same name as the model to which the data were mapped. Works fine for all generic comp models, but does not work when switching to TMDD models or switching from a full TMDD to a QSS TMDD model due to different amount variable names. | - | If such a conflict arises, the user will need to remap the dosing compartment in Model and Data. | ||||
Data | Once a datafile is accepted, the only way to edit it is to upload and go through all the steps again. | Final | Can we add an "edit" button for users to change units or grouping etc. w/o having to redo all the steps? | I think the export datafile will solve this, user can then export, edit manually, then reupload | |||
Data | "Dimensionless" is not available for units selection | 3 | map an empty unit column entry to "dimensionless" | Yes | |||
Data | descending order of YTYPEs | 3 | YTYPE order | change order of multiple YTYPES to ascending | Yes | ||
Data | Final | Upload New Dataset | provide warning "this action will delete the current dataset" | Yes | |||
Data | data loader does not recognize if two columns are mapped to "Observation" | 1 | if multiple columns are mapped to "Observation" treat them like different YTYPEs | This fix is incompatible with the Observation ID column, only works if this column is not provided. Can display error if multiple observation columns are provided and an observation id column? | |||
Data | 4 | Secondary grouping | unclear what that does | Is secondary grouping still useful to have (i.e. sort into groups based on 2 categorical covariates)? Can either delete this, or provide a text description of what it does? | |||
Data | After upload of a datafile and setting the time units, if the user changes a column header, they cannot proceed to the next step and the time units error warning appears but time units can not be set. Reselection of the time unit activates the next button. | 1 | similar to line 12 | ||||
Data | Error message for dosing units states "amount units can be set in Trial Design", however, amount units are set on this page | 2 | Dosing units | remove this error message | |||
Data | next and back buttons should stay in one location to allow quick movement through the stepper | 1-4 | |||||
Data | Upload New Dataset | do we need this, why not start this page with the interface the user sees after pressing "Upload New Dataset" | |||||
Data | 2 | Automated mapping | could we automatically map dosing variables if "Route" information is provided? IV map to A1 variable (e.g., A1, A1_f, A1_t), SC or PO map to Aa | ||||
Data | unclear how to select another cat covariate as primary grouping, app always seems to jump back to group | 4 | Primary grouping | ||||
Data | how are the ID assigned for the dosing protocol? | ||||||
Data | Infusion time = 0 in datasheet | It is likely that we will get data with infusion time = 0 (as Monolix and Nonmem accept those); for the pkpd explorer I suggest that if infusion time = 0, we automatically set this to a very short time interval e.g., 30 seconds. | |||||
Data | Multiple dosing with individual dosing events given as separate lines (see right) not recognised | App identifies that multiple doses are administered but does not match the time appropriately. Units of dosing are also incorrectly displayed in mg/kg in Trial Design. |
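For the "infusion time = 0" row above, the substitution could look roughly like this. It is only a sketch: the function name, the unit labels and the conversion table are assumptions, and the 30 second floor is just the value suggested in the table. The 30 seconds are converted into whatever time unit the dataset uses.

```python
# Assumed unit labels; the app's own unit handling would replace this table.
SECONDS_PER_UNIT = {"s": 1.0, "min": 60.0, "h": 3600.0, "day": 86400.0}
MIN_INFUSION_SECONDS = 30.0  # value suggested in the snag list


def effective_infusion_time(infusion_time, time_unit):
    """Return the infusion time to use, replacing 0 with a short interval.

    infusion_time is expressed in time_unit; so is the returned value.
    """
    if infusion_time < 0:
        raise ValueError("Infusion time cannot be negative.")
    if infusion_time == 0:
        # Express 30 seconds in the dataset's own time unit.
        return MIN_INFUSION_SECONDS / SECONDS_PER_UNIT[time_unit]
    return infusion_time


bolus_in_hours = effective_infusion_time(0, "h")      # 30 s expressed in hours
bolus_in_minutes = effective_infusion_time(0, "min")  # 30 s expressed in minutes
unchanged = effective_infusion_time(2.0, "h")         # non-zero values pass through
```

Non-zero infusion times are returned untouched, so only rows that Monolix or Nonmem would treat as a bolus are affected.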
Data page will have 2 tabs:
Load Data
Stratification
Visualization