Closed yulric closed 1 year ago
1. center.csv - 'finalVariable' is the variable after centering, which may not actually be the final model variable, correct?
2. cox.csv - I assume that I can rename this file, depending on the type of model?
3. model-steps.csv - This file lists the order in which the files should be run, correct? So 'final' variables from the first file could be an input 'variable' in the second file. That would make sense to me.
4. Should dummy variable specification be in another file? Or is it all specified in variable-details.csv?
5. variable-details.csv - Do the columns that are currently empty need to be filled? If so, could you describe what you want in them? I understand the ones that are currently filled.
6. variable-details.csv - What variables go in this file? Just variables that need to be dummied? I see you have age as an example, however. Just age? Or also age_c, age_rcs1, and age_rcs2? I suppose it may be any variable that should be 'checked' (for min, max, etc.)? Perhaps it would help if we could specify at what stage this file is used, for example in model-steps.csv?
7. variables.csv - What variables go in this file? All starting variables?
- Yes. Maybe it would be better to rename it to centeredVariable?
- The fileName does not really matter, but the step column value does. For example, for a logistic regression model we would call the step logistic and the fileName could be model.csv. For the last step, the step column should tell us what kind of model it is and therefore which formula needs to be run to evaluate the model.
- Yes, the transformed variables from one file could be inputs to the next file. Essentially, any variable referenced in a file needs to be declared in some file before it.
- I'm not sure. Do you think there's some dummying info that is not being captured in the variables-details file? One good reason to put the dummy info there is that we also define the catValues there, so it's easy to map a catValue to a dummy variable.
- Ah sorry, those columns are part of the current variables-details file in cchsflow but I don't think we need them for the model export files. I'll remove them.
- All the variables defined in the variables.csv file need to be defined in the variables-details file.
- All starting variables, yes.
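The ordering rule described in the answers above (a file may only reference variables declared in variables.csv or produced by an earlier step) can be sketched as a small check. This is a minimal illustration in Python, not the project's implementation: the `step`, `fileName`, `variable`, and `finalVariable` column names come from this thread, while the `variable` column in variables.csv, the `coefficient` column, and all sample rows and values are assumptions made up for the example.

```python
import csv
import io


def read_rows(text):
    """Parse CSV text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def undeclared_variables(starting, steps, step_files):
    """Return (step, variable) pairs that are referenced before being declared.

    starting   -- rows of variables.csv (all starting variables)
    steps      -- rows of model-steps.csv, in the order the files are run
    step_files -- dict mapping a fileName to the rows of that file
    """
    declared = {row["variable"] for row in starting}
    problems = []
    for step in steps:
        produced = set()
        for row in step_files[step["fileName"]]:
            if row["variable"] not in declared:
                problems.append((step["step"], row["variable"]))
            # Files without a finalVariable column simply produce nothing.
            final = row.get("finalVariable")
            if final:
                produced.add(final)
        # Transformed variables only become available to later files.
        declared |= produced
    return problems


# Hypothetical file contents, for illustration only.
variables = read_rows("variable\nage\nsex\n")
model_steps = read_rows("step,fileName\ncenter,center.csv\nlogistic,model.csv\n")
files = {
    "center.csv": read_rows("variable,finalVariable\nage,age_c\n"),
    "model.csv": read_rows(
        "variable,coefficient\nage_c,0.03\nsex,0.20\nbmi_c,0.10\n"
    ),
}

problems = undeclared_variables(variables, model_steps, files)
print(problems)
```

Here `bmi_c` is flagged in the logistic step because no earlier file declares it, while `age_c` passes because the center step produces it.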
I will start to fill these in for DemPoRT. Thanks!
Baseline hazard should go somewhere too. Not sure where would make the most sense. @yulric
I think it should go in the cox step or, in your case, the fine-and-grey step. Should we have a row in that file for the baseline hazard? I know that for some algorithms the baseline hazard changes with time.
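To make the time dependence concrete: in a Cox-type model the predicted risk at time t is 1 - exp(-H0(t) * exp(lp)), where H0(t) is the baseline cumulative hazard and lp is the linear predictor. A Fine and Gray model has the same form with the baseline cumulative subdistribution hazard. An export would therefore need one baseline value per prediction horizon. The sketch below assumes a simple lookup keyed by time. The horizons and hazard values are invented for illustration and do not come from any model discussed here.

```python
import math

# Hypothetical baseline cumulative hazards, keyed by time horizon in years.
BASELINE_CUM_HAZARD = {1: 0.01, 5: 0.05}


def predicted_risk(linear_predictor, time):
    """Risk by `time` for one individual: 1 - exp(-H0(t) * exp(lp))."""
    h0 = BASELINE_CUM_HAZARD[time]
    return 1.0 - math.exp(-h0 * math.exp(linear_predictor))


# The same linear predictor yields a different risk at each horizon.
risk_1y = predicted_risk(0.5, 1)
risk_5y = predicted_risk(0.5, 5)
```

If the baseline were stored as a single value, only one horizon could be evaluated, which is one argument for giving it a time column wherever it ends up.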
I've added @Rhan43 to this PR review, since she'll need to oversee the creation of the model export for RESPECT. Also included @amytmhsu in case she wants to see what's going on.
Closing PR. All documentation is in the model parameters repo.
This vignette documents a way that external investigators can export their model using CSV files. Look at the file vignettes/model-export.Rmd.
Things to discuss: