CBIIT / R-cometsAnalytics

R package development for COMETS Analytics

COMETS 1.3. Update sample and template file #20

Closed steven-moore closed 6 years ago

steven-moore commented 6 years ago

A number of changes need to be made to the test/sample file, including:

  1. Adding a column for continuous/categorical variables
  2. Adding a new value to the models tab for the age_grp variable (age<20 years), so that analyses can be stratified for the youngest participants
  3. Adding BMI as a continuous variable
  4. Adding models for the BMI analysis to the models tab (per R. Kelly), and cleanly distinguishing these models so that cohorts can delete them if they elect not to participate.

ellatemprosa commented 6 years ago

So to clarify, we will add the obesity project analyses to template 2, the age template. Please confirm.

steven-moore commented 6 years ago

Yes, the obesity project should be included too. I have a set of 10 cohorts already lined up willing to do the age and BMI analyses together.

ellatemprosa commented 6 years ago

Each project will get its own project prefix so that analyses can be run.

steven-moore commented 6 years ago

The sample file that is downloadable from COMETS-Analytics should have a naming convention so that we can distinguish current analyses from past ones

steven-moore commented 6 years ago

Last column: values of "categorical" or "continuous"

steven-moore commented 6 years ago

cometsInput_12_22_2017.xlsx

The updated Sample Input file is attached. It includes all the recommended changes, and has been tested in interactive mode, batch mode, and super batch mode. I recommend that the file downloaded from the COMETS website now include the date as part of its name. For now, this can act as a naming convention to help us distinguish different versions.
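A minimal Python sketch of the date-based naming convention follows. The helper name `dated_sample_name` is hypothetical, and the `cometsInput_MM_DD_YYYY.xlsx` pattern is inferred from the attached file name; the COMETS website may build its download name differently.

```python
from datetime import date


def dated_sample_name(release: date, prefix: str = "cometsInput") -> str:
    """Build a sample-file name that embeds the release date as MM_DD_YYYY."""
    return f"{prefix}_{release.strftime('%m_%d_%Y')}.xlsx"


# Reproduces the name of the file attached to this comment.
print(dated_sample_name(date(2017, 12, 22)))
```

Note that a month-first date does not sort chronologically in a directory listing; a year-first pattern would, if that ever matters.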

The file is currently being checked by Rachel Kelly to ensure that it meets all the specs for her BMI analysis.

steven-moore commented 6 years ago

For people who like a lot of detail, here's the relevant e-mail to Rachel Kelly:

Hi Rachel,

Good news! I met with my Age Working Group yesterday and they agreed to conduct your BMI analysis concurrently with our age WG analysis. We have approximately 10 different datasets (8 cohorts) already lined up and ready. In preparation for the analysis, I took the models that you developed and added them to our sample file (see attached). Participating cohorts will use the sample file to develop their own dataset.

Before we send out the new sample file to the different studies, I wanted you to scan the models in the sample file to ensure that they represent what you want. I made some minor changes to the models, as follows:

  1. Changed descriptive names to better distinguish between the age and BMI analysis
  2. Added “nested case” to most models
  3. Added age-stratified analyses that also adjust for age as a continuous variable (to protect against residual confounding within the rather large age categories). The original age-stratified models are also retained.

COULD YOU CONFIRM THAT THESE MODELS ARE CORRECT BY JANUARY 2, 2018?

Bear in mind that identical models will be used for all studies—there is no customization by cohort. So, if there is anything you can think of that needs to be added, this is your best chance.

NEXT STEPS (can wait until after January 2nd):

Once we agree on the models, we will replace the sample file on the COMETS website with this new one and encourage studies to set up their data and run analyses. My current plan is to split the work as follows: Cristina Menni will help new cohorts prep their data and run models. You and I will split the five or so cohorts (including VDAART and CAMP) that have already set up their datasets and encourage them to rerun with the BMI models now added.

For these cohorts that have already set up their data, you and I may need to walk them through a few modifications, as follows:

  1. Recode age group to allow for more age categories
  2. Recode race groups to include more categories (this will be important for your analysis, I believe)
  3. Add a continuous measure of BMI to their study. Add a row to the varmap for this variable. Rename the variable formerly called “BMI” to “bmi_grp”. Modify the row in varmap for the grouped BMI variable.
  4. Add a column to the varmap for “CODING” to differentiate between continuous and categorical variables
  5. Replace the old models with the new ones. Provided that changes above have been made, the old table can just be replaced wholesale.

Ok, that’s it for now, but wanted to pass along the good news and get things lined up for when we return.

All my best, Steve
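
Steps 3 and 4 of the modification list in the e-mail above can be sketched in Python for cohorts that keep their varmap as a table. This is only an illustration under stated assumptions: the rows are plain dictionaries, the column name `VARREFERENCE` and the name `bmi` for the new continuous variable are assumed here, and `update_varmap` is a hypothetical helper, not part of the COMETS package.

```python
def update_varmap(rows, coding):
    """Rename the grouped BMI row, add a continuous BMI row, and fill CODING.

    rows   -- list of dicts, one per varmap row (assumed layout)
    coding -- dict mapping each variable name to "continuous" or "categorical"
    """
    updated = []
    for row in rows:
        row = dict(row)  # copy so the caller's rows are not modified
        if row["VARREFERENCE"] == "BMI":
            # The variable formerly called "BMI" becomes the grouped variable.
            row["VARREFERENCE"] = "bmi_grp"
        updated.append(row)

    # New row for the continuous BMI measure (the name "bmi" is an assumption).
    updated.append({"VARREFERENCE": "bmi"})

    # Step 4: every row gets a CODING value; a missing entry raises KeyError.
    for row in updated:
        row["CODING"] = coding[row["VARREFERENCE"]]
    return updated


old = [{"VARREFERENCE": "age"}, {"VARREFERENCE": "BMI"}]
coding = {"age": "continuous", "bmi_grp": "categorical", "bmi": "continuous"}
new = update_varmap(old, coding)
for row in new:
    print(row)
```

Requiring an explicit `coding` entry for every variable means a variable cannot silently be given the wrong type.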

steven-moore commented 6 years ago

Rachel approves of the sample file, provided we resolve one last issue: should we divide the under-20y age category more finely? One possibility: 0-8y, 9-14y, 15-19y.
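
For illustration only, the proposed split could be coded as below. The function name `under20_subgroup` and the label strings are hypothetical, and the sketch assumes age is recorded in years, with fractional ages assigned by completed years.

```python
def under20_subgroup(age_years):
    """Assign an age under 20 years to one of the three proposed subgroups."""
    if not 0 <= age_years < 20:
        raise ValueError("age must be at least 0 and below 20 years")
    if age_years < 9:   # completed years 0 through 8
        return "0-8y"
    if age_years < 15:  # completed years 9 through 14
        return "9-14y"
    return "15-19y"     # completed years 15 through 19


for age in (3, 9, 14.5, 19):
    print(age, under20_subgroup(age))
```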

ellatemprosa commented 6 years ago

Can we use a continuous vs. categorical designation for CODING?

ellatemprosa commented 6 years ago

maybe VARTYPE?

steven-moore commented 6 years ago

So, rename "CODING" as "VARTYPE" and change the input value of "non-categorical" to "continuous"? That makes perfect sense to me.
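
A small Python sketch of that rename, for anyone converting an existing varmap by script. It assumes the varmap rows are held as dictionaries with a `CODING` key; the helper `to_vartype` is hypothetical and not part of the COMETS package.

```python
ALLOWED_VARTYPES = {"categorical", "continuous"}


def to_vartype(rows):
    """Rename CODING to VARTYPE and map "non-categorical" to "continuous"."""
    converted = []
    for row in rows:
        row = dict(row)  # copy so the input rows are left untouched
        value = row.pop("CODING").strip().lower()
        if value == "non-categorical":
            value = "continuous"
        if value not in ALLOWED_VARTYPES:
            raise ValueError(f"unexpected VARTYPE value: {value!r}")
        row["VARTYPE"] = value
        converted.append(row)
    return converted


rows = [
    {"VARREFERENCE": "age", "CODING": "non-categorical"},
    {"VARREFERENCE": "bmi_grp", "CODING": "categorical"},
]
for row in to_vartype(rows):
    print(row)
```

Rejecting any other value catches typos in the column before the file is used in an analysis.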

ellatemprosa commented 6 years ago

Thanks, Steve. The age coding makes sense. Does she expect participants <5 years? Maybe those categories should be more finely tuned as well.

steven-moore commented 6 years ago

I checked in with Rachel this week, and she had only one minor revision (to add "study_site" as a variable to most models and to the varmap). We decided to leave the age groups as is because few cohorts have participants less than 20 years of age. I made these changes and the final sample input file for COMETS 1.3 is attached below. This issue is closed.

cometsInput_March_2018.xlsx