rpietro / ecd_jama_renewal


recode... #4

Open mworni opened 11 years ago

mworni commented 11 years ago

Please check lines 122-135 in the script:

# PRIMARY DIAGNOSIS @ TRANSPLANT
CrossTable(DGN_OSTXT_TCR, missing.include=TRUE) # recipient primary diagnosis specified @ transplant
CrossTable(DGN_TCR, missing.include=TRUE) # recipient primary diagnosis specified @ transplant
DIAG <- "else"
class(DIAG)
recode(DIAG, "c('3000', '3001', '3002', '3003')='Glomerulonephritis'; else='other'")
DIAG[DGN_TCR=3000] <- "Glomerulonephritis"
DIAG[DGN_TCR=3001] <- "Glomerulonephritis"
DIAG[DGN_TCR=3002] <- "Glomerulonephritis"
DIAG[DGN_TCR=3003] <- "Glomerulonephritis"
CrossTable(DIAG, missing.include=TRUE)
class(DIAG)

I do not get an error message but it does not recode as I want... I have the variable DGN_TCR that has numbers and I would like to recode those numbers (representing some diseases) to a broader category. This variable should have the name DIAG.

Thanks...

rpietro commented 11 years ago

some thoughts:

check the class and make sure it was converted to a factor. if not, then transform it with as.factor

also, make sure you are not attaching the data set, since changes made to an attached variable are not written back to the data set. if you leave it non-attached, then use the syntax dataobject$variable
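A minimal sketch of both points, using a small made-up data frame (the name `ecddata` and the values are only illustrative, not the real data set). It shows checking the class, working through `dataobject$variable`, and why a change made to an attached variable does not reach the data frame:

```r
# Hypothetical toy data standing in for the real data set
ecddata <- data.frame(DGN_TCR = c(3000L, 3001L, 3050L, NA))

# Check the class before recoding
class(ecddata$DGN_TCR)

# Without attach(): read and write through the data frame itself
ecddata$DIAG <- as.factor(ecddata$DGN_TCR)
class(ecddata$DIAG)

# With attach(): the search path holds a copy of the columns, so assigning
# to the bare name creates a new object in the workspace and leaves the
# data frame untouched
attach(ecddata)
DGN_TCR[1] <- 9999L
detach(ecddata)

ecddata$DGN_TCR[1]   # the data frame still holds the original value
rm(DGN_TCR)          # remove the stray workspace copy
```

Writing `ecddata$...` on both sides of every assignment avoids the problem entirely.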

mworni commented 11 years ago

as i remember it is a character and i did attach it... will change both!

would you or joao have a chance to make a template on how to recode different variables? i just talked with tony and he faces the same issue and decided not to recode anything in r.

mworni commented 11 years ago

It still does not work... any other suggestion?

ecddata$diag <- DGN_TCR
class(ecddata$diag)
[1] "factor"
ecddata$diag <- DGN_TCR
class(ecddata$diag)
[1] "integer"
ecddata$diag <- as.factor(ecd.data$diag)
class(ecddata$diag)
[1] "factor"
recode(ecddata$diag, "c('3000', '3001', '3002', '3003')='Glomerulonephritis'; else='other'")
Error in recode.default(ecddata$diag, "c('3000', '3001', '3002', '3003')='Glomerulonephritis'; else='other'") : object '.data' not found
ecddata$diag[ecddata$diag="3000"] <- "Glomerulonephritis"
Error: unexpected '=' in "ecddata$diag[ecddata$diag="
ecddata$diag[ecddata$diag='3001'] <- "Glomerulonephritis"
Error: unexpected '=' in "ecddata$diag[ecddata$diag="
ecddata$diag[ecddata$diag=3002] <- "Glomerulonephritis"
Error: unexpected '=' in "ecddata$diag[ecddata$diag="
ecddata$diag[DGN_TCR=3003] <- "Glomerulonephritis"
Warning message: In [<-.factor(*tmp*, DGN_TCR = 3003, value = c(NA, NA, NA, NA, : invalid factor level, NAs generated

rpietro commented 11 years ago

absolutely, was just thinking about it. btw, if your changes don't work i will get into your code and change it (didn't do it since i was on that meeting with elves). i will post it later today or tomorrow and send you the link.

copying Danny, Phillip, and Tony as that might be of help since they are getting into the whole recoding thing. trust me, it's simple, you just have to do the exact same thing over and over (data management is boring)

rpietro commented 11 years ago

in my car, so i can't get into your code right now, but you forgot to use ecddata$DGN_TCR

remember, every time you have a problem, you need to look at the data in a spreadsheet format, that will almost always make the problem obvious. here is one option: http://goo.gl/NrnJc . Joao, what is the name of that other function?
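A rough base-R sketch of the recode attempted earlier in the thread, assuming a toy `ecddata` with made-up values. Inside `[ ]`, `=` is not a comparison; `==` tests one value and `%in%` tests several. The labels are assigned while the column is still character, because assigning a label that is not an existing level of a factor produces the "invalid factor level, NAs generated" warning shown above. Keeping missing codes as `NA` is a choice made here, not something the thread specifies:

```r
# Hypothetical toy data standing in for the real data set
ecddata <- data.frame(DGN_TCR = c(3000L, 3002L, 3050L, 3003L, NA))

# Codes to collapse into one broader category (taken from the thread)
glom_codes <- c(3000, 3001, 3002, 3003)

# Start everything as "other", then overwrite the rows whose code is in the set
ecddata$DIAG <- "other"
ecddata$DIAG[ecddata$DGN_TCR %in% glom_codes] <- "Glomerulonephritis"

# Assumption: rows with a missing code should stay missing
ecddata$DIAG[is.na(ecddata$DGN_TCR)] <- NA

# Convert to a factor only after the character labels are final
ecddata$DIAG <- factor(ecddata$DIAG)
table(ecddata$DIAG, useNA = "ifany")
```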

rpietro commented 11 years ago

here you go: http://goo.gl/0MJPE

i added a number of comments, one of them explaining why what you were doing didn't work. don't forget that you have to have the car package installed to make this gist work. btw, gists are a great way to post examples, exercises, etc
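This is not the content of the linked gist, just a small sketch of the same kind of recode with the car package, on made-up values. Calling `car::recode` with the namespace prefix is a precaution: the `object '.data' not found` error earlier in the thread may (an assumption, not confirmed here) come from a `recode` in another loaded package masking car's. The `requireNamespace` guard only keeps the snippet from failing where car is not installed:

```r
# Hypothetical diagnosis codes standing in for ecddata$DGN_TCR
dgn <- c(3000, 3001, 3050, 3003)

if (requireNamespace("car", quietly = TRUE)) {
  # Numeric codes are written without quotes because dgn is numeric here
  diag <- car::recode(
    dgn,
    "c(3000, 3001, 3002, 3003) = 'Glomerulonephritis'; else = 'other'"
  )
  diag <- factor(diag)   # make the result a factor explicitly
  print(table(diag))
}
```

The result has to be assigned (here to `diag`); calling `recode` without storing its return value leaves the original variable unchanged.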

acastleberry commented 11 years ago

The overwhelming majority of data cleanup, recoding, etc., I do in JMP and then the models in R. I do some fine-tuning and recoding in R as I design the models. Here is some example recoding in R for one of the UNOS models:

library(car) # provides recode()
VENTILATOR_TRR <- factor(VENTILATOR_TRR) # Make data type nominal
MED_COND_TRR <- factor(MED_COND_TRR) # Make data type nominal
AA_ETHCAT_MISMATCH <- factor(AA_ETHCAT_MISMATCH) # Make data type nominal
HIST_COCAINE_DON <- relevel(HIST_COCAINE_DON, ref="N") # Make "N" the reference
rcd_MED_COND_TRR <- recode(MED_COND_TRR, " '3' = '0Home'; '1'='ICU'; '2'='Hosp'") # Make "home" the reference
AA_DIAG_LTX_RECODE <- relevel(AA_DIAG_LTX_RECODE, ref="Obstructive") # Make COPD the reference
AA_PROCEDURE_TY <- relevel(AA_PROCEDURE_TY, ref="S") # Make single lung tx the reference
AGE_BY_5 <- (AGE/5) # So that the OR is in 5-year increments
DAYSWAIT_BY_90 <- (DAYSWAIT_CHRON/90) # So that the OR is in 90-day increments
VOL_BY_10_PER_YEAR <- (AA_CTR_VOLUME_BRST/14.5/10) # So that the OR is per 10 tx per year
AGE_DON_BY_5 <- (AGE_DON/5) # So that the OR is in 5-year increments
IMP_PO2_BY_50 <- (AA_IMP_PO2/50) # So that the OR is in 50-unit increments
rcd_HIST_CIG_DON <- recode(HIST_CIG_DON, " 'U' = 'N'") # Recode "U" as "N" for n=54
rcd_HIST_CIG_DON <- recode(rcd_HIST_CIG_DON, "''='N'") # Recode null as "N" for n=11
rcd_HIST_COCAINE_DON <- recode(HIST_COCAINE_DON, "'' = NA") # Recode missing ('') as NA so they get dropped
rcd_DIABETES_DON <- recode(DIABETES_DON, " 'U' = 'N'") # Recode "U" as "N" for n=21
rcd_DIABETES_DON <- recode(rcd_DIABETES_DON, "''='N'") # Recode null as "N" for n=11
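To illustrate two of the idioms in the listing above on made-up values (the variable names below are hypothetical): `relevel` moves the chosen level to the front so that models treat it as the reference, and dividing a continuous predictor by a constant rescales it so that a one-unit change corresponds to that many original units.

```r
# Hypothetical medical-condition codes stored as a factor
med_cond <- factor(c("1", "3", "2", "3"))
levels(med_cond)                           # levels in sorted order: "1" "2" "3"

# Make "3" the reference level by moving it to the front
med_cond <- relevel(med_cond, ref = "3")
levels(med_cond)                           # now "3" "1" "2"

# Rescale age so that a one-unit increase means five years
age <- c(40, 55, 62)
age_by_5 <- age / 5
age_by_5
```

The `'0Home'` label in the listing appears to rely on the same ordering rule: factor levels are sorted when the factor is created, and a label starting with a digit sorts ahead of the alphabetic ones, so it becomes the reference without a call to `relevel`.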

rpietro commented 11 years ago

and here is yet another example so that you can use a graphical user interface (GUI) in R to create variables and run data management tasks while still generating the corresponding scripts to keep your code reproducible: http://goo.gl/hq56l

for download and install: http://goo.gl/7HFBO
