rpietro / airwayDehiscence

Airway dehiscence project using the UNOS database

Missing data for logistic regression model #16

Open acastleberry opened 11 years ago

acastleberry commented 11 years ago

I have a logistic regression model with some missing data in factor variables. R is treating the missing data as a separate category, but in this case I want it to simply drop those records (which is what I expected, so I'm surprised it did not happen). Is there a reason why the logistic regression model is not dropping the missing data? Is there a way to get it to do that? Do all the missing values for these factor variables (coded as empty quotes, "") need to be recoded as NA?

rpietro commented 11 years ago

Tony, it was probably an import problem with the categorical variables (called factors in R), where the import took missing values to be an empty string (represented by "", two quotes with nothing inside) rather than NA. There are many options, but here are two for starters:

  1. Redo the import and specify that they are indeed missing. If you're importing from CSV, there is an argument called na.strings (http://goo.gl/kOyk4) that takes care of that, e.g. na.strings = c("", "NA"). If you're importing from another format, the foreign package should do this automatically (http://goo.gl/M37Os).
  2. Recode each variable. This might be the easiest option, something like

dataname$variable[dataname$variable == ""] <- NA

where dataname is the object you imported the data set into, and variable is the variable where the missing data became "" (an empty string) rather than NA (R's symbol for true missing data). If the variable is already a factor, the now-unused "" level can be removed afterwards with droplevels().
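
Both options can be sketched on a small made-up data set. The column names (dehiscence, age, sex) and all values are invented for illustration and are not from the UNOS data; the sketch only assumes that the blanks arrive as empty fields in a CSV:

```r
# Toy CSV text; column names and values are hypothetical.
csv_text <- "dehiscence,age,sex
0,54,M
1,61,F
0,47,
1,66,M
0,58,F
1,70,
0,49,M
1,52,F"

# The problem: blank fields in a text column are kept as "" and become
# a factor level of their own.
d_raw <- read.csv(text = csv_text, stringsAsFactors = TRUE)
levels(d_raw$sex)

# Option 1: declare empty strings as missing at import time.
d1 <- read.csv(text = csv_text, na.strings = c("", "NA"),
               stringsAsFactors = TRUE)
levels(d1$sex)
sum(is.na(d1$sex))

# Option 2: import as-is, then recode "" to NA and drop the unused level.
d2 <- d_raw
d2$sex[d2$sex == ""] <- NA
d2$sex <- droplevels(d2$sex)

# With real NAs in place, glm() applies the default na.action (na.omit,
# unless options("na.action") was changed) and drops the incomplete rows.
fit <- glm(dehiscence ~ age + sex, data = d2, family = binomial)
nobs(fit)
```

Comparing nobs(fit) with nrow(d2) is a quick way to check how many records the model actually dropped.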


acastleberry commented 11 years ago

Thanks, that's helpful. It's a .csv file, so I'll make the changes you mentioned. Also, is there a way for R to read SAS or JMP files directly? Looking online and through the help files, everything I found says to convert these files first. I've been converting everything to .csv, but it would be nice if there were a way for R to read SAS/JMP directly.

rpietro commented 11 years ago

The foreign package I sent you in one of my previous emails is really good, and it can read almost anything, including SAS and Stata (read.dta for the latter). For JMP, though, exporting to CSV is still the way to go, unless somebody has designed something that I am not aware of.

One interesting thing is that a new package came out recently that lets you run SAS scripts (the commands, not the data) in R. Joao, did you test it?
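
As a rough sketch of reading a Stata file with foreign: the example below writes a small invented data frame to a temporary .dta file first so that it is self-contained. The data and the SAS file path are placeholders, not part of this project:

```r
library(foreign)  # recommended package, included with standard R installs

# Hypothetical toy data, written out as a Stata file so there is
# something to read back.
toy <- data.frame(id = 1:3, age = c(54, 61, 47))
dta_path <- tempfile(fileext = ".dta")
write.dta(toy, dta_path)

# Read the Stata file into a data frame.
from_stata <- read.dta(dta_path)
from_stata

# SAS transport (XPORT) files are read with read.xport(). The path below
# is a placeholder, so the call is left commented out.
# from_sas <- read.xport("path/to/file.xpt")
```

foreign also has read.ssd() for native SAS data sets, but it needs a local SAS installation to do the conversion.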


rpietro commented 11 years ago

Not yet, I have scheduled time to try it on Friday. I'll get back to you with comments.


Joao Ricardo N. Vissoci