Open mworni opened 12 years ago
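could you list the variables i should work with? i imagine they must be:
- death
- death date
- age (continuous)
- emergency_surgery
just so that you know, my plan of attack is:
- slice the data using ddply
- aggregate, summing the total number of dead patients across the different categories
- plot them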
Not completely sure I understand what you are asking. Please check the part of the script labelled Table 1: Demographics. Almost all the important variables are listed there, and they are also used later in the analysis.
- death: [postopdeath]
- death date: [dayofdeath] ... there are a ton of missing values for those who didn't die; the number of missing values is exactly the number of patients who didn't die according to postopdeath
- age: [numage]; what John uses is 10-year age categories (decades): [numage10yrcatstart39]
- emergency surgery: everybody had emergency surgery, for 7 different diagnoses [diagnosis]
please let me know if you need more information
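The claim that the missing death dates line up exactly with the survivors can be checked directly. What follows is only a minimal sketch: it assumes postopdeath is coded 0/1 and dayofdeath is NA for survivors, and the toy data frame is made up for illustration (it is not NSQIP data).

```r
# Toy stand-in for nsqip.data (hypothetical values, not real NSQIP records)
toy <- data.frame(
  postopdeath = c(0, 1, 0, 0, 1, 0),
  dayofdeath  = c(NA, 12, NA, NA, 3, NA)
)

n_missing   <- sum(is.na(toy$dayofdeath))  # patients without a death date
n_survivors <- sum(toy$postopdeath == 0)   # patients recorded as alive

# Cross-tabulate death status against missingness of the death date;
# every survivor should fall in the TRUE column and every death in FALSE
check <- table(died = toy$postopdeath, date_missing = is.na(toy$dayofdeath))
print(check)

# TRUE when the two counts agree
print(n_missing == n_survivors)
```

On the real data the same two lines would use nsqip.data$dayofdeath and nsqip.data$postopdeath; any count in the off-diagonal cells of the table would point to records worth inspecting.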
On Thu, Jun 28, 2012 at 9:17 PM, Ricardo Pietrobon <reply@reply.github.com> wrote:
could you list the variables i should work with? i imagine they must be:
- death
- death date
- age (continuous)
- emergency_surgery
just so that you know, my plan of attack is:
- slice the data using ddply
- aggregate having sums of total number of dead patients across the different categories
- plot them
Reply to this email directly or view it on GitHub:
https://github.com/rpietro/NSQIPageComplications/issues/16#issuecomment-6638882
i wrote this really quickly and so there is a lot of refinement to be done here, but code is at the bottom. I would say that next steps are:
library(plyr)
library(ggplot2)

# Number of patients per operation year and age group
trends <- count(nsqip.data, c("yearoperation", "age_groups"))
names(trends)  # yearoperation, age_groups, freq
ggplot(trends, aes(yearoperation, freq, colour = age_groups)) +
  geom_point(position = "jitter") +
  geom_smooth() +
  xlab("Years") + ylab("Number of patients")

# Number of deaths per operation year and age group.
# Assumes postopdeath is coded 0/1; as.integer() on a factor returns level codes instead.
nsqip.data$postopdeath <- as.integer(nsqip.data$postopdeath)
death.trends <- ddply(nsqip.data, c("yearoperation", "age_groups"), summarise, tot = sum(postopdeath))
names(death.trends)
ggplot(death.trends, aes(yearoperation, tot, colour = age_groups)) +
  geom_point(position = "jitter") +
  geom_smooth() +
  xlab("Years") + ylab("Number of deaths")

# Same plot with a linear fit, saved to disk (ggsave takes the file name first)
plot1 <- ggplot(death.trends, aes(yearoperation, tot, colour = age_groups)) +
  geom_point(position = "jitter") +
  geom_smooth(method = "lm") +
  xlab("Years") + ylab("Number of deaths")
ggsave("deathsOverTime.jpg", plot = plot1)
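Since nsqip.data is not available outside the project, the ddply aggregation can be tried on a small simulated data set first. This is only a sketch: the column names follow the ones used in this thread, but the values and the age-group labels are invented, and postopdeath is assumed to be coded 0/1.

```r
library(plyr)

# Simulated stand-in for nsqip.data (invented values and age-group labels)
toy <- data.frame(
  yearoperation = c(2005, 2005, 2005, 2006, 2006, 2006, 2006),
  age_groups    = c("40-49", "40-49", "50-59", "40-49", "50-59", "50-59", "50-59"),
  postopdeath   = c(0, 1, 1, 0, 1, 0, 1)
)

# One row per year and age group: number of patients, number of deaths,
# and the proportion of patients who died
toy.trends <- ddply(toy, c("yearoperation", "age_groups"), summarise,
                    n    = length(postopdeath),
                    tot  = sum(postopdeath),
                    rate = mean(postopdeath))
print(toy.trends)
```

Adding the group size and a death rate next to the raw count is a suggestion rather than part of the original plan: raw counts tend to rise with the number of operations recorded per year, so a rate may be easier to compare across years.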
This is point two of John's email, where you planned to do some graphical exploratory analysis...