mworni commented 12 years ago

Emergency surgery in older patients accounts for a large proportion of surgical deaths, and therefore represents an optimal target for performance improvement initiatives designed to have the maximum possible effect on reducing surgical mortality (will add this data)

This is point two of John's email, where you thought of doing some graphical exploratory analysis...

rpietro commented 12 years ago

could you list the variables i should work with? i imagine they must be:

death
death date
age (continuous)
emergency_surgery

just so that you know, my plan of attack is:

slice the data using ddply
aggregate to get the total number of dead patients across the different categories
plot them

mworni commented 12 years ago

Not completely sure I understand what you are asking. Please check the part of the script with Table 1: Demographics. Almost all the important variables are listed there, including those used later in the analysis.

death: [postopdeath]
death date: [dayofdeath] ... there are a ton of missing values for those who didn't die; the number of missing values exactly matches the number of patients who didn't die according to postopdeath
age: [numage]; what John uses is 10-year age categories: [numage10yrcatstart39]
emergency surgery: everybody had emergency surgery, for one of 7 different diagnoses [diagnosis]

please let me know if you need more information

Sent by email in reply to Ricardo Pietrobon's message of Thu, Jun 28, 2012 at 9:17 PM.

rpietro commented 12 years ago

i wrote this really quickly and so there is a lot of refinement to be done here, but code is at the bottom. I would say that next steps are:

fix the age categorization
create better indices, such as the percentage of deaths within each category
apply this to other complications
adjust the regression lines -- for simplicity i used lm for a count variable, but we can fix that later

library(plyr)
library(ggplot2)

# Number of patients per operation year and age group (one row per patient)
trends <- count(nsqip.data, c("yearoperation", "age_groups"))
View(trends)
names(trends)

ggplot(trends, aes(yearoperation, freq, colour = age_groups)) +
  geom_point(position = "jitter") +
  geom_smooth() +
  xlab("Years") +
  ylab("Number of patients")

# Number of deaths per operation year and age group.
# as.integer() assumes postopdeath is stored as 0/1; for a factor it returns
# the level codes instead, so recode first in that case.
nsqip.data$postopdeath <- as.integer(nsqip.data$postopdeath)
death.trends <- ddply(nsqip.data, c("yearoperation", "age_groups"), summarise, tot = sum(postopdeath))
View(death.trends)
names(death.trends)

ggplot(death.trends, aes(yearoperation, tot, colour = age_groups)) +
  geom_point(position = "jitter") +
  geom_smooth() +
  xlab("Years") +
  ylab("Number of deaths")

# Same plot with straight-line (lm) fits, saved to disk
plot1 <- ggplot(death.trends, aes(yearoperation, tot, colour = age_groups)) +
  geom_point(position = "jitter") +
  geom_smooth(method = "lm") +
  xlab("Years") +
  ylab("Number of deaths")
ggsave("deathsOverTime.jpg", plot = plot1)
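A minimal sketch of two of the next steps listed above (percentage of deaths within each category, and a count model in place of lm), run on made-up toy data rather than nsqip.data. The toy data frame, its year range and age-group labels, and the names death.rates and fit are assumptions for illustration only; a Poisson model with an offset is one common option for counts, not necessarily the one that will be used here.

```r
library(plyr)

# Toy stand-in for nsqip.data: values are simulated, not real NSQIP data.
set.seed(1)
n.toy <- 2000
toy <- data.frame(
  yearoperation = sample(2005:2010, n.toy, replace = TRUE),
  age_groups    = factor(sample(c("40-49", "50-59", "60-69", "70+"),
                                n.toy, replace = TRUE)),
  postopdeath   = rbinom(n.toy, 1, 0.1)   # 1 = died, 0 = survived
)

# Deaths, patients, and percentage of deaths per year and age group
death.rates <- ddply(toy, c("yearoperation", "age_groups"), summarise,
                     deaths = sum(postopdeath),
                     n      = length(postopdeath),
                     pct    = 100 * mean(postopdeath))

# Poisson regression for the death counts; offset(log(n)) turns it into
# a model of the death rate rather than the raw count.
fit <- glm(deaths ~ yearoperation + age_groups + offset(log(n)),
           family = poisson, data = death.rates)
summary(fit)
```

Plotting pct instead of tot in the ggplot calls above would then show the death rate per group rather than the raw number of deaths, which is easier to compare when the groups differ in size.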

Sent by email in reply to mworni's message of Thu, Jun 28, 2012 at 5:58 PM.

rpietro / NSQIPageComplications

Graphical exploratory analysis... #16
