rpietro / NSQIPageComplications

Analysis of surgical complications using the NSQIP data set

Graphical exploratory analysis... #16

Open mworni opened 12 years ago

mworni commented 12 years ago
  1. emergency surgery in older patients accounts for a large proportion of surgical deaths, and therefore represents an optimal target for performance improvement initiatives designed to have the maximum possible effect in reducing surgical mortality (will add this data)

This is point two of John's email, where you suggested doing some graphical exploratory analysis...

rpietro commented 12 years ago

could you list the variables i should work with? i imagine they must be:

death, death date, age (continuous), emergency_surgery

just so that you know, my plan of attack is:

  1. slice the data using ddply
  2. aggregate having sums of total number of dead patients across the different categories
  3. plot them
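
The three steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: the `toy` data frame and its values are made up, and the column names (`yearoperation`, `age_groups`, `postopdeath`) are borrowed from later in this thread rather than checked against the real data set.

```r
library(plyr)
library(ggplot2)

# Hypothetical toy data standing in for nsqip.data; all values are made up.
toy <- data.frame(
  yearoperation = rep(2005:2008, each = 6),
  age_groups    = rep(c("40-49", "50-59", "60-69"), times = 8),
  postopdeath   = c(0, 0, 1, 0, 1, 1,
                    0, 1, 1, 0, 0, 1,
                    0, 0, 0, 1, 1, 1,
                    0, 0, 1, 0, 1, 1)
)

# Steps 1 and 2: slice by year and age group, then sum deaths per slice
death.trends <- ddply(toy, c("yearoperation", "age_groups"), summarise,
                      tot = sum(postopdeath))

# Step 3: plot the totals per slice
p <- ggplot(death.trends, aes(yearoperation, tot, colour = age_groups)) +
  geom_point() +
  xlab("Years") +
  ylab("Number of deaths")
print(p)
```

Summing a 0/1 indicator gives the number of deaths per slice; this only works if `postopdeath` is numeric and coded 0/1, which is an assumption here.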
mworni commented 12 years ago

Not completely sure I understand what you are asking me. Please check the part of the script with Table 1: Demographics. Almost all the important variables are listed there, and they are also used at a later point in the analysis.

death [postopdeath]
death date [dayofdeath] ... there are a ton of missing values for those who didn't die; the number of missing values is exactly the number of patients who didn't die according to postopdeath
age [numage]; what John uses is deciles: [numage10yrcatstart39]
emergency surgery - everybody had emergency surgery for 7 different diagnoses [diagnosis]
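
The claim that the missing death dates line up exactly with the survivors can be checked with a quick cross-tabulation. A minimal sketch in base R, assuming `postopdeath` is coded 0/1 (not confirmed in this thread) and using a made-up `toy` data frame in place of the real data:

```r
# Hypothetical toy rows; values are made up.
toy <- data.frame(
  postopdeath = c(0, 1, 0, 0, 1),
  dayofdeath  = c(NA, 12, NA, NA, 3)
)

# Cross-tabulate death status against missingness of the death date
print(table(died = toy$postopdeath, date_missing = is.na(toy$dayofdeath)))

# TRUE when every survivor, and only survivors, lack a death date
dates_consistent <- all(is.na(toy$dayofdeath) == (toy$postopdeath == 0))
```

If the off-diagonal cells of the table are not zero on the real data, the two variables disagree for some patients and those rows are worth inspecting before plotting.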

please let me know if you need more information


rpietro commented 12 years ago

i wrote this really quickly and so there is a lot of refinement to be done here, but code is at the bottom. I would say that next steps are:

  1. fix the age categorization
  2. create better indices, such as the percentage of deaths within each category
  3. apply this to other complications
  4. adjust the regression lines -- for simplicity i used lm for a count variable, but we can fix that later
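
For items 1 and 2 in the list above, one possible approach is sketched here in base R. Everything in it is illustrative: the `toy` data and the 10-year breaks are made up, and `aggregate` is used in place of `ddply` only to keep the example free of package dependencies.

```r
# Hypothetical toy data; values and the 10-year breaks are made up.
toy <- data.frame(
  numage      = c(42, 47, 55, 58, 63, 66, 71, 78, 84, 88),
  postopdeath = c(0,  0,  0,  1,  0,  1,  1,  0,  1,  1)
)

# Item 1: bin continuous age into 10-year bands [40,50), [50,60), ...
toy$age_groups <- cut(toy$numage, breaks = seq(40, 90, by = 10), right = FALSE)

# Item 2: percentage of deaths within each age band
death.pct <- aggregate(postopdeath ~ age_groups, data = toy,
                       FUN = function(x) 100 * mean(x))
names(death.pct)[2] <- "pct_dead"
```

A percentage is easier to compare across age bands than a raw count, because the bands contain different numbers of patients.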

library(plyr)
library(ggplot2)

# Number of patients per operation year and age group
trends <- count(nsqip.data, c("yearoperation", "age_groups"))

View(trends)
names(trends)

ggplot(trends, aes(yearoperation, freq, colour = age_groups)) +
  geom_point(position = "jitter") +
  geom_point() +
  geom_smooth() +
  xlab("Years") +
  ylab("Number of patients")

# Total deaths per operation year and age group.
# Assumes postopdeath is coded 0/1; as.integer() on a factor returns level codes instead.
nsqip.data$postopdeath <- as.integer(nsqip.data$postopdeath)
death.trends <- ddply(nsqip.data, c("yearoperation", "age_groups"), summarise, tot = sum(postopdeath))

View(death.trends)
names(death.trends)

ggplot(death.trends, aes(yearoperation, tot, colour = age_groups)) +
  geom_point(position = "jitter") +
  geom_point() +
  geom_smooth() +
  xlab("Years") +
  ylab("Number of deaths")

# Same plot with a linear fit, saved to disk (ggsave takes the filename first)
plot1 <- ggplot(death.trends, aes(yearoperation, tot, colour = age_groups)) +
  geom_point(position = "jitter") +
  geom_point() +
  geom_smooth(method = "lm") +
  xlab("Years") +
  ylab("Number of deaths")
ggsave("deathsOverTime.jpg", plot1)
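
On the point about using lm for a count variable: one common alternative is a Poisson regression, sketched below in base R. This is not necessarily the model that will end up in the analysis, and the yearly counts are made up for illustration.

```r
# Hypothetical yearly death counts; values are made up.
death.trends <- data.frame(
  yearoperation = 2005:2010,
  tot           = c(12, 15, 14, 19, 22, 25)
)

# Poisson GLM with a log link: models counts and never predicts negative values
fit <- glm(tot ~ yearoperation, family = poisson, data = death.trends)
print(summary(fit))

# Fitted expected counts on the original (count) scale
death.trends$fitted <- predict(fit, type = "response")
```

In the plots, the equivalent change would probably be `geom_smooth(method = "glm", method.args = list(family = "poisson"))` in place of `geom_smooth(method = "lm")`, assuming a ggplot2 version that supports `method.args`.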
