USCCANA / netdiffuseR

netdiffuseR: Analysis of Diffusion and Contagion Processes on Networks
https://USCCANA.github.io/netdiffuseR

Issues in Medical Innovation data #33

Open gvegayon opened 2 years ago

gvegayon commented 2 years ago

Reported by Christophe Van den Bulte

Obs 1:

Hi Tom,

I am afraid there is a ‘mistake’ in the Medical Innovation vignette application https://cran.r-project.org/web/packages/netdiffuseR/vignettes/analyzing-medical-innovation-data.html

The 16 physicians with adoption time = 18 actually do not adopt in month 18. Instead, the value “18” is used to identify physicians who still had not adopted by the end of month 17, i.e., physicians who are right-censored at time 17.

This ‘mistake’ will not affect the finding whether or not there is contagion: the coefficient of the period 18 dummy is essentially infinity, making the other coefficients insensitive to those 16 physician-month observations.

Even so, I think it would be better to re-do that analysis excluding month 18 and those 16 person-months from the data because …

(1) It does not ‘look right’, and

(2) without the period 18 dummy, the evidence of contagion would be inflated.

Christophe
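
To make the recoding in Obs 1 concrete, here is a minimal sketch in plain Python (not netdiffuseR code). It assumes adoption times are stored as integers in which the value 18 flags "had not adopted by the end of month 17"; the names `CENSOR_CODE`, `LAST_PERIOD`, and `recode_adoption_times` are hypothetical and chosen only for illustration.

```python
# Hypothetical constants: 18 is the flag for "not adopted by end of month 17".
CENSOR_CODE = 18
LAST_PERIOD = 17


def recode_adoption_times(toa):
    """Turn raw adoption times into (time, adopted) pairs.

    A raw value equal to CENSOR_CODE becomes (LAST_PERIOD, False), i.e. the
    physician is right-censored at month 17 rather than an adopter in month 18.
    """
    recoded = []
    for t in toa:
        if t == CENSOR_CODE:
            recoded.append((LAST_PERIOD, False))
        else:
            recoded.append((t, True))
    return recoded


# Toy data, not the Medical Innovation data.
toa = [1, 5, 18, 17, 18]
recoded = recode_adoption_times(toa)
```

Note that a physician who really adopted in month 17 and one who is censored at month 17 end up with the same time but a different `adopted` flag, which is the distinction the vignette currently loses.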

Obs 2:

For the hazard analyses …

  1. One must keep the non-adopters in the data. Not doing so creates a truncation bias and spurious contagion in a hazard model. https://www.jstor.org/stable/23011998

  2. But one must take into account that they are non-adopters. Given the way the data are currently (and correctly) set up as “pseudo-panel” for the discrete-time hazard model, that simply means (i) deleting all periods where time > 17 from the data prior to estimation and (ii) deleting the “period 18” dummy from the model since it’s now always 0.
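
The two points above can be sketched as a person-period ("pseudo-panel") expansion. This is an illustrative plain-Python sketch under the assumption that 18 is the censoring flag and 17 is the last observed month; `person_periods` and the column names `id`, `period`, and `adopt` are hypothetical, not part of netdiffuseR.

```python
def person_periods(toa, censor_code=18, last_period=17):
    """Expand adoption times into one row per physician-month at risk.

    Non-adopters (raw value == censor_code) are kept in the data and
    contribute rows for months 1..last_period, all with adopt = 0.
    No row with period > last_period is ever created, so a "period 18"
    dummy would be identically zero and should not enter the model.
    """
    rows = []
    for pid, t in enumerate(toa):
        adopted = t != censor_code
        end = t if adopted else last_period
        for period in range(1, end + 1):
            rows.append({
                "id": pid,
                "period": period,
                "adopt": int(adopted and period == t),
            })
    return rows


# Toy data: one adopter in month 2 and one non-adopter.
rows = person_periods([2, 18])
```

With this layout the discrete-time hazard model is fit on `rows` with period dummies for months 1 through 17 only.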

It is not clear to me how to handle censored observations (non-adopters) in the “threshold” analyses, but what I describe above is correct for the cumulative adoption plot, the hazard rate plot, and the hazard model.
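
For the cumulative adoption and hazard rate plots, the same treatment of censoring might look like the sketch below. It is plain Python with hypothetical names (`hazard_and_cumulative`), it assumes 18 is the censoring flag, and it uses the standard discrete-time definitions: the hazard in month t is the number of adopters in t divided by the number still at risk at the start of t, and cumulative adoption is divided by all physicians, non-adopters included.

```python
def hazard_and_cumulative(toa, censor_code=18, last_period=17):
    """Per-month hazard rate and cumulative adoption share for months 1..last_period.

    Physicians coded censor_code never count as adopters, but they stay in
    the risk set and in the denominator of the cumulative share.
    """
    n = len(toa)
    at_risk = n
    cum_adopters = 0
    hazard, cumulative = [], []
    for period in range(1, last_period + 1):
        # period never reaches censor_code, so censored cases are never counted here.
        adopters = sum(1 for t in toa if t == period and t != censor_code)
        hazard.append(adopters / at_risk if at_risk else 0.0)
        cum_adopters += adopters
        cumulative.append(cum_adopters / n)
        at_risk -= adopters
    return hazard, cumulative


# Toy data: two adopters in month 1, one in month 2, one non-adopter.
hazard, cumulative = hazard_and_cumulative([1, 1, 2, 18])
```

In this toy example the cumulative curve stops at 0.75 in month 17 instead of jumping to 1.0 in a spurious month 18.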