Open alexploner opened 1 year ago
@alexploner :
Belatedly, the reason that the two models are different is that I thought it was a good idea to include the intercept term in the baseline time function. For rstpm2::stpm2, I used intercept=FALSE for the splines, whereas I use intercept=TRUE for aft().
One consequence is that the factors do not know about the intercept and that we get two intercepts. Interestingly, the model is full rank (does this make sense?), but we get strong correlations:
library(rstpm2)
summary(fit1 <- aft(Surv(rectime,censrec==1)~hormon,data=rstpm2::brcancer,df=4))
summary(fit2 <- aft(Surv(rectime,censrec==1)~factor(hormon),data=rstpm2::brcancer,df=4))
vcov(fit1) |> cov2cor() |> "rownames<-"(NULL) |> "colnames<-"(NULL)
vcov(fit2) |> cov2cor() |> "rownames<-"(NULL) |> "colnames<-"(NULL)
> vcov(fit1) |> cov2cor() |> "rownames<-"(NULL) |> "colnames<-"(NULL)
            [,1]        [,2]       [,3]       [,4]       [,5]
[1,]  1.00000000  0.03381084 -0.2649664  0.4061007 -0.2745472
[2,]  0.03381084  1.00000000  0.6129103 -0.6198557  0.6962407
[3,] -0.26496641  0.61291030  1.0000000 -0.8911680  0.8973204
[4,]  0.40610066 -0.61985566 -0.8911680  1.0000000 -0.9320684
[5,] -0.27454721  0.69624071  0.8973204 -0.9320684  1.0000000
> vcov(fit2) |> cov2cor() |> "rownames<-"(NULL) |> "colnames<-"(NULL)
           [,1]       [,2]       [,3]        [,4]       [,5]       [,6]
[1,]  1.0000000  0.8048079 0.43799608 -0.69524806  0.8652693 -0.5323502
[2,]  0.8048079  1.0000000 0.33004594 -0.66902577  0.8314087 -0.5573719
[3,]  0.4379961  0.3300459 1.00000000  0.03054923  0.1271566  0.2735835
[4,] -0.6952481 -0.6690258 0.03054923  1.00000000 -0.8977969  0.8145789
[5,]  0.8652693  0.8314087 0.12715658 -0.89779689  1.0000000 -0.8025665
[6,] -0.5323502 -0.5573719 0.27358345  0.81457893 -0.8025665  1.0000000
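As a side note on the full-rank question, here is a minimal sketch using splines::ns from base R as a stand-in for the nsx() basis (an assumption; the two are not identical). In an ordinary linear predictor, a constant column placed next to a spline basis built with intercept=TRUE is rank deficient, because the constant already lies in the span of that basis. Whether the same argument applies to aft() depends on how the covariates enter the baseline time function, which is not checked here.

```r
library(splines)

# Illustrative covariate (hypothetical data, not brcancer).
x <- seq(0, 1, length.out = 50)

# Natural spline bases with and without the constant in their span.
# Both have 4 columns because df = 4 in each call.
B_with    <- ns(x, df = 4, intercept = TRUE)
B_without <- ns(x, df = 4, intercept = FALSE)

# Add an explicit intercept column and compare the numerical ranks.
rank_with    <- qr(cbind(1, B_with))$rank
rank_without <- qr(cbind(1, B_without))$rank

print(c(columns = 5, rank_with = rank_with, rank_without = rank_without))
```

If the two ranks differ, the explicit intercept is redundant only in the intercept=TRUE case, which is the linear-model analogue of the "two intercepts" situation described above.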
@bakynkozhayev : any idea whether removing the intercept from the AFT baseline will affect your AFT paper?
Sincerely, Mark.
Running the default example from the aft help page yields an estimate for the numerical binary exposure hormon as expected, with the intercept seemingly integrated into the baseline spline term nsx():

However, if I include the hormon variable as a factor, I get contrast coding with two df / parameters for hormonal exposure:

I don't know whether this is intended, but it is not what I would expect when using a standard formula interface in an R modelling function.
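A possible explanation for the two parameters, sketched with model.matrix() from base R rather than with the aft() internals: if the intercept is dropped from the covariate formula (an assumption about what aft() does, not confirmed from its source), R switches a factor to full indicator coding, with one column per level. The toy data frame d is hypothetical.

```r
# Hypothetical toy data with a binary 0/1 exposure.
d <- data.frame(hormon = c(0, 1, 0, 1))

# Numeric exposure, intercept removed: a single column.
cols_numeric <- colnames(model.matrix(~ hormon - 1, data = d))

# Factor exposure, intercept removed: one indicator column per level.
cols_factor <- colnames(model.matrix(~ factor(hormon) - 1, data = d))

# Factor exposure with the usual intercept: treatment contrast, one slope.
cols_default <- colnames(model.matrix(~ factor(hormon), data = d))

print(cols_numeric)
print(cols_factor)
print(cols_default)
```

Under that assumption, the factor version carries one more column than the numeric version, which would be consistent with the 5 versus 6 parameters seen in the two aft fits.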
Comments:
aft-code below, seems to suffer from copy & paste:

System info: