therneau / survival

Survival package for R

Functionality for Fast Univariate Tests of Tabular Input #227

Closed: DarioS closed this issue 1 year ago

DarioS commented 1 year ago

Similar to rowFtests and colFtests in genefilter, could survival provide a function that does fast computation in C or C++ over either the rows or the columns of tabular data, to allow univariate selection of features associated with survival? It could be called something like colCoxTests.

therneau commented 1 year ago

Could you be a little clearer about what exactly you want? (Most of the serious computation in the survival package is already in C, by the way).

My guess is that you want to test the significance of gene 'x' for survival, say after adjusting for age. There is a fast approximation for this: the sum of x * martingale residuals, with the residuals taken from the model that does not yet include 'x', is the numerator of the score test for the addition of 'x' to the regression, so you can very easily rank the variables. The variance is a bit harder.
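
As an illustration of that ranking, here is a minimal sketch in R on simulated data. It assumes a null model that adjusts only for age and a numeric matrix with one column per candidate feature. The data and the names X, age, fit0, mres and score_num are made up for this example; only coxph, Surv and residuals come from the survival package. It computes the score-test numerators only, not their variances, so the raw values are comparable across features only when the columns are on a similar scale (here every column is standard normal).

```r
library(survival)

set.seed(1)
n <- 200   # subjects (simulated, purely illustrative)
p <- 50    # candidate features, e.g. genes

age <- rnorm(n, mean = 60, sd = 10)
X <- matrix(rnorm(n * p), nrow = n, ncol = p,
            dimnames = list(NULL, paste0("gene", seq_len(p))))

# Simulate censored survival times in which only gene1 has a real effect
rate   <- exp(0.02 * (age - 60) + 0.7 * X[, "gene1"]) / 10
event  <- rexp(n, rate = rate)
cens   <- rexp(n, rate = 0.05)
status <- as.integer(event <= cens)
time   <- pmin(event, cens)

# Null model: adjusts for age, contains none of the candidate features
fit0 <- coxph(Surv(time, status) ~ age)

# Martingale residuals from the null model, one per subject
mres <- residuals(fit0, type = "martingale")

# Score-test numerator for every column at once: sum_i X[i, j] * mres[i]
score_num <- drop(crossprod(X, mres))

# Rank features by the absolute value of the numerator
ranked <- sort(abs(score_num), decreasing = TRUE)
head(ranked)
```

Because the null model is fitted once and each feature costs a single inner product, this scales to thousands of columns; a full score test would additionally need the variance term that the comment above describes as harder.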

If I am correct, I'm not sure this belongs in the survival package; it might more logically be placed into genefilter.

DarioS commented 1 year ago

Thanks for the suggestion. Yes, fast computation on thousands of variables, one-at-a-time. I agree that it might be better suited to genefilter or a similar package and I will discuss it with the package developer there.