JuliaSurv / NetSurvival.jl

A pure-Julia take on standard net survival routines
https://juliasurv.github.io/NetSurvival.jl/
MIT License
6 stars 0 forks source link

Nessie #47

Closed rimhajal closed 2 months ago

rimhajal commented 2 months ago

interface needs fixing

the results aren't exactly what i'm looking for but i have been stuck on this ... is there anything that sticks out to you on quick glance?

rimhajal commented 2 months ago

julia> Nessie(@formula(Surv(time,status)~sex), colrec, colrec.sex, slopop) ([3288.648962464143 2145.609944014756 … 55.901666088940075 15.974814139780039; 2681.6275632951492 1772.6068547689429 … 71.82278361787326 13.957914916445606], [13.004323842641433, 25.295201986907866])

nessie(Surv(time, stat) ~ sex, data = colrec, ratetable = slopop, rmap = list(age = age, sex = sex, year = diag)) sex=1 3289 3134.0 2983.4 2836.5 2693.6 2555.6 2421.9 2291.2 2162.7 2038.0 1920.2 1808.5 1702.9 1603.4 1510.1 1422.9 1339.3 1258.1 1179.1 1102.9 1028.4 954.6 881.8 15.2 sex=2 2682 2579.5 2479.2 2380.6 2283.4 2188.0 2093.2 1998.0 1902.0 1806.7 1714.7 1625.9 1540.8 1460.0 1383.7 1312.1 1243.2 1175.6 1109.6 1045.3 981.9 918.6 855.9 17.0

from the first year, there's a big difference between the two. Maybe the time loops are wrong?

lrnv commented 2 months ago

So why is rate_pred passed as an argument? It should not be necessary right ? The other fit function extracts it from the rate table directly IIRC

lrnv commented 2 months ago

Haaaa i think i got the solution: we let the run last until the last possible moment in the grid, that is until the last death, but we should let the loop run until 120 years, that is the end of the rate table. I think we should try that and if it works write the tests correctly. Then, our loop looks really inneficient but maybe it's RateTables.jl's job to provide the expected remaining life time for a given person directly...

lrnv commented 2 months ago

@rimhajal With the last push to the PR on RateTables.jl, it seems ot work and to produce the right numbers:

julia> x = nessie(@formula(Surv(time,status)~sex), colrec, slopop)
(2×2 DataFrame
 Row │ sex     expected_life_time 
     │ Symbol  Float64
─────┼────────────────────────────
   1 │ male               15.2094
   2 │ female             16.9782, 2×3 DataFrame
 Row │ sex     grid                               expected_sample_size
     │ Symbol  Array…                             Array…
─────┼──────────────────────────────────────────────────────────────────────────────
   1 │ male    [0.0, 365.241, 730.482, 1095.72,…  [3289.0, 3133.96, 2983.45, 2836.…
   2 │ female  [0.0, 365.241, 730.482, 1095.72,…  [2682.0, 2579.52, 2479.22, 2380.…)

julia> x[2][1,3]
23-element Vector{Float64}:
 3289.0
 3133.9552625257443
 2983.4502704649744
 2836.515553931062
 2693.622660886542
 2555.616023838256
 2421.939620989653
 2291.2277732240186
 2162.694193625734
 2037.9850260901428
 1920.250036204659
 1808.576030194914
 1702.9475638533381
 1603.425672216488
 1510.101536779181
 1422.9472561463979
 1339.3066605391687
 1258.1118851357537
 1179.1749681302872
 1102.907180537947
 1028.4768917062268
  954.4257626902065
  880.9459393878692

julia> x[2][2,3]
23-element Vector{Float64}:
 2682.0
 2579.5174220050385
 2479.221340174898
 2380.6440942810373
 2283.3771131983344
 2188.0359575554344
 2093.185295200068
 1997.9696204292184
 1901.9741246085762
 1806.7192434040187
 1714.7134710849537
 1625.9460915890554
 1540.8006350160022
 1460.0773507047522
 1383.750974088214
 1312.176526701044
 1243.2187676161443
 1175.6407898073405
 1109.623404759267
 1045.3104332404512
  981.9703553907678
  918.4269851710649
  855.1298088300433

julia> 

The issue was that, for a person aged 20 in 2010, we considered that its life ended in 2020 (at age 30) since the life table did not go further...

Next steps:

Can you do it ?

lrnv commented 2 months ago

@rimhajal Well done. So after this : https://github.com/JuliaRegistries/General/pull/106960 gets merged into the general registry, in this branch we need to up the compat to of RateTables to 0.1.1 (in project.toml) and then re-run the tests to see if it works.

The docs you wrote are OK.

But this lacks a test to compare to R still, right ?

rimhajal commented 2 months ago

Yes there is still the test

lrnv commented 2 months ago

Ok I let you bump the compat and write the test then.

Edit: Maybe we could merge #44 and rebase this PR on the main branch before writing the test, to write it coherently with the new tests.

rimhajal commented 2 months ago

Is it possible to rebase with the conflicts or will it make it worse?

lrnv commented 2 months ago

@rimhajal just sent you a zoom link by mail let's do it together

rimhajal commented 2 months ago

i don't understand why it's a dataframe and the test is erroring because of it

lrnv commented 2 months ago

@rimhajal extracting everything from the object is a bit more easy to avoid the dataframes. But I am still not convinced that this output dataframes is the best thing we can provide, maybe we just need to provide a list.... Not sure