PoonLab / Kaphi

Kernel-embedded ABC-SMC for phylodynamic inference
GNU Affero General Public License v3.0
4 stars 2 forks source link

Linear interpolation for not-yet-sampled lineages #28

Closed ArtPoon closed 7 years ago

ArtPoon commented 7 years ago

The functionsimcoal.R:solve.A.mx() attempts a numerical solution of the ODE defined for the time differential of the A matrix, which represents the number of lineages per deme (columns) over coalescent (reverse) time (rows).

There seems to be a problem when one or more tips share the same sample height. In the following example, there are 10 tips (sampled lineages), of which 5 were sampled at height 0 and 5 were sampled at height 5.

The numerical solution for A seems to make sense:

> odA[,2]
  [1] 10.0000000  9.9990524  9.9980434  9.9969767  9.9958335  9.9946077  9.9912967
  [8]  9.9857463  9.9798173  9.9734765  9.9666921  9.9594249  9.9515829  9.9431616
 [15]  9.9341489  9.9244512  9.9140338  9.9028100  9.8907590  9.8778023  9.8638435
 [22]  9.8488504  9.8327174  9.8152404  9.7964088  9.7760982  9.7541963  9.7306515
 [29]  9.7053315  9.6780195  9.6486798  9.6171166  9.5831978  9.5467410  9.5072452
 [36]  9.4647858  9.4191915  9.3703111  9.3178661  9.2616768  9.2015377  9.1372304
 [43]  9.0685395  8.9952256  8.9170925  8.8331706  8.7438679  8.6490141  8.5484277
 [50]  8.4419071  8.3293246  8.2105541  8.0855163  7.9541241  7.8164045  7.6723116
 [57]  7.5205391  7.3624673  7.1982683  7.0281699  6.8524534  6.6714669  6.4855980
 [64]  6.2953147  6.1011332  5.9035997  5.7033039  5.4990122  5.2932205  5.0865718
 [71]  4.8797999  4.6736230  4.4687199  4.2657660  4.0654267  3.8683308  3.6750242
 [78]  3.4860542  3.3002343  3.1196935  2.9448147  2.7759163  2.6132292  2.4569670
 [85]  2.3072419  2.1641459  2.0277004  1.8978861  1.7746300  1.6568001  1.5454055
 [92]  1.4402775  1.3412391  1.2480833  1.1606064  1.0785600  1.0017248  0.9298505
 [99]  0.8626970  0.8000171

However, the not.sampled.yet function returns the following values on the same timeline:

> t(sapply(haxis, function(h) c(h, not.sampled.yet(h))))
             [,1] [,2]
  [1,]  0.0000000    8
  [2,]  0.9090909    5
  [3,]  1.8181818    5
  [4,]  2.7272727    5
  [5,]  3.6363636    5
  [6,]  4.5454545    5
  [7,]  5.4545455    0
  [8,]  6.3636364    0
  [9,]  7.2727273    0
 [10,]  8.1818182    0
 [11,]  9.0909091    0
 [12,] 10.0000000    0

Something silly is happening at h=0. The not.sampled.yet function results from the following linear interpolation:

    nsy.index <- approxfun(
        x=sort(jitter(sorted.sample.heights,
            factor=max(1e-6, sorted.sample.heights[length(sorted.sample.heights)]/1e6))),
        y=1:n, method='constant', rule=2
    )
    not.sampled.yet <- function(h) {
        cumul.sns[nsy.index(h), ]
    }