yrobink / SBCK

dOTC: Temporal evolution different than in the paper? #8

Closed coxipi closed 1 month ago

coxipi commented 2 months ago

Hi! We have been working on an implementation of dOTC in xclim. The wrapper of SBCK in xclim works well, but we wanted to experiment in implementing more multivariate methods directly in xclim.

In SBCK, the motion is computed between adjusted time series, motion = yX1 - yX0, whereas the paper introducing dOTC proposes to pick cell values $c_i, c_k$ in a probabilistic way, using the plans between X0,Y0 and X0,X1, and to compute the motion as their difference.

I understand that yX0 and yX1 are obtained by using those aforementioned plans. But I still have the impression that by computing the motion through the time series, we can compare points that are very far apart and that would not be compared in the method proposed in the paper, since the transport plans would not allow choosing cells $c_i, c_k$ that are so far apart.

Is the implementation in the SBCK repo somewhat equivalent to the description in the paper and I'm not seeing it, or not?

yrobink commented 1 month ago

Hello, first of all thank you for your interest and for wanting to port dOTC to xclim.

In fact, we need to go back to what yX0 and yX1 contain. We can see that they are defined through transport plans:

yX0 = otcY0X0.predict(Y0)
yX1 = otcX0X1.predict(yX0)

Here, yX0 and yX1 are now centers of cells $c_i$ and $c_k$, and they follow the laws $P_{X^0}$ and $P_{X^1}$ (because they are images by transport plans). The advantage of using the time series of $Y_0$ as the base is:

  1. The cells $c_i$ and $c_k$ are aligned with Y0, so Y1 can be defined directly as Y0+motion (within a factor).
  2. The weights of $c_i$ and $c_k$ are integrated into the time series (and the $c_k$ are obtained through OTC in a probabilistic manner), which must respect the departure and arrival weights.
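To make the construction above concrete, here is a minimal 1D sketch. It is not SBCK's code: `fit_rank_map` is a hypothetical helper that stands in for a fitted OTC object by using the sorted (monotone) matching, which is the optimal plan in 1D for equal-size samples with a convex cost. SBCK's OTC works on binned cells and draws from the plan at random, so this only illustrates how yX0 and yX1 stay aligned with the time axis of Y0.

```python
import bisect
import random


def fit_rank_map(src, dst):
    """Hypothetical toy stand-in for a fitted OT plan (assumption, not SBCK).

    For equal-size 1D samples, the i-th smallest value of `src` is matched
    to the i-th smallest value of `dst`.
    """
    src_sorted = sorted(src)
    dst_sorted = sorted(dst)

    def predict(values):
        out = []
        for v in values:
            # Rank of v among the source points, clamped to a valid index.
            idx = min(bisect.bisect_left(src_sorted, v), len(src_sorted) - 1)
            out.append(dst_sorted[idx])
        return out

    return predict


random.seed(0)
n = 200
Y0 = [random.gauss(0.0, 1.0) for _ in range(n)]  # reference, calibration period
X0 = [random.gauss(1.0, 2.0) for _ in range(n)]  # model, calibration period
X1 = [random.gauss(3.0, 2.0) for _ in range(n)]  # model, projection period

predict_Y0X0 = fit_rank_map(Y0, X0)  # plays the role of otcY0X0.predict
predict_X0X1 = fit_rank_map(X0, X1)  # plays the role of otcX0X1.predict

yX0 = predict_Y0X0(Y0)    # follows the law of X0, indexed by the time steps of Y0
yX1 = predict_X0X1(yX0)   # follows the law of X1, same time index
motion = [b - a for a, b in zip(yX0, yX1)]  # one displacement per time step of Y0
```

In this toy setting each `motion[t]` is the difference between two points that the X0 -> X1 plan actually pairs, so the subtraction along the time axis never compares points that the plan would not link.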

Transport plans can send nearby points to distant points (or vice versa). In such cases, the problem comes from the estimation of the departure and arrival probability measures, which may not have enough points to be estimated well.

I hope I've answered your questions,

Yoann

coxipi commented 1 month ago

Ok I see it now. The point I was missing is that in the conversions Y0 -> yX0 -> yX1, the fitted OTC is applied (.predict) on Y0 and yX0, and yX0 inherits the time structure of Y0, and the same is true for yX1. Thanks for this explanation! Have a good day.

Éric