mmaelicke / scikit-gstat

Geostatistical variogram estimation expansion in the scipy style
https://mmaelicke.github.io/scikit-gstat/
MIT License
223 stars 53 forks source link

How to plot spatial marginal variogram #156

Open WonmoKoo opened 1 year ago

WonmoKoo commented 1 year ago

Hello,

I have a question about plotting spatial marginal variogram. I want to plot the spatial variogram of temporal lag 0.

However, It seems that "SpaceTimeVariogram.plot(kind='marginals')" may generate the variogram with all point pairs in dataset, regardless of the temporal lag of each pair, because "SpaceTimeVariogram.create_XMarginal" creates an instance of Variogram which does not consider the time index of data points. image

Then, I think that the generated variogram by "SpaceTimeVariogram.plot(kind='marginals')" would be inconsistent with its explanation. image

I would like to ask if my understanding is correct and if there is a way to obtain a spatial marginal variogram (i.e., the spatial variogram of temporal lag 0).

Sincerely, Wonmo Koo

mmaelicke commented 1 year ago

Hi Wonmo,

If I got your question correct, you are right, there is a bit of confusion, but the docstring of create_XMarginal should be more descriptive. Consider this line:

https://github.com/mmaelicke/scikit-gstat/blob/f44de8b860cc45739a0a450d8b43569331722580/skgstat/SpaceTimeVariogram.py#L657

You can see, that the coordinates are stacked on top of each other as often as the amount of time steps. And directly in the next line:

https://github.com/mmaelicke/scikit-gstat/blob/f44de8b860cc45739a0a450d8b43569331722580/skgstat/SpaceTimeVariogram.py#L658

the transposed, flattened values are used. Thus, all observations from all timestamps on the time axis are used, which effectively means time lag is 0. Hence, this marginal variogram uses all observations at all spatial lags for the time lag 0. Note that this is different from the spatial variogram at timestep 0.

Please also be aware, that I am planning on refactoring the whole class, and implemented another (possibly additional) interface for specifying the two dimension axes in a more expressive way, as this caused a lot of confusion. That would also involve a method to explicitly generate marginal variogram for any kind of specified lag and would also work for space-time cross-variograms. I am just not yet sure, when I can start with that.

I hope that answers your question. Best,

Mirko

WonmoKoo commented 1 year ago

Thank you for your response!

However, I can't understand the sentence "Thus, all observations from all timestamps on the time axis are used, which effectively means time lag is 0." Even if we use all observations from all timestamps, wouldn't pairs of two data points observed at different timestamps still have a time lag that is not 0? Does the sentence mean considering all observations from all timestamps as if they were collected at the same timestamp?

I'm curious about your thoughts on how the marginal experimental spatial semivariogram should be calculated. I think it should be calculated as follows: image where b is the bin index and \delta is the bin width.

Sincerely, Wonmo Koo

mmaelicke commented 1 year ago

hmm. That makes sense. I have to admit, that I did the implementation (literally) years ago and cannot remember the details exactly. I did not use the class myself, eversince, so a wrong implementation is possible. I will have to look into this in way more detail, because I can remember that I spent a lot of time on that part of the implementation.

Sorry for the inconvenience. I'll come back to you after I checked the implementation more carefully.

WonmoKoo commented 1 year ago

Thank you very much!!