ocefpaf closed this issue 10 years ago
@rsignell-usgs:
I decided to create a few generic functions using SciPy's KDTree to find the model data nearest to each station's position. The idea is to have something that is fast and easy to re-use. The tree helps with that. I have been testing this idea and trying to improve it. Here is a draft of what I have so far:
The ultimate goal would be to have a get_model(at_station) function. It would take a station series and return a model series at the same place and time. This station series would carry time and space metadata with it.
Maybe I am investing too much time in this. But I think we will be doing this a lot, as in the glide-model comparison, so maybe it is worth it... What do you think?
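To make the get_model(at_station) idea concrete, here is a minimal sketch, not the notebook's implementation. It assumes the station position is passed in explicitly as station_lon and station_lat (rather than carried as metadata on the series), that model_data has time as its first axis, and that plain lon/lat distance in degrees is good enough to pick the nearest model point. All argument names are hypothetical.

```python
import numpy as np
import pandas as pd
from scipy.spatial import cKDTree


def get_model(station, station_lon, station_lat,
              model_time, model_lon, model_lat, model_data):
    """Return a model series at the station's position and timestamps.

    station is a pandas Series indexed by time; model_data has shape
    (time, ...) where the trailing axes match model_lon and model_lat.
    """
    model_lon = np.asarray(model_lon, dtype=float)
    model_lat = np.asarray(model_lat, dtype=float)

    # Nearest model point in lon/lat space (degrees, not true distance).
    tree = cKDTree(np.column_stack([model_lon.ravel(), model_lat.ravel()]))
    _, idx = tree.query([station_lon, station_lat])

    # Flatten the spatial axes so idx addresses a single column.
    data = np.asarray(model_data, dtype=float)
    data = data.reshape(data.shape[0], -1)
    series = pd.Series(data[:, idx], index=pd.DatetimeIndex(model_time))

    # Interpolate in time onto the station timestamps, without extrapolating.
    combined = series.index.union(station.index)
    series = series.reindex(combined).interpolate(method="time",
                                                  limit_area="inside")
    return series.reindex(station.index)
```

Station timestamps outside the model's time range come back as NaN, which seems safer than extrapolating for a model/observation comparison.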
@kwilcox, didn't you implement something like this for your paegan work?
In the Python notebook example, input cell 13, 'NAVD' might be misspelled as 'NADV':
if row['datum'] == 'NADV':
Yep, thanks @jcothran. That will be fixed in the next push.
@ocefpaf, when you say "This station series would carry time and space metadata with it," it starts to sound like a common data model object. That makes me wonder whether you could use an existing common data model object instead. I just had a gchat with @kwilcox and he said paegan is too immature to look at. Can you take a look at the Iris data model and see if it would work?
Check out cell [7] in http://nbviewer.ipython.org/gist/rsignell-usgs/d48242d13d17f9360d49 to see what an Iris time series object looks like. -Rich
@rsignell-usgs That is exactly the idea. I am not re-inventing the wheel. Iris time-series are pandas time-series with more metadata, and that is what get_model(at_station) would take as input.
I am just developing this slowly to avoid messing up.
The prototype is ready. I am closing this issue so I can open new ones addressing what we discussed here and what I found with this notebook.
Here is the current view: http://nbviewer.ipython.org/github/ioos/secoora/blob/master/notebooks/inundation/inundation_secoora.ipynb
Replace nearxy and find_ij with SciPy's KDTree and test whether this approach is faster and more general for both structured and unstructured models. https://github.com/ioos/secoora/blob/master/notebooks/inundation/inundation_secoora.ipynb#L1188
KDTree makes it possible to search for all lon, lat points that are near the observations. Once the tree is computed, finding several points is faster, but for just one or a few points the cost of building the tree can make it slower.
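As a rough illustration of the KDTree approach, the sketch below builds the tree once from the model lon/lat arrays and then queries it for any number of observation positions. The helper names build_tree and nearest_points are hypothetical, and distances are plain Euclidean distances in degrees, which is an assumption that ignores the convergence of meridians. Flattening the coordinates lets the same code handle a structured 2D grid and an unstructured 1D list of nodes.

```python
import numpy as np
from scipy.spatial import cKDTree


def build_tree(lon, lat):
    """Build a KDTree from model lon/lat arrays of any (matching) shape."""
    lon = np.asarray(lon, dtype=float)
    lat = np.asarray(lat, dtype=float)
    points = np.column_stack([lon.ravel(), lat.ravel()])
    return cKDTree(points), lon.shape


def nearest_points(tree, shape, obs_lon, obs_lat, k=1):
    """Return distances (degrees) and grid indices of the k nearest points."""
    obs = np.column_stack([np.atleast_1d(obs_lon), np.atleast_1d(obs_lat)])
    dist, flat = tree.query(obs, k=k)
    # Convert flat indices back to indices into the original grid shape.
    return dist, np.unravel_index(flat, shape)


# Structured grid: the tree is built once and reused for every station.
lon2d, lat2d = np.meshgrid(np.arange(-80, -75, 1.0), np.arange(30, 34, 1.0))
tree, shape = build_tree(lon2d, lat2d)
dist, (j, i) = nearest_points(tree, shape, [-77.9, -76.2], [32.2, 30.1])
```

For an unstructured model, passing the 1D node coordinates gives a one-element index tuple instead of (j, i). Comparing this against nearxy and find_ij with timeit, for one station and for many, would show where the cost of building the tree pays off.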