wesleybowman / UTide

Python distribution of the MatLab package UTide
MIT License

Allow multiple columns of time series as input for prediction #77

Open rsignell-usgs opened 4 years ago

rsignell-usgs commented 4 years ago

We have a use case where we want to analyze the tides at each grid cell in a numerical model, so we have lots of time series with the same time base. How hard would it be to allow a matrix (or list of arrays) as input data to the solver?

efiring commented 4 years ago

The present code is heavily based on working with a single series at a time, and it is doing all sorts of things that you would not actually want for your application. Rather than trying to work more vectorization into the present code, I think that what is needed is a separate function specifically designed for the case where you specify the set of constituents, generate the pseudo-inverse of the model matrix, and then apply it to the array of time series.

rsignell-usgs commented 4 years ago

So I could just do a least squares fit to the tidal constituents at all grid cells and then do the astronomical adjustments based on a single time series using utide. Is that what you are thinking @efiring?

efiring commented 4 years ago

Not sure what you mean by "astronomical adjustments". What I have in mind is factoring out the calculation of the model matrix, "B", which is roughly lines 229-277 in _solve.py. Then write a completely new main entry function, e.g., "solve_vectorized", that would take the minimum arguments required for the special case where the constituents to use are specified, there are no missing values in the time array and the (U,V) or H array, and the U, V, H arrays can have more than one dimension, one of which is time. Nothing fancy, no confidence intervals. The lstsq function can handle multiple right-hand sides as a 2-D array; I haven't looked closely, but I think it is vectorized at the C level, so it should be fast. An alternative would be to use something like linalg.pinv2 to get the pseudo-inverse (once), and then matrix-multiply.
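The approach described above can be sketched as follows. This is only an illustration under stated assumptions, not UTide code: `model_matrix` and `solve_vectorized` are hypothetical names, and the model matrix here is a bare mean-plus-cos/sin basis with no nodal/satellite corrections or Greenwich phase reference, so it is much simpler than the "B" matrix built in `_solve.py`. It assumes the constituent frequencies are given in cycles per hour, time is on axis 0, and there are no missing values. Note that `scipy.linalg.pinv2` has been removed from recent SciPy releases, so the pseudo-inverse variant uses `numpy.linalg.pinv` instead.

```python
import numpy as np


def model_matrix(t_hours, freqs_cph):
    """Simplified harmonic basis: one mean column, then cos and sin columns.

    Assumption: no nodal corrections or astronomical phase reference.
    """
    phase = 2.0 * np.pi * np.outer(t_hours, freqs_cph)  # (nt, nc)
    ones = np.ones((len(t_hours), 1))
    return np.hstack([ones, np.cos(phase), np.sin(phase)])


def solve_vectorized(t_hours, h, freqs_cph):
    """Fit all series in h (time on axis 0) with a single lstsq call."""
    nt = h.shape[0]
    nc = len(freqs_cph)
    rhs = h.reshape(nt, -1)                      # (nt, ncell)
    B = model_matrix(t_hours, freqs_cph)
    coef, *_ = np.linalg.lstsq(B, rhs, rcond=None)  # (1 + 2*nc, ncell)

    a, b = coef[1:1 + nc], coef[1 + nc:]         # cos and sin coefficients
    amp = np.hypot(a, b)
    phase_deg = np.degrees(np.arctan2(b, a)) % 360.0

    grid = h.shape[1:]
    return (coef[0].reshape(grid),
            amp.reshape((nc,) + grid),
            phase_deg.reshape((nc,) + grid))


# Synthetic check on a 3 x 4 grid with two constituents (M2- and S2-like periods).
rng = np.random.default_rng(0)
t = np.arange(30 * 24, dtype=float)              # hourly samples, 30 days
freqs = np.array([1 / 12.4206012, 1 / 12.0])     # cycles per hour
true_amp = rng.uniform(0.1, 1.0, size=(2, 3, 4))
true_phase = rng.uniform(0.0, 360.0, size=(2, 3, 4))
arg = (2 * np.pi * freqs[None, :, None, None] * t[:, None, None, None]
       - np.radians(true_phase)[None])
h = 0.5 + (true_amp[None] * np.cos(arg)).sum(axis=1)   # (nt, 3, 4)

mean, amp, phase = solve_vectorized(t, h, freqs)

# Alternative: compute the pseudo-inverse once, then matrix-multiply.
B_pinv = np.linalg.pinv(model_matrix(t, freqs))
coef_pinv = B_pinv @ h.reshape(len(t), -1)
```

Reshaping the grid to a 2-D `(time, cell)` array lets one `lstsq` call handle every grid cell; the pseudo-inverse variant may be preferable when the same time base and constituent set are reused across many arrays.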

DanCodiga commented 4 years ago

If I understand what you've asked for, Rich, it is something I included in the Matlab UTide functions. It's explained in the tech report. It may not be the solution you need, but I thought I'd just mention it.

rsignell-usgs commented 3 years ago

Thanks @DanCodiga , would you have any bandwidth to contribute this here? It would be a huge benefit to the modeling community!

efiring commented 3 years ago

@rsignell-usgs Please clarify: is what I described above what you are looking for? What do you mean by "astronomical adjustments"?

DanCodiga commented 3 years ago

> Thanks @DanCodiga , would you have any bandwidth to contribute this here? It would be a huge benefit to the modeling community!

I do aspire to get back to some tidal analysis work, but unfortunately it's looking like it won't be for at least another few months or longer. I would like to update the Matlab version of UTide, and also to help flesh out the Python version so it has all (or at least most) of the features in the Matlab version (unless that's too disruptive to what the community has built; I really appreciate what everybody has done here!).

As to this specific question: I did put an option to pass in an array of time series into the Matlab version. However, I recall that it is basically a wrapper that implements convenience looping rather than a vectorized approach. Something vectorized, like Eric has suggested, would be more powerful but could also be a tricky chunk of effort. Not sure this is helping much, but it's my $0.02 for now.
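For comparison, a "convenience looping" wrapper of the kind mentioned above might look roughly like this in Python. This is a sketch, not the Matlab implementation: `solve_columns` is a hypothetical helper, and `fit_mean_and_range` is a stand-in for a real single-series solver. With UTide, `solve_one` could presumably be a small function that calls `utide.solve` with a fixed latitude and constituent list, but that wiring is an assumption and is not shown here.

```python
import numpy as np


def solve_columns(t, data, solve_one, **kwargs):
    """Call a single-series solver once per series in data (time on axis 0)."""
    data = np.asarray(data)
    flat = data.reshape(data.shape[0], -1)       # (nt, nseries)
    return [solve_one(t, flat[:, j], **kwargs) for j in range(flat.shape[1])]


def fit_mean_and_range(t, series):
    """Stand-in for a real per-series analysis (hypothetical)."""
    return {"mean": float(series.mean()),
            "range": float(series.max() - series.min())}


t = np.arange(48.0)                               # hourly samples, 2 days
wave = np.sin(2 * np.pi * t / 12.0)
data = np.column_stack([wave, 2.0 + 0.5 * wave])  # two series, same time base
results = solve_columns(t, data, fit_mean_and_range)
```

The loop keeps every feature of the single-series solver available for each column, at the cost of repeating the full setup for every series, which is the trade-off against a vectorized solve.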