Climate-Data-Science / Climate-Similarity-Metrics

Which similarity metrics are the most helpful to understand climate

Create a workflow to find geolocations related to the reference data #3

Closed pawelbielski closed 4 years ago

pawelbielski commented 4 years ago

Given a reference time series (first a chosen geopoint, later the QBO index), compute the point-wise similarity (correlation or mutual information) between it and the time series at every point on the map. To begin, we use a single similarity function and a fixed reference point.
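A minimal sketch of the point-wise computation described above, assuming the data is a NumPy array of shape `(time, lat, lon)`. The signature of `calculate_series_similarity()` and the helper `pearson()` are assumptions for illustration, not the notebook's actual code, and the random array stands in for real climate data.

```python
import numpy as np


def pearson(x, y):
    """Pearson correlation between two 1-D series (illustrative helper)."""
    return np.corrcoef(x, y)[0, 1]


def calculate_series_similarity(data, reference, sim_fun):
    """Compare `reference` (shape: time) with every grid point of
    `data` (shape: time x lat x lon) and return a (lat x lon) map."""
    _, n_lat, n_lon = data.shape
    sim_map = np.empty((n_lat, n_lon))
    for i in range(n_lat):
        for j in range(n_lon):
            # One independent sim_fun call per grid point
            sim_map[i, j] = sim_fun(reference, data[:, i, j])
    return sim_map


# Synthetic stand-in data: 24 time steps on a small 4 x 8 grid
rng = np.random.default_rng(0)
data = rng.standard_normal((24, 4, 8))

# Use a fixed grid point as the reference time series
reference = data[:, 2, 3]
sim_map = calculate_series_similarity(data, reference, pearson)
```

Because `sim_fun` is passed in as an argument, the same loop could later take a mutual-information function or the QBO index as the reference without changes.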

Steps:

pawelbielski commented 4 years ago

Comments to 2_point-wise_similarities.ipynb:

pawelbielski commented 4 years ago

@pierretoussing The code you wrote is well structured and easily extendable. However, I think it might be possible to vectorize calculate_series_similarity(). Notice that sim_fun() is called 256 * 512 times, and all the runs are independent of each other.
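To illustrate what vectorization could look like for one specific measure, here is a sketch for Pearson correlation only, assuming data of shape `(time, lat, lon)`. The name `pearson_map()` is hypothetical, and the approach does not carry over automatically to measures taken from external libraries.

```python
import numpy as np


def pearson_map(data, reference):
    """Pearson correlation of `reference` (shape: time) with every grid
    point of `data` (shape: time x lat x lon), without a Python loop."""
    ref_anom = reference - reference.mean()
    data_anom = data - data.mean(axis=0)  # remove the per-point mean
    # Sum over the time axis for all grid points at once -> (lat, lon)
    cov = np.tensordot(ref_anom, data_anom, axes=(0, 0))
    norm = np.sqrt((ref_anom ** 2).sum()) * np.sqrt((data_anom ** 2).sum(axis=0))
    return cov / norm


rng = np.random.default_rng(1)
data = rng.standard_normal((24, 4, 8))
reference = data[:, 0, 0]
corr_map = pearson_map(data, reference)
```

The whole map comes from a handful of array operations instead of one function call per grid point.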

pierretoussing commented 4 years ago

I did some research and I do not think it is possible to vectorize it. I now use np.apply_along_axis(), which gives a minor speedup, but to avoid the 256 * 512 calls of sim_fun() I would have to vectorize the implementations of all the similarity measures, which are imported from external libraries.
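For reference, a sketch of how `np.apply_along_axis()` can replace the explicit double loop, again assuming data of shape `(time, lat, lon)`. The wrapper name `similarity_map_apply()` and the helper `pearson()` are illustrative and not taken from the notebook.

```python
import numpy as np


def pearson(x, y):
    """Pearson correlation between two 1-D series (illustrative helper)."""
    return np.corrcoef(x, y)[0, 1]


def similarity_map_apply(data, reference, sim_fun):
    """Apply `sim_fun(reference, series)` to every time series along
    axis 0 of `data` (time x lat x lon); returns a (lat x lon) map."""
    return np.apply_along_axis(
        lambda series: sim_fun(reference, series), 0, data
    )


rng = np.random.default_rng(2)
data = rng.standard_normal((24, 4, 8))
reference = data[:, 1, 5]
apply_map = similarity_map_apply(data, reference, pearson)
```

Note that `sim_fun` is still called once per grid point here; `np.apply_along_axis()` only takes over the iteration, so the number of calls is unchanged.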

pawelbielski commented 4 years ago

Thanks for the explanation. What is the speedup you achieve with np.apply_along_axis()?

I've created a separate issue, #11, for this. Please post any further comments on speedup there.

pierretoussing commented 4 years ago

I measured both versions with %%timeit; the difference was 2 seconds, which is 7% in this context.
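Outside a notebook, a comparison like this could be reproduced with the standard `timeit` module. The sketch below uses a small synthetic grid and hypothetical function names; the timings it produces depend on the machine and say nothing about the numbers reported above.

```python
import timeit

import numpy as np


def pearson(x, y):
    """Pearson correlation between two 1-D series (illustrative helper)."""
    return np.corrcoef(x, y)[0, 1]


def loop_version(data, reference):
    """Explicit double loop over all grid points."""
    _, n_lat, n_lon = data.shape
    out = np.empty((n_lat, n_lon))
    for i in range(n_lat):
        for j in range(n_lon):
            out[i, j] = pearson(reference, data[:, i, j])
    return out


def apply_version(data, reference):
    """Same computation with np.apply_along_axis handling the iteration."""
    return np.apply_along_axis(lambda s: pearson(reference, s), 0, data)


rng = np.random.default_rng(3)
data = rng.standard_normal((24, 8, 16))  # far smaller than 256 x 512
reference = data[:, 0, 0]

# Total time for three repetitions of each variant
t_loop = timeit.timeit(lambda: loop_version(data, reference), number=3)
t_apply = timeit.timeit(lambda: apply_version(data, reference), number=3)

# Relative difference, expressed as a percentage of the loop timing
relative_diff = 100.0 * (t_loop - t_apply) / t_loop
print(f"loop: {t_loop:.3f} s, apply: {t_apply:.3f} s, diff: {relative_diff:.1f}%")
```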