Noticed that the "id" column name was hard-coded, so I pulled it out into its own variable for the core df.
I did two things that sped up the equivalence test:
Indexed the data frame, so that subsetting was faster
Replaced Binning with qcut
I think the latter is OK: all we need here is to compute the bins, so none of the extra functionality in the Binning class is required. It turns out Binning is pretty slow. As far as I can tell, the algorithm itself has not changed.
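For reviewers, here is a rough sketch of what quantile binning with qcut amounts to. It assumes qcut means `pandas.qcut`; the column names (`meter_id`, `usage`), the bin count, and the data are made up for illustration and are not taken from the actual code.

```python
import numpy as np
import pandas as pd

# Hypothetical synthetic data: one row per meter with a single usage feature.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "meter_id": np.arange(1000),
    "usage": rng.gamma(shape=2.0, scale=10.0, size=1000),
})

n_bins = 4  # illustrative value only

# qcut assigns each row to a quantile bin. labels=False returns integer bin
# codes, retbins=True also returns the bin edges, and duplicates="drop"
# avoids an error when repeated values produce identical edges.
codes, edges = pd.qcut(
    df["usage"], q=n_bins, labels=False, retbins=True, duplicates="drop"
)
df["usage_bin"] = codes

# Number of meters that landed in each bin.
bin_counts = df["usage_bin"].value_counts().sort_index()
print(bin_counts)
```

This only computes bin membership and edges, which is all the bin selection step appears to need.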
I tested this on synthetic data: for 10K meters with seasonal 168 equivalence data, the bin selection runs in 45 seconds.
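The indexing change can be illustrated in the same spirit. This is a minimal sketch, assuming the speedup comes from setting the id column as a sorted index and subsetting with `.loc`; the variable `id_col`, the column names, and the sizes are hypothetical and much smaller than the 10K-meter test.

```python
import numpy as np
import pandas as pd

id_col = "meter_id"  # hypothetical name; kept in a variable rather than hard-coded

rng = np.random.default_rng(0)
n_meters, n_hours = 100, 168
df = pd.DataFrame({
    id_col: np.repeat(np.arange(n_meters), n_hours),
    "hour": np.tile(np.arange(n_hours), n_meters),
    "value": rng.random(n_meters * n_hours),
})

wanted = [3, 7]

# Boolean-mask subsetting scans the whole id column on every lookup.
subset_mask = df[df[id_col].isin(wanted)]

# Setting and sorting the index once lets .loc select rows by label instead.
indexed = df.set_index(id_col).sort_index(kind="mergesort")
subset_loc = indexed.loc[wanted]

print(len(subset_mask), len(subset_loc))
```

Both approaches select the same rows; the indexed version pays the sorting cost once, which matters when the same frame is subset many times.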