recurve-methods / comparison_groups

Repository for discussion of Comparison Group topics

speed up equivalence; fix ID column #12

Closed mariano-recurve closed 4 years ago

mariano-recurve commented 4 years ago

I noticed that "id" was hard-coded as the ID column name, so I changed that by making it its own variable for the core df.
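To illustrate the kind of change described above, here is a minimal sketch of passing the ID column name as a parameter instead of hard-coding "id". The function name `index_by_id` and the `id_col` parameter are hypothetical and are not taken from this PR.

```python
import pandas as pd


def index_by_id(df: pd.DataFrame, id_col: str = "id") -> pd.DataFrame:
    """Return df indexed by a configurable ID column.

    `id_col` is a hypothetical parameter name; it defaults to "id" so that
    callers relying on the old hard-coded name keep working.
    """
    if id_col not in df.columns:
        raise KeyError(f"ID column {id_col!r} not found in DataFrame")
    # set_index returns a new DataFrame; the input is left unchanged
    return df.set_index(id_col)
```

A caller whose data uses a different column name would then write something like `index_by_id(df, id_col="meter_id")`.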

I did two things that sped up the equivalence test:

I think the latter is OK: all you have to do is compute the bins, nothing else fancy, so you don't need everything in the Binning class. It turns out that Binning is pretty slow. As far as I can tell, the algorithm itself has not changed.
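As a rough illustration of computing bins directly, without a heavier helper class, the sketch below assigns values to quantile bins with NumPy. It assumes quantile-based bin edges, which this comment does not confirm, and `assign_bins` is a hypothetical name rather than a function from this repository.

```python
import numpy as np


def assign_bins(values: np.ndarray, n_bins: int = 10) -> np.ndarray:
    """Assign each value to a quantile bin index in [0, n_bins - 1].

    Assumption: bins are defined by equally spaced quantiles of `values`.
    """
    # Keep only the interior edges so np.digitize yields exactly n_bins labels
    quantiles = np.linspace(0, 1, n_bins + 1)[1:-1]
    edges = np.quantile(values, quantiles)
    # digitize returns, for each value, how many edges are <= that value
    return np.digitize(values, edges)
```

The whole computation is two vectorized calls, which is the general reason a direct approach like this can be much cheaper than going through a class that does more bookkeeping.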

I tested this with some synthetic data: for 10K meters with seasonal 168 equivalence data, the bin selection runs in 45 seconds.
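A timing check of this sort could be set up roughly as follows. This is only a sketch under stated assumptions: the gamma-distributed synthetic data, the reading of "168" as 168 values per meter, the quantile binning, and the name `time_bin_assignment` are all illustrative. It times a simple stand-in for the binning step, not the bin selection from this PR, so it will not reproduce the 45-second figure.

```python
import time

import numpy as np


def time_bin_assignment(n_meters=10_000, n_values=168, n_bins=10, seed=0):
    """Generate synthetic data and time a per-column quantile binning pass.

    All parameters and the data distribution are assumptions for illustration.
    """
    rng = np.random.default_rng(seed)
    # One row per meter, one column per equivalence feature
    data = rng.gamma(shape=2.0, scale=1.0, size=(n_meters, n_values))

    start = time.perf_counter()
    # Interior quantile edges per column: shape (n_bins - 1, n_values)
    quantiles = np.linspace(0, 1, n_bins + 1)[1:-1]
    edges = np.quantile(data, quantiles, axis=0)
    bins = np.empty(data.shape, dtype=np.intp)
    for j in range(n_values):
        bins[:, j] = np.digitize(data[:, j], edges[:, j])
    elapsed = time.perf_counter() - start
    return bins, elapsed


if __name__ == "__main__":
    _, seconds = time_bin_assignment()
    print(f"binned 10,000 x 168 values in {seconds:.3f} s")
```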