Open JulienPeloton opened 2 months ago
Hi @JulienPeloton!
I believe it first happened in #314, which was part of the v0.8.2 release. It changed the maximum number of iterations iminuit performs, which may be the reason.
https://github.com/light-curve/light-curve-python/pull/314/files#diff-141d756483363a5374c8002cfbd99530f859d19ea0bdc69edcb8a2aaae82b3f4R285
The next change happened in #324, which was part of the v0.9.0 release; we haven't changed Rainbow since. That was a large change, and it is not clear to me what exactly affected the performance.
@karpov-sv please tell us if you have any ideas here. I believe the intention was to make the fit more reasonable.
Yes, it was changed quite significantly, both in the number of iterations and in the selection of initial parameters. So the reason may also be that it previously converged fast but to some random (wrong) point, while now it spends more time converging to a more appropriate one... I will need to check it visually to see how it actually behaves; I will try to do that.
Thanks @hombit @karpov-sv -- trying to improve the result of the fit makes sense. But one should indeed check, for example, that `minuit.strategy = 2` or `ncall=10000` are not overkill here. Although I reckon it is difficult to make general statements, as it highly depends on the input data, I will check the convergence and the impact on the classification using the lightcurves from the Elasticc data challenge.
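The trade-off under discussion -- a higher iteration cap improves the convergence rate but costs runtime -- can be sketched with a toy minimizer. This is plain gradient descent on a parabola, not the actual Rainbow/iminuit fit; all names here are illustrative:

```python
def minimize(f, grad, x0, lr=0.1, tol=1e-8, maxiter=100):
    """Plain gradient descent; returns (x, n_iterations, converged)."""
    x = x0
    for i in range(maxiter):
        g = grad(x)
        if abs(g) < tol:
            return x, i, True
        x -= lr * g
    return x, maxiter, False

f = lambda x: (x - 3.0) ** 2
grad = lambda x: 2.0 * (x - 3.0)

# A low cap fails fast; a high cap converges but spends more iterations.
x_lo, n_lo, ok_lo = minimize(f, grad, x0=0.0, maxiter=5)
x_hi, n_hi, ok_hi = minimize(f, grad, x0=0.0, maxiter=1000)
print(ok_lo, ok_hi)  # False True
```

The failure-rate/runtime trade-off in the benchmarks below follows the same pattern: the cheap configuration returns quickly precisely because it gives up early.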
Some intermediate results from my tests (elasticc TDE sample, first 100 events without any pre-selection):
| light-curve-py configuration | all | failed | success | failure |
|---|---|---|---|---|
| 0.8.1 | 0.063 +/- 0.020 s | 73.0% | 0.052 +/- 0.024 s | 0.067 +/- 0.017 s |
| 0.8.2 | 0.129 +/- 0.105 s | 2.0% | 0.125 +/- 0.098 s | 0.296 +/- 0.229 s |
| 0.8.2, without strategy=2, defaults for migrad() | 0.063 +/- 0.019 s | 75.0% | 0.058 +/- 0.033 s | 0.065 +/- 0.011 s |
| 0.8.2, strategy=2, defaults for migrad() | 0.067 +/- 0.027 s | 81.0% | 0.068 +/- 0.044 s | 0.066 +/- 0.021 s |
| 0.8.2, without strategy=2 | 0.106 +/- 0.078 s | 3.0% | 0.102 +/- 0.068 s | 0.250 +/- 0.174 s |
| 0.9.3 | 0.164 +/- 0.096 s | 2.0% | 0.166 +/- 0.096 s | 0.051 +/- 0.001 s |
| 0.9.3, without strategy=2, defaults for migrad() | 0.084 +/- 0.028 s | 68.0% | 0.069 +/- 0.031 s | 0.090 +/- 0.024 s |
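For reference, a minimal sketch of how such per-version summaries could be computed, assuming each event yields an elapsed time and a converged flag (the names are illustrative, not taken from the benchmark code):

```python
from statistics import mean, pstdev

def summarize(results):
    """results: list of (elapsed_seconds, converged) per event.
    Returns mean/std runtimes and the failure fraction, as reported above."""
    times = [t for t, _ in results]
    ok = [t for t, c in results if c]
    bad = [t for t, c in results if not c]
    stats = {
        "all": (mean(times), pstdev(times)),
        "failed_pct": 100.0 * len(bad) / len(results),
    }
    if ok:
        stats["success"] = (mean(ok), pstdev(ok))
    if bad:
        stats["failure"] = (mean(bad), pstdev(bad))
    return stats

s = summarize([(0.05, True), (0.07, False), (0.06, True), (0.30, False)])
print(f"all: {s['all'][0]:.3f} +/- {s['all'][1]:.3f} s, {s['failed_pct']:.1f}% failed")
```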
So 0.8.1 was fast mostly because it was failing in most cases, and for the majority of them it was simply due to an insufficient number of iterations. Just increasing it (0.8.1 -> 0.8.2, more or less) greatly improves the convergence rate, but is of course slower. There is also some effect from the rewrite I did later (between 0.8.2 and 0.9.3): it did not change the convergence rate but increased the overall runtime. I will need to look into what exactly the culprit is... I changed the initial-conditions estimation there, and made some changes to the actual function being optimized.
Thanks @karpov-sv -- good catch, I did not pay attention to the failure/success ratio at the time. It is interesting that the failure rate jumps between the extreme cases. I should look at the alerts that succeed in one case and fail in another, to see whether the returned fitted values make physical sense to start with.
Hello @hombit!
In Fink, I am still using version 0.8.1. I started to migrate versions recently, but I noticed a loss of performance for Rainbow feature extraction. You'll find below a snippet of code to reproduce the findings (data here -- these are lightcurves from Elasticc):
with the following output:
So in this example there is a factor of 1 to 3 between 0.8.1 and 0.8.2, and a factor of 3 up to 8 between 0.8.1 and 0.9.3. I will keep digging by running a profiler later this week, but do you already have an idea of what could have caused this?
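For context, a hypothetical shape of such a timing harness. The real snippet calls the Rainbow feature extractor from light-curve-py on the Elasticc light curves; `extract` below is only a stand-in placeholder, and the data shape is assumed:

```python
import time

def extract(lc):
    # Placeholder for feature extraction on one light curve;
    # the real benchmark would call the Rainbow fitter here.
    return sum(x * x for x in lc)

def time_per_call(fn, data, repeats=100):
    """Average wall-clock seconds per light curve over `repeats` runs."""
    start = time.perf_counter()
    for _ in range(repeats):
        for lc in data:
            fn(lc)
    return (time.perf_counter() - start) / (repeats * len(data))

# Dummy light curves standing in for the Elasticc sample.
data = [[float(i) for i in range(100)] for _ in range(10)]
t = time_per_call(extract, data)
print(f"{t * 1e6:.1f} microseconds per light curve")
```

Running the same harness against each installed version of light-curve-py would give the version-to-version ratios quoted above.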