light-curve / light-curve-python

light-curve feature extraction library for Python
GNU General Public License v3.0

Noticeable loss of performance for Rainbow feature extraction after 0.8.1 #392

Open JulienPeloton opened 2 months ago

JulienPeloton commented 2 months ago

Hello @hombit !

In Fink, I am still using version 0.8.1. I recently started migrating to newer versions, but I noticed a loss of performance for Rainbow feature extraction. You'll find below a snippet of code to reproduce the findings (data here -- these are light curves from ELAsTiCC):

# python 3.9.12
import pandas as pd # 1.3.5
import numpy as np # 1.23.5
import glob
import time
import light_curve

# The RainbowFit import location changed in 0.8.2 (note: plain string comparison
# of version numbers, which is fine for the 0.8.x/0.9.x releases used here)
if light_curve.__version__ < '0.8.2':
    from light_curve.light_curve_py import RainbowRisingFit as RainbowFit
else:
    from light_curve.light_curve_py.features.rainbow import RainbowFit

band_wave_aa = {'u': 3671.0, 'g': 4827.0, 'r': 6223.0, 'i': 7546.0, 'z': 8691.0, 'Y': 9712.0}

n_loop = 10

for fn in glob.glob('example_data/*parquet'):
    container = []
    for _ in range(n_loop):
        pdf = pd.read_parquet(fn)
        pdf = pdf.sort_values('midPointTai')

        t0 = time.perf_counter()
        feature = RainbowFit.from_angstrom(band_wave_aa, with_baseline=False)

        fluxnorm = pdf["cpsFlux"].values / pdf["cpsFlux"].max()
        fluxerr_norm = pdf["cpsFluxErr"].values / pdf["cpsFlux"].max()
        values = feature(
            pdf["midPointTai"].values,
            fluxnorm,
            sigma=fluxerr_norm,
            band=pdf["filterName"].values
        )
        container.append(time.perf_counter() - t0)
    print("{}: {:.3f} +/- {:.3f} second".format(fn, np.mean(container), np.std(container)))

with the following output:

# 0.8.1
example_data/rainbow_data_2610680.parquet: 0.052 +/- 0.003 second
example_data/rainbow_data_37566659.parquet: 0.020 +/- 0.001 second
example_data/rainbow_data_41478614.parquet: 0.015 +/- 0.001 second

# 0.8.2 -- overflow encountered
example_data/rainbow_data_2610680.parquet: 0.066 +/- 0.003 second
example_data/rainbow_data_37566659.parquet: 0.065 +/- 0.002 second
example_data/rainbow_data_41478614.parquet: 0.033 +/- 0.002 second

# 0.9.3
example_data/rainbow_data_2610680.parquet: 0.155 +/- 0.007 second
example_data/rainbow_data_37566659.parquet: 0.120 +/- 0.003 second
example_data/rainbow_data_41478614.parquet: 0.116 +/- 0.003 second

So on this example there is a slowdown factor of 1 to 3 between 0.8.1 and 0.8.2, and a factor of 3 to 8 between 0.8.1 and 0.9.3. I will keep digging by running a profiler later this week, but do you already have an idea of what could have caused this?
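
For the profiling part, I plan to use something like the minimal cProfile sketch below (it reuses the >= 0.8.2 import path, the band map, and the normalization from the snippet above; run_rainbow is just an illustrative helper name):

# Minimal cProfile sketch for a single file (run_rainbow is illustrative only)
import cProfile
import pstats

import pandas as pd
from light_curve.light_curve_py.features.rainbow import RainbowFit  # >= 0.8.2 path

band_wave_aa = {'u': 3671.0, 'g': 4827.0, 'r': 6223.0, 'i': 7546.0, 'z': 8691.0, 'Y': 9712.0}

def run_rainbow(fn):
    # Same normalization and feature call as in the snippet above
    pdf = pd.read_parquet(fn).sort_values('midPointTai')
    feature = RainbowFit.from_angstrom(band_wave_aa, with_baseline=False)
    fluxnorm = pdf["cpsFlux"].values / pdf["cpsFlux"].max()
    fluxerr_norm = pdf["cpsFluxErr"].values / pdf["cpsFlux"].max()
    return feature(
        pdf["midPointTai"].values,
        fluxnorm,
        sigma=fluxerr_norm,
        band=pdf["filterName"].values,
    )

cProfile.run("run_rainbow('example_data/rainbow_data_2610680.parquet')", "rainbow.prof")
pstats.Stats("rainbow.prof").sort_stats("cumulative").print_stats(20)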

hombit commented 2 months ago

Hi @JulienPeloton!

I believe the first change happened in #314, which was part of the v0.8.2 release. It changed the maximum number of iterations iminuit does, which may be the reason. https://github.com/light-curve/light-curve-python/pull/314/files#diff-141d756483363a5374c8002cfbd99530f859d19ea0bdc69edcb8a2aaae82b3f4R285

The next change happened in #324, which was part of the v0.9.0 release; we haven't changed Rainbow since then. That was a large change, and it is not clear to me what exactly affected the performance.

@karpov-sv please tell us if you have any ideas here. I believe the intention was to make the fit more reasonable.

karpov-sv commented 2 months ago

Yes, it was changed quite significantly, both in the number of iterations and in the selection of initial parameters. So the reason may also be that before it converged fast but to some random (wrong) point, while now it spends more time converging to a more appropriate one... I will need to check it visually to see how it actually behaves; I will try to do that.

JulienPeloton commented 2 months ago

Thanks @hombit @karpov-sv -- trying to improve the result of the fit makes sense. But one should indeed check, for example, that minuit.strategy = 2 or ncall=10000 is not overkill here. Although I reckon it is difficult to make general statements, as this highly depends on the input data, I will check the convergence and the impact on classification for the light curves from the ELAsTiCC data challenge.
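
To get a feel for how much those settings cost on their own, independently of the Rainbow internals, a standalone iminuit sketch along these lines could help (a toy sigmoid least-squares fit, not the actual Rainbow model; all names and starting values here are illustrative):

# Toy iminuit benchmark: compare strategy / ncall settings on a simple
# least-squares fit (illustrative only, not the actual Rainbow model)
import time

import numpy as np
from iminuit import Minuit
from iminuit.cost import LeastSquares

# Toy data: a noisy rising sigmoid
rng = np.random.default_rng(0)
t = np.linspace(-20.0, 40.0, 200)
sigma = 0.05
flux = 1.0 / (1.0 + np.exp(-(t - 5.0) / 3.0)) + rng.normal(0.0, sigma, t.size)

def model(t, amplitude, t0, rise_time):
    return amplitude / (1.0 + np.exp(-(t - t0) / rise_time))

cost = LeastSquares(t, flux, sigma, model)

for strategy, ncall in [(1, None), (2, None), (2, 10000)]:
    minuit = Minuit(cost, amplitude=0.5, t0=0.0, rise_time=1.0)
    minuit.limits["rise_time"] = (0.1, 100.0)
    minuit.strategy = strategy
    start = time.perf_counter()
    minuit.migrad(ncall=ncall)
    elapsed = time.perf_counter() - start
    print(f"strategy={strategy}, ncall={ncall}: valid={minuit.valid}, "
          f"nfcn={minuit.fmin.nfcn}, {1e3 * elapsed:.1f} ms")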

karpov-sv commented 2 weeks ago

Some intermediate results from my tests (ELAsTiCC TDE sample, first 100 events without any pre-selection):

light-curve-py: 0.8.1
all: 0.063 +/- 0.020 seconds, 73.0% failed
success: 0.052 +/- 0.024 seconds
failure: 0.067 +/- 0.017 seconds

light-curve-py: 0.8.2
all: 0.129 +/- 0.105 seconds, 2.0% failed
success: 0.125 +/- 0.098 seconds
failure: 0.296 +/- 0.229 seconds

light-curve-py: 0.8.2 without strategy=2, defaults for migrad()
all: 0.063 +/- 0.019 seconds, 75.0% failed
success: 0.058 +/- 0.033 seconds
failure: 0.065 +/- 0.011 seconds

light-curve-py: 0.8.2 strategy=2, defaults for migrad()
all: 0.067 +/- 0.027 seconds, 81.0% failed
success: 0.068 +/- 0.044 seconds
failure: 0.066 +/- 0.021 seconds

light-curve-py: 0.8.2 without strategy=2
all: 0.106 +/- 0.078 seconds, 3.0% failed
success: 0.102 +/- 0.068 seconds
failure: 0.250 +/- 0.174 seconds

light-curve-py: 0.9.3
all: 0.164 +/- 0.096 seconds, 2.0% failed
success: 0.166 +/- 0.096 seconds
failure: 0.051 +/- 0.001 seconds

light-curve-py: 0.9.3 without strategy=2, defaults for migrad()
all: 0.084 +/- 0.028 seconds, 68.0% failed
success: 0.069 +/- 0.031 seconds
failure: 0.090 +/- 0.024 seconds

So 0.8.1 was fast mostly because it was failing in most of the cases, and for the majority of them this was simply due to an insufficient number of iterations. Just increasing it (0.8.1 -> 0.8.2, more or less) greatly increases the convergence rate, but is of course slower. There is also some effect from the rewrite I did later (between 0.8.2 and 0.9.3): it did not change the convergence rate but increased the overall runtime. I will need to look into what exactly is the culprit... I changed the initial-conditions estimation there and made some changes to the actual function being optimized.
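
A timing/failure split like the one above can be collected with a loop roughly like the following (a sketch only: it assumes one parquet file per event, as in the first snippet, and treats any exception raised by the fit as a failure, which may not match exactly how non-convergence is reported):

# Sketch: per-event timing split into success / failure
import glob
import time

import numpy as np
import pandas as pd
from light_curve.light_curve_py.features.rainbow import RainbowFit

band_wave_aa = {'u': 3671.0, 'g': 4827.0, 'r': 6223.0, 'i': 7546.0, 'z': 8691.0, 'Y': 9712.0}

timings = {True: [], False: []}
for fn in glob.glob('example_data/*parquet'):
    pdf = pd.read_parquet(fn).sort_values('midPointTai')
    feature = RainbowFit.from_angstrom(band_wave_aa, with_baseline=False)
    fluxnorm = pdf["cpsFlux"].values / pdf["cpsFlux"].max()
    fluxerr_norm = pdf["cpsFluxErr"].values / pdf["cpsFlux"].max()
    t0 = time.perf_counter()
    try:
        # Treat any exception from the fit as a failure (an assumption, see above)
        feature(pdf["midPointTai"].values, fluxnorm,
                sigma=fluxerr_norm, band=pdf["filterName"].values)
        converged = True
    except Exception:
        converged = False
    timings[converged].append(time.perf_counter() - t0)

all_times = timings[True] + timings[False]
frac_failed = 100.0 * len(timings[False]) / len(all_times)
print(f"all: {np.mean(all_times):.3f} +/- {np.std(all_times):.3f} seconds, {frac_failed:.1f}% failed")
for label, key in [("success", True), ("failure", False)]:
    if timings[key]:
        print(f"{label}: {np.mean(timings[key]):.3f} +/- {np.std(timings[key]):.3f} seconds")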

JulienPeloton commented 1 week ago

Thanks @karpov-sv -- good catch, I did not pay attention to the failure/success ratio at that time. It is interesting that the failure rate jumps between such extremes. I should have a look at the alerts that succeed in one case and fail in another, to see whether the returned fitted values make sense physically to start with.
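
One way to do that would be to dump per-alert results from the two environments and compare them, roughly like this (a sketch only: the CSV file names and the alert_id / converged columns are hypothetical placeholders for whatever each run actually writes out):

# Sketch: compare per-alert fit outcomes from two runs (e.g. one per version)
import pandas as pd

# Hypothetical per-run dumps: one row per alert with at least an 'alert_id'
# and a boolean 'converged' column (names are placeholders, not a real format)
old = pd.read_csv("rainbow_results_0.8.1.csv")
new = pd.read_csv("rainbow_results_0.9.3.csv")

merged = old.merge(new, on="alert_id", suffixes=("_old", "_new"))
disagree = merged[merged["converged_old"] != merged["converged_new"]]
print(f"{len(disagree)} of {len(merged)} alerts change convergence status")
print(disagree[["alert_id", "converged_old", "converged_new"]].head(20))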