openturns / otmixmod

MixMod module
http://openturns.github.io/otmixmod/master/
GNU General Public License v3.0

otmixmod can make the Python kernel crash #24

Closed. mbaudin47 closed this issue 4 days ago

mbaudin47 commented 3 weeks ago

The following script uses the example from the unit test t_MixtureFactory_expert.py. For an increasing number of classes, it repeatedly creates an (X, Y) dataset, then estimates the parameters of the mixture of Gaussians using buildAsMixture().

# %%
import openturns as ot
import otmixmod

print("OT version = ", ot.__version__)
print("OTmixmod version = ", otmixmod.__version__)

# %%
model = ot.SymbolicFunction("x", "(1.0 + sign(x)) * cos(x) - (sign(x) - 1) * sin(2*x)")
inputDistribution = ot.Uniform()
sampleSizeTrain = 200

# %%
numberOfRepetitions = 20
listOfNumberOfClasses = list(range(2, 20, 4))
numberOfExperimentedClassesNumber = len(listOfNumberOfClasses)
for numberOfClasses in listOfNumberOfClasses:
    print(f"k = {numberOfClasses}")
    for repetitionIndex in range(numberOfRepetitions):
        print(f"  r = {repetitionIndex}")
        # Train
        dataXTrain = inputDistribution.getSample(sampleSizeTrain)
        dataYTrain = model(dataXTrain)

        # Create classification data
        agregatedData = ot.Sample(dataXTrain)
        agregatedData.stack(dataYTrain)

        # Classify data
        covModel = "Gaussian_pk_Lk_C"
        mixture, labels, logLike = otmixmod.MixtureFactory(
            numberOfClasses, covModel
        ).buildAsMixture(agregatedData)

The script produces:

OT version =  1.23
OTmixmod version =  0.17
k = 2
  r = 0
  r = 1
[...]
k = 14
  r = 0
  [...]
  r = 14
terminate called after throwing an instance of 'terminate called recursively
terminate called recursively (note from MBN: 14 times)
XEM::OtherException'
  what():  All models got errors

Then the Python kernel crashes.

This is a test on Windows with a Python distribution installed from Conda.

jschueller commented 3 weeks ago

It doesn't succeed on some samples. If you reset the RNG between repetitions, you will see that it doesn't throw when provided the same data. But if I disable OpenMP, I get a proper exception instead of a crash.
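One way to look for the samples that fail is to set the seed explicitly for each attempt and run each attempt in its own Python process, so that a hard crash cannot take the calling kernel down. The following is a minimal sketch under stated assumptions: `run_isolated`, `find_failing_seeds` and `REPRO_TEMPLATE` are hypothetical names introduced here, and a non-zero exit code is treated as a failure whether it comes from a Python exception or from a crash.

```python
import os
import subprocess
import sys

# Hypothetical template: one fit per process, with the seed filled in by format().
REPRO_TEMPLATE = """
import openturns as ot
import otmixmod

model = ot.SymbolicFunction("x", "(1.0 + sign(x)) * cos(x) - (sign(x) - 1) * sin(2*x)")
inputDistribution = ot.Uniform()
ot.RandomGenerator.SetSeed({seed})
dataXTrain = inputDistribution.getSample(200)
agregatedData = ot.Sample(dataXTrain)
agregatedData.stack(model(dataXTrain))
otmixmod.MixtureFactory(10, "Gaussian_pk_Lk_C").buildAsMixture(agregatedData)
"""


def run_isolated(script, extra_env=None, timeout=120):
    """Run `script` in a fresh Python process and return (returncode, stderr)."""
    env = dict(os.environ)
    env.update(extra_env or {})
    result = subprocess.run(
        [sys.executable, "-c", script],
        capture_output=True,
        text=True,
        env=env,
        timeout=timeout,
    )
    return result.returncode, result.stderr


def find_failing_seeds(script_template, seeds, extra_env=None):
    """Return the seeds for which the child process exits with a non-zero code."""
    failing = []
    for seed in seeds:
        returncode, _ = run_isolated(script_template.format(seed=seed), extra_env)
        if returncode != 0:
            failing.append(seed)
    return failing


# Usage (requires openturns and otmixmod in the child environment):
# failing = find_failing_seeds(REPRO_TEMPLATE, range(100))
# Assumption: limiting OpenMP to one thread is not the same as building
# without OpenMP, and may or may not change the behaviour seen here.
# failing = find_failing_seeds(REPRO_TEMPLATE, range(100), {"OMP_NUM_THREADS": "1"})
```

The child's stderr is returned as well, so the message printed before a crash can still be inspected from the parent process.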

mbaudin47 commented 3 weeks ago

The problem is not the repetition, sorry for that misleading analysis. It can be reproduced with a single call, provided we carefully select the seed.

import openturns as ot
import otmixmod

print("OT version = ", ot.__version__)
print("OTmixmod version = ", otmixmod.__version__)

model = ot.SymbolicFunction("x", "(1.0 + sign(x)) * cos(x) - (sign(x) - 1) * sin(2*x)")
inputDistribution = ot.Uniform()
ot.RandomGenerator.SetSeed(80)  # The failing sample
dataXTrain = inputDistribution.getSample(200)
dataYTrain = model(dataXTrain)
agregatedData = ot.Sample(dataXTrain)
agregatedData.stack(dataYTrain)
covModel = "Gaussian_pk_Lk_C"
mixture, labels, logLike = otmixmod.MixtureFactory(10, covModel).buildAsMixture(agregatedData)

The Linux error message is a little clearer.

OT version =  1.23
OTmixmod version =  0.17
Traceback (most recent call last):
  File "/home/devel/Documents/example_mixmod_fail_simpler_v2.py", line 17, in <module>
    mixture, labels, logLike = otmixmod.MixtureFactory(10, covModel).buildAsMixture(agregatedData)
                               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/devel/miniconda3/envs/otmixmod/lib/python3.12/site-packages/otmixmod/otmixmod.py", line 199, in buildAsMixture
    return _otmixmod.MixtureFactory_buildAsMixture(self, sample)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: All models got errors
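
When the failure surfaces as a Python RuntimeError, as in the traceback above, the calling loop can catch it and skip the sample. A minimal sketch follows; `fit_or_none` is a hypothetical helper, and it only helps when the error actually reaches Python as an exception, not when the process is terminated.

```python
def fit_or_none(fit, data):
    """Call fit(data) and return its result, or None if it raises RuntimeError."""
    try:
        return fit(data)
    except RuntimeError as error:
        # Report the failure and let the caller decide what to do with the sample.
        print(f"fit failed: {error}")
        return None


# Usage with the names from the scripts above (requires otmixmod):
# factory = otmixmod.MixtureFactory(numberOfClasses, covModel)
# result = fit_or_none(factory.buildAsMixture, agregatedData)
# if result is not None:
#     mixture, labels, logLike = result
```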

mbaudin47 commented 3 weeks ago

According to OtherException.h#L64, the error is produced at ClusteringMain.cpp#L280. This is in an OpenMP loop, which is why you tried to disable this option. Is this an uncaught exception? Whatever the reason, this is an upstream bug for https://github.com/mixmod/mixmod/issues, isn't it?
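
In C++, an exception that leaves an OpenMP parallel region is not propagated to the caller and the program terminates, which would be consistent with the "terminate called" output above. The usual pattern is to catch errors inside each worker and raise a single exception after the loop, on the calling thread. The sketch below shows that pattern in Python with a thread pool, as an analogy only: `run_all` and `guarded` are hypothetical names, and this is not mixmod's code.

```python
from concurrent.futures import ThreadPoolExecutor


def run_all(tasks):
    """Run every task in a worker thread and return (results, errors)."""

    def guarded(task):
        # Catch inside the worker so that no exception escapes the parallel section.
        try:
            return task(), None
        except Exception as error:
            return None, error

    with ThreadPoolExecutor() as pool:
        outcomes = list(pool.map(guarded, tasks))

    results = [value for value, error in outcomes if error is None]
    errors = [error for _, error in outcomes if error is not None]
    if not results:
        # Raised once, on the calling thread, after all workers have finished.
        raise RuntimeError("All models got errors")
    return results, errors
```

With this structure, a caller sees one ordinary exception when every task fails, and can still inspect the individual errors when only some of them do.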

jschueller commented 4 days ago

OK with mixmod 2.1.11.