openturns / otmixmod

MixMod module
GNU General Public License v3.0

otmixmod can make the Python kernel crash #24

Closed. mbaudin47 closed this issue 4 days ago

mbaudin47 commented 3 weeks ago

The following script considers the example presented in the unit test. For an increasing number of classes, it repeatedly creates an (X, Y) dataset. Then the parameters of the mixture of Gaussians are estimated using buildAsMixture().

# %%
import openturns as ot
import otmixmod

print("OT version = ", ot.__version__)
print("OTmixmod version = ", otmixmod.__version__)

# %%
model = ot.SymbolicFunction("x", "(1.0 + sign(x)) * cos(x) - (sign(x) - 1) * sin(2*x)")
inputDistribution = ot.Uniform()
sampleSizeTrain = 200

# %%
numberOfRepetitions = 20
listOfNumberOfClasses = list(range(2, 20, 4))
numberOfExperimentedClassesNumber = len(listOfNumberOfClasses)
for numberOfClasses in listOfNumberOfClasses:
    print(f"k = {numberOfClasses}")
    for repetitionIndex in range(numberOfRepetitions):
        print(f"  r = {repetitionIndex}")
        # Train
        dataXTrain = inputDistribution.getSample(sampleSizeTrain)
        dataYTrain = model(dataXTrain)

        # Create classification data
        agregatedData = ot.Sample(dataXTrain)

        # Classify data
        covModel = "Gaussian_pk_Lk_C"
        mixture, labels, logLike = otmixmod.MixtureFactory(
            numberOfClasses, covModel
        ).buildAsMixture(agregatedData)

The script produces:

OT version =  1.23
OTmixmod version =  0.17
k = 2
  r = 0
  r = 1
k = 14
  r = 0
  r = 14
terminate called after throwing an instance of 'terminate called recursively
terminate called recursively (note from MBN: 14 times)
  what():  All models got errors

Then the Python kernel crashes.

This is a test on Windows with a Python distribution installed from Conda.
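Since the crash takes down the whole kernel, one possible way to keep experimenting is to run each attempt in a child interpreter. The snippet below is only a sketch of that idea: `run_isolated` is a hypothetical helper name (not part of OpenTURNS or otmixmod), and it relies solely on the standard `subprocess` module. The assumption is that a crash in the child shows up as a non-zero return code in the parent.

```python
import subprocess
import sys


def run_isolated(script, timeout=300):
    """Run a Python script in a child interpreter.

    Hypothetical helper: returns (returncode, stdout, stderr) so that a
    crash in the child does not terminate the calling kernel.
    """
    result = subprocess.run(
        [sys.executable, "-c", script],  # same interpreter as the parent
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return result.returncode, result.stdout, result.stderr
```

The script passed to `run_isolated` would be the reproduction code as a string; a non-zero return code then signals that the child crashed or raised, and the captured `stderr` holds the message.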

jschueller commented 3 weeks ago

It doesn't succeed on some samples; if you reset the RNG between repetitions you will see that it doesn't throw when provided the same data. But if I disable OpenMP, I get a proper exception instead of a crash.

mbaudin47 commented 3 weeks ago

The problem is not the repetition, sorry for that misleading analysis. It can be reproduced with a single call, provided we carefully select the seed.

import openturns as ot
import otmixmod

print("OT version = ", ot.__version__)
print("OTmixmod version = ", otmixmod.__version__)

model = ot.SymbolicFunction("x", "(1.0 + sign(x)) * cos(x) - (sign(x) - 1) * sin(2*x)")
inputDistribution = ot.Uniform()
ot.RandomGenerator.SetSeed(80)  # The failing sample
dataXTrain = inputDistribution.getSample(200)
dataYTrain = model(dataXTrain)
agregatedData = ot.Sample(dataXTrain)
covModel = "Gaussian_pk_Lk_C"
mixture, labels, logLike = otmixmod.MixtureFactory(10, covModel).buildAsMixture(agregatedData)
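
A failing seed such as the one above could be located by scanning a range of seeds. The following is a minimal sketch, assuming the failure surfaces as a `RuntimeError` (as in the Linux traceback) rather than as a hard crash. `find_failing_seeds` and `fit_mixture` are hypothetical names introduced for illustration; only `ot.RandomGenerator.SetSeed`, `ot.Uniform().getSample`, and `otmixmod.MixtureFactory(...).buildAsMixture(...)` come from the scripts in this issue.

```python
def find_failing_seeds(fit, seeds):
    """Return (seed, message) pairs for which fit(seed) raises RuntimeError."""
    failing = []
    for seed in seeds:
        try:
            fit(seed)
        except RuntimeError as err:
            failing.append((seed, str(err)))
    return failing


def fit_mixture(seed, numberOfClasses=10, sampleSize=200):
    """Reproduce the single-call scenario of this issue for a given seed."""
    # Local imports, so find_failing_seeds can be used without OpenTURNS.
    import openturns as ot
    import otmixmod

    ot.RandomGenerator.SetSeed(seed)
    dataXTrain = ot.Uniform().getSample(sampleSize)
    factory = otmixmod.MixtureFactory(numberOfClasses, "Gaussian_pk_Lk_C")
    return factory.buildAsMixture(dataXTrain)
```

Calling `find_failing_seeds(fit_mixture, range(100))` would then list the seeds that trigger the error. On a build where the exception escapes the OpenMP loop, the process may still crash at the first failing seed instead of raising.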

The Linux error message is a little clearer.

OT version =  1.23
OTmixmod version =  0.17
Traceback (most recent call last):
  File "/home/devel/Documents/", line 17, in <module>
    mixture, labels, logLike = otmixmod.MixtureFactory(10, covModel).buildAsMixture(agregatedData)
  File "/home/devel/miniconda3/envs/otmixmod/lib/python3.12/site-packages/otmixmod/", line 199, in buildAsMixture
    return _otmixmod.MixtureFactory_buildAsMixture(self, sample)
RuntimeError: All models got errors

mbaudin47 commented 3 weeks ago

According to OtherException.h#L64, the error is produced at ClusteringMain.cpp#L280. This is in an OpenMP loop, which is why you tried to disable this option. Is this an uncaught exception? Whatever the reason, this is an upstream bug, isn't it?

jschueller commented 4 days ago

ok with mixmod 2.1.11