NeuroTechX / moabb

Mother of All BCI Benchmarks
https://neurotechx.github.io/moabb/
BSD 3-Clause "New" or "Revised" License

Results not saved due to issues with get_digest #194

Closed jsosulski closed 2 years ago

jsosulski commented 3 years ago

The get_digest function yields identical hash values even when two classifier objects have different internal state via different lambda functions. https://github.com/NeuroTechX/moabb/blob/fd0ffe66572a273a6a2d19174a21cd9c4e4e94c9/moabb/analysis/results.py#L111 Short example:

pipeline_a = FancyClassifier(kernel=lambda d: 1/d**2)
pipeline_b = FancyClassifier(kernel=lambda d: 1/d**4)

As of now, after running moabb, the results file / dataframe contains two scores for pipeline_b, as get_digest returns the same hash for both pipelines. (The memory addresses identifying the different lambda functions are wiped, though I do not really know why.)

Is there a reason moabb does not simply use the pipeline description string? Or just the hash thereof? I.e.:

digest = get_digest(name) 
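A minimal sketch of that suggestion (`digest_from_name` is a hypothetical helper, not moabb's API): hashing the user-chosen pipeline name gives a stable digest that cannot collide for distinctly named pipelines.

```python
import hashlib

def digest_from_name(name):
    """Hash the user-facing pipeline name instead of the object repr.

    Hypothetical replacement for get_digest: pipelines registered under
    different names always get different digests, regardless of any
    lambda functions they hold internally.
    """
    return hashlib.md5(name.encode("utf-8")).hexdigest()

print(digest_from_name("pipeline_a"))
print(digest_from_name("pipeline_b"))  # differs from the previous digest
```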
sylvchev commented 3 years ago

This is a good question, and it may be related to the doc-generation problem of PR #174. I'll try to dig a little, but I think it is safe to switch to the pipeline name if the hash isn't working.

sylvchev commented 3 years ago

I could not reproduce your problem:

from moabb.analysis.results import get_digest

from mne.decoding import CSP
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis as LDA
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

pipelines = {}
pipelines["CSP+LDA"] = make_pipeline(CSP(n_components=8), LDA())  # reference
pipelines["CSP+LR"] = make_pipeline(CSP(n_components=8), LogisticRegression())  # different one
pipelines["csp+lda"] = make_pipeline(CSP(n_components=8), LDA())  # same classif, different name
pipelines["csp+lda2"] = make_pipeline(CSP(n_components=3), LDA(shrinkage='auto'))  # same classif, different param

still_working_1 = LDA()  # reference
still_working_2 = LogisticRegression()  # different one
still_working_3 = LDA()  # same one, different variable name
still_working_4 = LDA(shrinkage='auto')  # same classif, different parameter

for k in pipelines.keys():
    print(get_digest(pipelines[k]))

print()
for ppl in [still_working_1, still_working_2, still_working_3, still_working_4]:
    print(get_digest(ppl))

and I obtain something like:

fc7da40c87159d79f754c0d3687bc41b  # ref
173243258cf654f9574eba2aa2db0877  # different
fc7da40c87159d79f754c0d3687bc41b  # same parameter, different name, but the same hash
50ff561fa9649a72629547cca677c3ba  # different parameter, different name

7726943eab9cd77cc24191070dd6b8f3  # ref
5e93bafe4deb5def698059b1fb5c6c6c  # different
7726943eab9cd77cc24191070dd6b8f3  # same parameter, different var name, hash is the same
e275ac57e9b681147072665947ff91d3  # different parameter; this is the case you mention, but the hash is different
jsosulski commented 3 years ago

Here is an MWE (minimal working example) from my side:

from moabb.analysis.results import get_digest

class Classifier:
    def __init__(self, kernel):
        self.kernel = kernel

still_working_1 = Classifier(kernel=lambda x: x**2)
still_working_2 = Classifier(kernel=lambda x: 5*x)

print()
for ppl in [still_working_1, still_working_2]:
    print(get_digest(ppl))

This will print (on my machine):

6b00f7db88baff5a57a2c5ab23fba0fd
6b00f7db88baff5a57a2c5ab23fba0fd

I know the issue is caused by the wiping of memory addresses during get_string_rep, which was probably intended to avoid duplicate classifier entries when subsequent moabb runs allocate different memory locations.
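To illustrate the suspected mechanism (a sketch of the idea, not moabb's actual code): the reprs of two distinct lambdas differ only in their memory address, so stripping hex addresses makes them indistinguishable.

```python
import re

def strip_addresses(s):
    # Illustrative stand-in for the address wiping suspected in
    # get_string_rep: replace hex addresses like 0x7f3a2c1b0d30.
    return re.sub(r"0x[0-9a-fA-F]+", "0x__", s)

f = lambda x: x**2
g = lambda x: 5 * x

print(repr(f))  # e.g. <function <lambda> at 0x7f...>
print(repr(g))
# After stripping addresses the two reprs collapse to the same string:
print(strip_addresses(repr(f)) == strip_addresses(repr(g)))  # True
```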

sylvchev commented 3 years ago

Yes, get_string_rep may be what is causing your problem, but does this behavior change when you make a Classifier that inherits from scikit-learn's (BaseEstimator, ClassifierMixin) and defines some phony fit and predict functions?

jsosulski commented 3 years ago

Yes, I do have a custom LDA classifier inheriting from the scikit-learn estimators that shows the same behaviour. And I know that the issue comes from get_string_rep, as it removes memory addresses; see for example:

from moabb.analysis.results import get_string_rep

class Classifier:
    def __init__(self, kernel):
        self.kernel = kernel

still_working_1 = Classifier(kernel=lambda x: x**2)
still_working_2 = Classifier(kernel=lambda x: 5*x)

for ppl in [still_working_1, still_working_2]:
    print(repr(ppl))
    print(get_string_rep(ppl))

My points are:

  1. Should moabb even concern itself with whether we want to evaluate the same classifier twice? I think it should not.
  2. Even if it should, the current implementation does not skip the 'duplicate' classifier but pools the results of both together under the pipeline name of the latest one, which cannot be what we want, imo.
sylvchev commented 3 years ago

Interesting, I'm wondering whether the lambda expression isn't playing a role here. Anyway, I agree with you on the second point: MOABB should not pool the results of both classifiers. As for the first point, I'm not sure I follow your concern. There is an overwrite flag to indicate whether you want to re-evaluate your pipeline or reuse results from a previous run. I think it is good if we can save compute time by not re-running the same pipeline on the same dataset each time (for baselines, for example).

Do you have an idea of how we could avoid the pooling part?

jsosulski commented 3 years ago

> Interesting, I'm wondering whether the lambda expression isn't playing a role here. Anyway, I agree with you on the second point: MOABB should not pool the results of both classifiers. As for the first point, I'm not sure I follow your concern. There is an overwrite flag to indicate whether you want to re-evaluate your pipeline or reuse results from a previous run. I think it is good if we can save compute time by not re-running the same pipeline on the same dataset each time (for baselines, for example).

I do agree that moabb should save compute time and avoid re-calculating pipelines. However, I personally always set overwrite to True, because otherwise I would need to track whether I changed any paradigm parameters (frequency filtering / anything else), as moabb does not notice such changes.

> Do you have an idea of how we could avoid the pooling part?

Not really for now, as I usually avoid using cached results whenever possible and am not familiar with the specific behaviour. But we could raise an error / warning whenever a classifier whose constructor contains lambda functions is used with moabb. Note that as a temporary workaround you can always use an explicitly named function instead of a lambda, since its representation does not depend solely on the memory address. This just becomes tedious if you want to evaluate many differently parameterized kernels.
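The named-function workaround can be seen with the same address-stripping idea (`strip_addresses` here is an illustrative stand-in for what get_string_rep does, not moabb code):

```python
import re

def strip_addresses(s):
    # Illustrative stand-in for the address wiping in get_string_rep.
    return re.sub(r"0x[0-9a-fA-F]+", "0x__", s)

# Named functions keep their distinct names in repr(), so their string
# representations stay different even after addresses are removed.
def inverse_square(d):
    return 1 / d**2

def inverse_quartic(d):
    return 1 / d**4

a = strip_addresses(repr(inverse_square))
b = strip_addresses(repr(inverse_quartic))
print(a)  # e.g. <function inverse_square at 0x__>
print(b)  # e.g. <function inverse_quartic at 0x__>
print(a != b)  # True
```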

sylvchev commented 3 years ago

Yes, we could raise a warning if there is a lambda function.
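Such a check might look like this (`warn_if_lambda_params` is a hypothetical helper, not an existing moabb function; it assumes a scikit-learn style get_params(), falling back to the instance __dict__):

```python
import warnings

def warn_if_lambda_params(estimator):
    """Warn when any constructor parameter of an estimator is a lambda."""
    try:
        params = estimator.get_params()
    except AttributeError:
        params = vars(estimator)
    for name, value in params.items():
        if callable(value) and getattr(value, "__name__", "") == "<lambda>":
            warnings.warn(
                f"Parameter '{name}' is a lambda function: get_digest may "
                "not distinguish this pipeline from others, and results "
                "could be pooled incorrectly."
            )

class Classifier:
    def __init__(self, kernel):
        self.kernel = kernel

warn_if_lambda_params(Classifier(kernel=lambda x: x**2))  # emits a UserWarning
```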

Could it be interesting to use GridSearchCV to evaluate your different kernels instead of creating separate pipelines? Something along the lines of:

from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

param = {"kernel": [lambda x: x**2, lambda x: 5*x]}

SimpleKernel = GridSearchCV(Classifier(), param)

pipelines = {}
pipelines["opt_kern"] = Pipeline(steps=[('dumbkern', SimpleKernel)])
jsosulski commented 3 years ago

That would be possible if I only wanted to report the performance of the best-performing kernel. But then the moabb results would not show which kernel actually worked best, and I could not compare the classification performance of two different kernels.
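For completeness, the per-kernel cross-validation scores inside the search are still recorded in GridSearchCV's cv_results_, even though they would not surface in the moabb results file. A runnable sketch with a toy KernelClassifier (all names here are illustrative, not moabb code):

```python
import numpy as np
from sklearn.base import BaseEstimator, ClassifierMixin
from sklearn.model_selection import GridSearchCV

def inverse_square(d):
    return 1 / (d**2 + 1)

def linear(d):
    return 5 * d

class KernelClassifier(BaseEstimator, ClassifierMixin):
    """Toy classifier: thresholds the kernel of the first feature."""

    def __init__(self, kernel=None):
        self.kernel = kernel

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        return self

    def predict(self, X):
        return (self.kernel(X[:, 0]) > 0.5).astype(int)

rng = np.random.RandomState(0)
X = rng.rand(40, 2)
y = (X[:, 0] > 0.5).astype(int)

search = GridSearchCV(
    KernelClassifier(), {"kernel": [inverse_square, linear]}, cv=3
)
search.fit(X, y)

# cv_results_ holds one mean test score per candidate kernel, so the
# kernels can still be compared individually after the search:
for params, score in zip(
    search.cv_results_["params"], search.cv_results_["mean_test_score"]
):
    print(params["kernel"].__name__, score)
```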