TeamHG-Memex / sklearn-crfsuite

scikit-learn inspired API for CRFsuite

Inconsistent results #38

Open Joselinejamy opened 5 years ago

Joselinejamy commented 5 years ago

Hello, when I run the following code, I get different recall, precision and F1 scores on different runs over the same dataset.

import sklearn_crfsuite

crf = sklearn_crfsuite.CRF(
    algorithm='lbfgs',
    c1=0.1,
    c2=0.1,
    max_iterations=100,
    all_possible_transitions=True
)
crf.fit(X_train, y_train)

From one of the other issues I learned that this is expected, but is it possible to set a random seed to a constant value to get consistent results for the same input?

severinsimmler commented 4 years ago

+1

severinsimmler commented 4 years ago

This is related to https://github.com/TeamHG-Memex/sklearn-crfsuite/issues/9.

franklevasseur commented 4 years ago

@severinsimmler, I actually think this is not related to #9, as the two reported issues use different optimizers. Training with the lbfgs optimizer does not call the dataset_shuffle function.

liaeh commented 3 years ago

I'm also having this problem. It makes it very difficult to compare, e.g., the benefit of including or excluding different features.
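
Until a seed can be fixed, one possible workaround for comparing feature sets is to train each configuration several times and compare the mean and spread of the scores instead of single runs. This is only a minimal sketch: `train_and_score` is a hypothetical callable you would write yourself (it would fit a fresh CRF and return one metric value), and the helper names and the "gap larger than the spread" rule are assumptions for illustration, not part of sklearn-crfsuite.

```python
import statistics
from typing import Callable, Dict, List


def repeated_scores(train_and_score: Callable[[], float], n_runs: int = 5) -> List[float]:
    """Call the (hypothetical) train_and_score function n_runs times and collect the scores."""
    if n_runs < 2:
        raise ValueError("n_runs must be at least 2 to estimate a spread")
    return [train_and_score() for _ in range(n_runs)]


def summarize(scores: List[float]) -> Dict[str, float]:
    """Return the mean and sample standard deviation of a list of scores."""
    return {"mean": statistics.mean(scores), "stdev": statistics.stdev(scores)}


def clearly_different(a: List[float], b: List[float]) -> bool:
    """Crude heuristic (an assumption, not a statistical test):
    the gap between the means exceeds the larger of the two standard deviations."""
    sa, sb = summarize(a), summarize(b)
    return abs(sa["mean"] - sb["mean"]) > max(sa["stdev"], sb["stdev"])
```

In practice, `train_and_score` would build the feature set under test, fit the `CRF` from the original post on it, and return a single number such as an F1 score on held-out data. If two feature sets are not `clearly_different` under this rule, the observed gap may be just run-to-run noise.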