Closed Ablesius closed 2 years ago
This makes the following possible:
import os
import pandas as pd
from nlubridge.vendors import TfidfIntentClassifier
from nlubridge import NLUdataset
dataset = NLUdataset(texts, intents)
dataset = dataset.shuffle()
classifier = TfidfIntentClassifier()
# stratification is configurable; omit the argument to keep the previous default behaviour
train, test = dataset.train_test_split(test_size=0.25, random_state=0, stratification=None)
classifier = classifier.train_intent(train)
predicted = classifier.test_intent(test)
res = pd.DataFrame(list(zip(test.intents, predicted)), columns=['true', 'predicted'])
Passing stratification=None disables stratification, and the code still works.
This commit is backwards-compatible: the default behaviour is unchanged, but the method's stratification parameter now lets you pass any value accepted by the stratify parameter of sklearn's train_test_split.
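
For reviewers less familiar with the underlying sklearn parameter, here is a minimal sketch of what `stratify` does when called on sklearn directly. It does not use nlubridge at all; the toy `texts` and `intents` lists are made up for illustration, and how `NLUdataset.train_test_split` forwards its `stratification` argument internally is not shown here.

```python
from collections import Counter

from sklearn.model_selection import train_test_split

# Toy, imbalanced data (illustrative only): 16 "greet" and 4 "bye" utterances.
texts = [f"utterance {i}" for i in range(20)]
intents = ["greet"] * 16 + ["bye"] * 4

# stratify=None: a plain random split, class proportions are not guaranteed.
_, texts_test_plain, _, intents_test_plain = train_test_split(
    texts, intents, test_size=0.25, random_state=0, stratify=None
)

# stratify=intents: the test split keeps the 4:1 class ratio of the full data.
_, texts_test_strat, _, intents_test_strat = train_test_split(
    texts, intents, test_size=0.25, random_state=0, stratify=intents
)

print("plain:     ", Counter(intents_test_plain))
print("stratified:", Counter(intents_test_strat))
```

With stratification on, the five test samples always contain four "greet" and one "bye"; with `stratify=None` the mix depends on the random seed.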