Configure stratification in train_test_split

train_test_split now accepts a stratification argument, which makes the following possible:

import pandas as pd
from nlubridge import NLUdataset
from nlubridge.vendors import TfidfIntentClassifier

# texts and intents are assumed to be parallel lists defined elsewhere
dataset = NLUdataset(texts, intents)
dataset = dataset.shuffle()
classifier = TfidfIntentClassifier()

# stratification is now configurable; if you don't set it, the split behaves as before.
# Passing None disables stratification.
train, test = dataset.train_test_split(test_size=0.25, random_state=0, stratification=None)

classifier = classifier.train_intent(train)
predicted = classifier.test_intent(test)
res = pd.DataFrame(list(zip(test.intents, predicted)), columns=['true', 'predicted'])

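Stratification can now be disabled by passing stratification=None, and the rest of the code (training, testing, building the result frame) still works unchanged.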
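
For reviewers who want to see what the switch changes in practice, here is a minimal, standard-library-only sketch of a stratified versus an unstratified split. It is not nlu-bridge's implementation: `split_indices` and its `stratify` flag are hypothetical names made up for this illustration, and the toy intent labels are invented.

```python
import random
from collections import Counter, defaultdict


def split_indices(labels, test_size=0.25, random_state=0, stratify=True):
    """Return (train_idx, test_idx) for a list of labels.

    Hypothetical helper for illustration only; not part of nlu-bridge.
    """
    rng = random.Random(random_state)

    if not stratify:
        # Plain random split: class proportions in the test set may drift.
        idx = list(range(len(labels)))
        rng.shuffle(idx)
        n_test = round(len(idx) * test_size)
        return sorted(idx[n_test:]), sorted(idx[:n_test])

    # Stratified split: draw the same fraction from every class.
    by_label = defaultdict(list)
    for i, label in enumerate(labels):
        by_label[label].append(i)

    train_idx, test_idx = [], []
    for members in by_label.values():
        rng.shuffle(members)
        n_test = round(len(members) * test_size)
        test_idx.extend(members[:n_test])
        train_idx.extend(members[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Toy data: 8 "greet" examples and 4 "bye" examples.
intents = ["greet"] * 8 + ["bye"] * 4

train_idx, test_idx = split_indices(intents, stratify=True)
stratified_counts = Counter(intents[i] for i in test_idx)

train_u, test_u = split_indices(intents, stratify=False)
unstratified_counts = Counter(intents[i] for i in test_u)
```

With stratification on, the test partition keeps the 2:1 ratio of the full dataset (two "greet", one "bye"). With it off, the test partition still has three examples, but how they divide between the two intents depends on the shuffle.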

telekom / nlu-bridge

Configure stratification in train_test_split #9