facebookresearch / fastText

Library for fast text representation and classification.
https://fasttext.cc/
MIT License

Optimal params after autotune #913

Open phongvis opened 4 years ago

phongvis commented 4 years ago

After running model = fasttext.train_supervised(input='cooking.train', autotuneValidationFile='cooking.valid'), does the model object contain information about the optimal parameters? According to the documentation, the model is retrained with the optimal parameters. However, when I inspect the parameters through model, I only see default values, e.g. for lr, epoch, minCount, etc.
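For reference, here is a minimal sketch of what I am doing (cooking.train / cooking.valid are the files from the fastText tutorial, and I am assuming the model object exposes its arguments as plain attributes, which is how I inspected them):

import fasttext

# Autotune hyperparameters against the held-out validation file.
model = fasttext.train_supervised(
    input='cooking.train',
    autotuneValidationFile='cooking.valid',
)

# These attributes still report the defaults (e.g. lr=0.1, epoch=5),
# not the values the autotuner actually selected.
print(model.lr, model.epoch, model.minCount)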

Another related question: does the model retrain with a merge of train and validation sets or just a train set?

Many thanks.

Celebio commented 4 years ago

Hi @phongvis, you are right: the best parameters found are not reflected back in the model object. We will look for a clear way to communicate the best parameters found to the user. Thank you for your feedback.

Another related question: does the model retrain with a merge of train and validation sets or just a train set?

It retrains with the train set.

Best regards, Onur

phongvis commented 4 years ago

Thank you for your prompt reply. I look forward to that new feature.

ChiragSoni95 commented 4 years ago

Yes, I agree. It would be nice if we could view the optimal hyperparameters after auto-tuning on the validation file. Looking forward to it!

Allenlaobai7 commented 4 years ago

#887

arimbr commented 4 years ago

Hi @Celebio. I found a way to get the model parameters with the Python bindings by inspecting the model object with model.f.getArgs().

I use the following code in this repository to retrain the model on all the data.

import multiprocessing

# Label prefix used in the training data; '__label__' is fastText's default.
LABEL_SEPARATOR = '__label__'

# Default supervised training parameters (the values used when training
# without autotune); the keys double as the list of arguments to read back.
train_parameters = {
    'lr': 0.1,
    'dim': 100,
    'ws': 5,
    'epoch': 5,
    'minCount': 1,
    'minCountLabel': 0,
    'minn': 0,
    'maxn': 0,
    'neg': 5,
    'wordNgrams': 1,
    'loss': 'softmax',
    'bucket': 2000000,
    'thread': multiprocessing.cpu_count() - 1,
    'lrUpdateRate': 100,
    't': 1e-4,
    'label': LABEL_SEPARATOR,
    'verbose': 2,
    'pretrainedVectors': '',
    'seed': 0,
}

def get_model_parameters(model):
    """Read the arguments stored in a trained model (including the values
    chosen by autotune) back into a plain dict."""
    args_getter = model.f.getArgs()

    parameters = {}
    for param in train_parameters:
        attr = getattr(args_getter, param)
        if param == 'loss':
            # loss is exposed as an enum; keep only its name, e.g. 'softmax'
            attr = attr.name
        parameters[param] = attr

    return parameters
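As a rough sketch of how I use this helper (file names are placeholders; cooking.all is simply the concatenation of the train and validation files):

import fasttext

# Autotune on the validation file, then read the tuned arguments back out.
model = fasttext.train_supervised(
    input='cooking.train',
    autotuneValidationFile='cooking.valid',
)
best_parameters = get_model_parameters(model)

# Retrain on all of the data with the autotuned values.
final_model = fasttext.train_supervised(input='cooking.all', **best_parameters)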