deeppavlov / intent_classifier

Apache License 2.0
Trained on my own data with 3 classes, but classification works for only one intent #8

Open sekharBuddha opened 6 years ago

sekharBuddha commented 6 years ago

Hello there. I cloned your repo, added my own train.csv file, and trained on my data with the Reddit fastText model. Now I am testing with ./intent_classifier.py config.json, but even when I enter questions taken from my training data it cannot classify their intents: every time it shows only one class.

Here is my train.csv. I haven't modified your config files. Can you help me, please?

request,applyForLoan,PostPaidSimcard,PrePaidSimcard
can i apply for a loan,1,0,0
how to apply for a loan,1,0,0
where can i apply for loan,1,0,0
different ways to apply for a loan,1,0,0
can you help me to apply for loan,1,0,0
can you tell me what is a post paid sim card,0,1,0
what is post paid sim card,0,1,0
where can i apply for a post paid sim card,0,1,0
how to apply for post paid sim card,0,1,0
What is the status of my post paid sim card,0,1,0
can you tell me what is a pre paid sim card,0,0,1
what is pre paid sim card,0,0,1
where can i apply for a pre paid sim card,0,0,1
how to apply for pre paid sim card,0,0,1
What is the status of my pre paid sim card,0,0,1
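
Before tuning anything, it can help to confirm that a file in this layout parses as intended and that every class has examples. The following is a minimal sketch using only the Python standard library; the `class_counts` helper and the shortened `SAMPLE` text are illustrative and not part of the repository.

```python
import csv
import io

# Shortened excerpt in the same layout as the train.csv above:
# first column is the text, the remaining columns are 0/1 class flags.
SAMPLE = """request,applyForLoan,PostPaidSimcard,PrePaidSimcard
can i apply for a loan,1,0,0
how to apply for a loan,1,0,0
what is post paid sim card,0,1,0
what is pre paid sim card,0,0,1
"""


def class_counts(csv_text):
    """Count how many rows are flagged for each label column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    label_columns = reader.fieldnames[1:]  # everything after the text column
    counts = {name: 0 for name in label_columns}
    for row in reader:
        for name in label_columns:
            counts[name] += int(row[name])
    return counts


counts = class_counts(SAMPLE)
for name, n in counts.items():
    print(f"{name}: {n} example(s)")
```

With the full file from this issue, each class would have five examples, which is very little data for a neural classifier.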

dilyararimovna commented 6 years ago

This repo is no longer supported, since intent classification is now part of the open-source DeepPavlov library.

Anyway, since you asked for help: the only problem is that you did not tune the model's parameters. This is very important, because the provided config.json is appropriate for the data it was written for, while you are training a model on much less data (only a handful of examples). Try these parameters:

{
  "model_path": "./cnn_model_0",
  "kernel_sizes_cnn": "1 2 3",
  "filters_cnn": 32,
  "embedding_size": 100,
  "lear_metrics": "binary_accuracy fmeasure",
  "confident_threshold": 0.5,
  "model_from_saved": false,
  "optimizer": "Adam",
  "lear_rate": 0.01,
  "lear_rate_decay": 0.001,
  "loss": "binary_crossentropy",
  "fasttext_model": "./reddit_fasttext_model.bin",
  "module": "fasttext",
  "text_size": 10,
  "coef_reg_cnn": 1e-2,
  "coef_reg_den": 1e-2,
  "dropout_rate": 0.5,
  "epochs": 100,
  "dense_size": 10,
  "model_name": "cnn_model",
  "batch_size": 4,
  "val_every_n_epochs": 1,
  "verbose": true,
  "val_patience": 2,
  "show_examples": true
}
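
Two fields in this config are easy to misread. The sketch below illustrates one plausible reading, stated as an assumption rather than taken from the repository code: `text_size` is the number of tokens each input is truncated or padded to, and `val_patience` is the number of consecutive validation checks without improvement that are tolerated before training stops. Both helper functions are hypothetical.

```python
def pad_or_truncate(tokens, text_size, pad_token="<pad>"):
    """Cut a token list to text_size items, or pad it up to that length."""
    kept = tokens[:text_size]
    return kept + [pad_token] * (text_size - len(kept))


def epochs_run(val_losses, val_patience):
    """Return how many epochs complete before patience-based early stopping.

    Assumes validation runs after every epoch and that "improvement" means
    a validation loss strictly lower than the best one seen so far.
    """
    best = float("inf")
    checks_without_improvement = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            checks_without_improvement = 0
        else:
            checks_without_improvement += 1
            if checks_without_improvement >= val_patience:
                return epoch
    return len(val_losses)


padded = pad_or_truncate("what is pre paid sim card".split(), text_size=10)
stopped_at = epochs_run([0.9, 0.7, 0.8, 0.75, 0.6], val_patience=2)
print(padded)
print(stopped_at)
```

Under this reading, a small `val_patience` stops training soon after the validation loss stops improving, which limits overfitting when there are only a few examples.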

sekharBuddha commented 6 years ago

Thank you very much for your support, it works perfectly now. I have one question, though: how did you tune the config file? Can you explain the process you used to tune the parameters? I am really happy with your replies.

dilyararimovna commented 6 years ago

I reduced the number of trainable parameters (the sizes of the layers), because you have too few examples to train a bigger model, and increased the regularization coefficients for the convolutional and dense layers. I also increased the number of epochs and the frequency of validation, and reduced the validation patience.
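
To get a feel for how much the layer sizes matter, here is a rough sketch that estimates the trainable parameter count. It assumes an architecture of parallel 1D convolutions (one per kernel size) over the embedded text, global pooling, one dense layer, and an output layer; batch normalization and other details are ignored, so the numbers are approximate. The `LARGER` configuration is a hypothetical comparison point, not a value quoted from the repository.

```python
def estimate_trainable_params(kernel_sizes, filters, embedding_size,
                              dense_size, n_classes):
    """Approximate weight + bias count for the assumed CNN layout."""
    # Each convolution has kernel_size * embedding_size * filters weights
    # plus one bias per filter.
    conv = sum(k * embedding_size * filters + filters for k in kernel_sizes)
    # Pooled outputs of all convolutions are concatenated before the dense layer.
    pooled_width = len(kernel_sizes) * filters
    dense = pooled_width * dense_size + dense_size
    output = dense_size * n_classes + n_classes
    return conv + dense + output


# Values from the config suggested in this thread.
SUGGESTED = dict(kernel_sizes=[1, 2, 3], filters=32, embedding_size=100,
                 dense_size=10, n_classes=3)
# Hypothetical larger model, for comparison only.
LARGER = dict(kernel_sizes=[1, 2, 3], filters=256, embedding_size=100,
              dense_size=100, n_classes=3)

small = estimate_trainable_params(**SUGGESTED)
large = estimate_trainable_params(**LARGER)
print(f"suggested config: about {small} parameters")
print(f"larger config:    about {large} parameters")
```

Under these assumptions the suggested config has roughly twenty thousand trainable parameters and the larger one more than ten times as many, which is a lot to fit from fifteen training sentences.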

sekharBuddha commented 6 years ago

OK. What if I have more than 50 intent classes? How should I change the parameters? I am really confused about these numbers (should I just go with a trial-and-error method?). Please help me out. One more question: what are text size and validation patience?

dilyararimovna commented 6 years ago

I think you would be better off using this library: https://github.com/deepmipt/DeepPavlov. It also has good enough docs (see http://docs.deeppavlov.ai/en/latest/components/classifiers.html), which explain the parameters.

Tuning parameters is not a simple process and cannot be explained briefly. You can try parameter evolution (see http://docs.deeppavlov.ai/en/latest/intro/parameters_evolution.html) or change the parameters by hand while monitoring the target metric.
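
Changing parameters by hand while watching a target metric can be organized as a small exhaustive search. The sketch below is one way to do that under stated assumptions: `grid_search` is an illustrative helper, and `toy_evaluate` is a placeholder that would have to be replaced with code that trains the classifier with the given parameters and returns its validation score. Neither is part of this repository or of DeepPavlov.

```python
import itertools


def grid_search(grid, evaluate):
    """Try every combination in grid and keep the highest-scoring one."""
    names = sorted(grid)
    best_params, best_score = None, float("-inf")
    for values in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = evaluate(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_evaluate(params):
    """Placeholder metric; swap in real training plus validation scoring."""
    return (-abs(params["filters_cnn"] - 32)
            - 100 * abs(params["lear_rate"] - 0.01))


# Candidate values for two of the config fields discussed in this thread.
GRID = {"filters_cnn": [16, 32, 64], "lear_rate": [0.1, 0.01, 0.001]}

best_params, best_score = grid_search(GRID, toy_evaluate)
print(best_params)
```

The number of combinations grows multiplicatively with each added parameter, so in practice it is usually worth varying only two or three fields at a time.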