Why am I getting poor prediction results?

RubixML / Sentiment

An example project using a feed-forward neural network for text sentiment classification trained with 25,000 movie reviews from the IMDB website.

https://rubixml.com

MIT License


Why am I getting poor prediction results? #5

Open harryqt opened 3 years ago

harryqt commented 3 years ago

harry@ubuntu-server:~/sentiment$ php74 predict.php
Enter some text to analyze:
rubixml is great
The sentiment is: negative

harry@ubuntu-server:~/sentiment$ php74 predict.php
Enter some text to analyze:
Rubix ML is really great
The sentiment is: positive

andrewdalpino commented 3 years ago

Hahahhaa hilarious @Dibbyo456 ... ohhhhhh machine learning you! But yeah there can be many reasons for poor real-world performance even on seemingly obvious (for us) examples. What is the validation score your model achieves after training? Were you able to visualize training loss and validation score at each epoch?

The first thing I'd do is try to identify whether your model is underfit or overfit. Also note that the model is trained on movie reviews (usually a couple of paragraphs long) and is therefore somewhat biased (see selection bias) toward longer blobs of text, so very short inputs like yours are harder for it.
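As a rough illustration of the underfit/overfit check, here is a minimal, library-agnostic sketch in Python. It is not part of the project's scripts; the function name `diagnose_fit`, the `min_score` threshold, and the `patience` window are all hypothetical choices made for the example. It assumes you have one training loss and one validation score recorded per epoch.

```python
from typing import Sequence


def diagnose_fit(
    train_losses: Sequence[float],
    val_scores: Sequence[float],
    min_score: float = 0.8,  # assumed "good enough" validation score
    patience: int = 3,       # assumed number of epochs without improvement
) -> str:
    """Heuristic label for a training run: 'overfit', 'underfit', or 'ok'."""
    if not train_losses or len(train_losses) != len(val_scores):
        raise ValueError("need exactly one loss and one score per epoch")

    # Epoch at which the validation score peaked (first peak on ties).
    best_epoch = max(range(len(val_scores)), key=val_scores.__getitem__)
    epochs_since_best = len(val_scores) - 1 - best_epoch

    # Did the training loss keep dropping after the validation peak?
    loss_still_falling = train_losses[-1] < train_losses[best_epoch]

    # Validation stalled or got worse while training loss kept improving.
    if epochs_since_best >= patience and loss_still_falling:
        return "overfit"

    # The model never reached an acceptable validation score.
    if val_scores[best_epoch] < min_score:
        return "underfit"

    return "ok"
```

For example, a run whose validation score peaks early and then drifts down while the training loss keeps shrinking would be labeled "overfit", which usually points toward more regularization or earlier stopping.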

Did you modify the training script at all or did you stick with the default settings?

harryqt commented 3 years ago

> Hahahhaa hilarious.... ohhhhhh machine learning you!

🤣

> Did you modify the training script at all or did you stick with the default settings?

The only change I made was switching AdaMax to Adam because of the Tensor extension issue described in #4.

As for all the other questions you asked, I have no idea, but I will try to figure them out, hopefully.

harryqt commented 3 years ago

[Screenshot: msedge_2021-05-21_10-02-24]

[Screenshot: msedge_2021-05-21_10-02-33]

andrewdalpino commented 3 years ago

Sounds good, @Dibbyo456. I reminded the Zephir team of the issue just now. Unfortunately, due to a lack of resources, we haven't had a chance to improve the demo model on rubixml.com in a very long time. It's likely that someone with some time to spare and access to compute will be able to create a more accurate model.

Let us know how well your model performs during cross-validation (CV) and training, and I'll be happy to help you further from there.
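For readers unfamiliar with CV, the splitting step of k-fold cross-validation can be sketched as follows. This is a generic Python illustration, not the project's own validation code; `k_fold_indices` is a hypothetical helper, and it assumes the samples have already been shuffled.

```python
from typing import List, Tuple


def k_fold_indices(n_samples: int, k: int) -> List[Tuple[List[int], List[int]]]:
    """Return k (train_indices, test_indices) pairs over range(n_samples)."""
    if k < 2 or k > n_samples:
        raise ValueError("k must be between 2 and n_samples")

    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]

    folds = []
    start = 0
    for size in fold_sizes:
        stop = start + size
        test = list(range(start, stop))
        # Everything outside the current test fold is used for training.
        train = list(range(0, start)) + list(range(stop, n_samples))
        folds.append((train, test))
        start = stop
    return folds
```

Each sample lands in exactly one test fold, so training and scoring once per fold and averaging the k scores gives an estimate that depends less on any single split.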