yangheng95 / PyABSA

Sentiment Analysis, Text Classification, Text Augmentation, Text Adversarial defense, etc.;
https://pyabsa.readthedocs.io
MIT License

If a review contains numbers or emojis, it's not generating any entities #214

Closed ImSanjayChintha closed 2 years ago

ImSanjayChintha commented 2 years ago

I am applying the PyABSA package to Amazon mobile phone reviews, and it does not generate any aspects when the review contains numbers or emojis.

For example: iPhone 12. Best phone 😍 Genuine product thanks a lot amazon I purchase this divice 20 jan 2022 almost work fine. Best one

For the review above and similar ones, it does not generate entities with sentiment. I would really appreciate it if this issue could be resolved.

yangheng95 commented 2 years ago

Actually I test it and it works fine, just like:

image image

You can test it here: https://huggingface.co/spaces/yangheng/Multilingual-Aspect-Based-Sentiment-Analysis

yangheng95 commented 2 years ago

Please try updating to the latest version, which may resolve this problem.

ImSanjayChintha commented 2 years ago

There are also some long reviews for which we get no entities, such as the one below: "After Long use I am facing following issues After Long use I am facing following issues. 1. Battery: it’s totally disappointed. Not even run any games recently daily use like Instagram, Facebook, YouTube, WhatsApp it self battery drops rapidly. Maximum 5-6 hrs of battery life that’s it. Every time I wanted to charge my phone. 2. Screen dimming: This phone is not good for gaming. Performance is best no issue even running games in highest graphics but phone gets heat and display brightness gets low even my brightness at 100%. So I uninstalled some games. 3. Call failed: this issue I faced lots of time. While my phone connected to Wi-Fi after sometime my mom tries to call me but she can’t. For my side also it will show the error like call failed even my sim signal is full. Same happens after restarting my phone."

Please try the one above. It generates nothing.

ImSanjayChintha commented 2 years ago

By any chance, can we retrain the model with Amazon mobile reviews?

yangheng95 commented 2 years ago

Yes, you can retrain the model based on existing checkpoints: https://github.com/yangheng95/PyABSA/blob/release/demos/aspect_term_extraction/train_atepc_based_on_checkpoint.py

yangheng95 commented 2 years ago

For the long review you posted: you may have noticed the tip on the demo page saying not to input text longer than 80 words. You can set a longer max_seq_len (256 or 512) to overcome this problem.
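Editorial illustration of the 80-word tip above: a minimal sketch, not taken from PyABSA or from this thread, of one possible workaround that splits a long review into pieces of at most 80 whitespace-separated words along naive sentence boundaries, so that each piece could be analysed on its own. The helper name `split_review` and the `max_words` parameter are hypothetical, and the regex-based sentence split is an assumption that will also break after list markers such as "1.".

```python
import re


def split_review(text, max_words=80):
    """Group sentences into chunks of at most max_words whitespace-separated words.

    Hypothetical helper for illustration only; it is not part of PyABSA.
    """
    # Naive sentence split: break after '.', '!' or '?' when followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks, current, count = [], [], 0
    for sentence in sentences:
        words = sentence.split()
        # A single sentence longer than the limit is cut into fixed-size pieces.
        pieces = [words[i:i + max_words] for i in range(0, len(words), max_words)]
        for piece in pieces:
            # Start a new chunk when adding this piece would exceed the limit.
            if current and count + len(piece) > max_words:
                chunks.append(" ".join(current))
                current, count = [], 0
            current.extend(piece)
            count += len(piece)
    if current:
        chunks.append(" ".join(current))
    return chunks
```

Each returned chunk could then be passed to the aspect extractor separately, at the cost of losing context that spans chunk boundaries.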

ImSanjayChintha commented 2 years ago

Yes, I have tried max_seq_len=10000, but it's not working :-(

yangheng95 commented 2 years ago

Did you retrain the model using the new max_seq_len? Also, the maximum is 512.

ImSanjayChintha commented 2 years ago

No, I used the same model. Can we exceed the limit with the existing model? I am not an expert and am not sure how to retrain the model.

yangheng95 commented 2 years ago

No. If you want to set a larger max_seq_len, please train your own model, which is time-consuming. A max_seq_len greater than 512 may be reset to 512, depending on the transformers package.
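Editorial illustration of the point above: a minimal sketch, assuming the 512 limit stated in this thread, of why requesting max_seq_len=10000 would not help. The names `MODEL_MAX_LEN`, `effective_max_seq_len` and `truncate_tokens` are hypothetical and are not PyABSA or transformers APIs; the actual clamping behaviour depends on the installed packages.

```python
MODEL_MAX_LEN = 512  # limit stated in this thread; an assumption, not read from any library


def effective_max_seq_len(requested, model_max=MODEL_MAX_LEN):
    """Return the sequence length that would actually apply if larger requests are clamped."""
    return min(requested, model_max)


def truncate_tokens(tokens, requested):
    """Keep only the tokens that fit; anything beyond the effective limit is dropped."""
    return tokens[:effective_max_seq_len(requested)]
```

Under this assumption, a request for 10000 behaves the same as a request for 512, and a checkpoint trained with a shorter length would still need retraining to benefit from a longer one.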

ImSanjayChintha commented 2 years ago

Is there a dataset I can look at, so that I can prepare my own data in the same format for training? A phone dataset in particular?

yangheng95 commented 2 years ago

https://github.com/yangheng95/ABSADatasets/tree/v1.2/datasets/atepc_datasets/102.Chinese/107.phone

yangheng95 commented 2 years ago

There is no English phone dataset yet.

ImSanjayChintha commented 2 years ago

Then how is this demo running with a phone dataset and producing exactly the expected output?

https://huggingface.co/spaces/yangheng/Multilingual-Aspect-Based-Sentiment-Analysis

yangheng95 commented 2 years ago

These datasets are just examples used for random sampling and have no relation to the checkpoints. In other words, all languages use the same checkpoint. Because this checkpoint is trained on the multilingual (all) dataset from ABSADatasets, it performs well. However, it is not trained on an English phone dataset.

ImSanjayChintha commented 2 years ago

Oh, OK. Thank you so much for clarifying. Can you please share the code showing how to use the multilingual checkpoint?

yangheng95 commented 2 years ago

https://huggingface.co/spaces/yangheng/Multilingual-Aspect-Based-Sentiment-Analysis/blob/main/app.py