Text Summarization Technique for the Employee Feedback Box

sayantikabanik / capstone_isb

Using a combination of multiple facets of Health, Engagement and Productivity to estimate occupational burnout

Text Summarization Technique for the Employee Feedback Box #27

Closed AnizD closed 2 years ago

AnizD commented 2 years ago

Hi! I'd like you all to have a look at the quality of the output 👇🏼. The data fed in: all Review Text for 1) Infosys, 2) Infosys Pune and 3) Infosys Pune Technology Analysts. For the Feedback box, Edmundson looks better than the others. Also, do look at the bonus words and stigma words. Let me know your thoughts on the chosen technique.

Please refer to the Python code under Experiments: Text Summarization - Sumy Package - Infosys.ipynb

sayantikabanik commented 2 years ago

https://github.com/sayantikabanik/capstone_isb/blob/main/experiments/Text%20Summarization%20-%20Sumy%20Package%20-%20Infosys.ipynb

Link to file ☝🏽

sayantikabanik commented 2 years ago

Some points to note here:

- Bonus words
These are words that point towards important sentences. They may include superlatives, adverbs, etc.
- Stigma words
These are words that have a negative effect on sentence importance. They include anaphoric expressions, belittling expressions, etc. (We may expect the machine to treat them as important, but they are not.)
- Null words
These are words that are neutral or irrelevant to sentence importance. They are much like stopwords.

From the code, I don't think words like "was" and "this" should count as stigma words unless there is some evidence that they carry negative weight.
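
To make the three word classes concrete, here is a minimal sketch of cue-word scoring in plain Python. It is a simplified illustration, not sumy's actual Edmundson implementation: the function names `cue_score` and `summarize`, the +1/-1 weights and the sample word lists are all assumptions made up for this example.

```python
import re


def cue_score(sentence, bonus_words, stigma_words, null_words):
    """Toy cue score: +1 per bonus word, -1 per stigma word, 0 for the rest.

    The +1/-1 weights are an assumption for illustration only.
    """
    tokens = re.findall(r"[a-z']+", sentence.lower())
    score = 0
    for token in tokens:
        if token in null_words:
            continue  # null words never change the score
        if token in bonus_words:
            score += 1
        elif token in stigma_words:
            score -= 1
    return score


def summarize(text, bonus_words, stigma_words, null_words, sentences_count=1):
    """Return the highest-scoring sentences, kept in their original order."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    ranked = sorted(
        range(len(sentences)),
        key=lambda i: cue_score(sentences[i], bonus_words, stigma_words, null_words),
        reverse=True,
    )
    keep = sorted(ranked[:sentences_count])
    return [sentences[i] for i in keep]


# Hypothetical word lists and review text, invented for this sketch.
bonus = {"best", "excellent"}
stigma = {"hardly"}
null = {"this", "was", "the"}
text = "Work life balance is excellent. This was hardly a problem. The cab is free."
summary = summarize(text, bonus, stigma, null, sentences_count=1)
```

In this sketch, "was" and "this" sit in the null list, so they leave the score unchanged. They would only pull a sentence down if they were moved into the stigma list.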

sayantikabanik commented 2 years ago

Do check the default dictionary corpus of null and stigma words for Edmundson in sumy.

AnizD commented 2 years ago

I have updated the bonus and stigma words, based on the top words/bigrams/trigrams in the corpus (Review Text, Pros, Cons).

Link to the latest version of the code: https://github.com/sayantikabanik/capstone_isb/blob/main/experiments/Text%20Summarization%20-%20Sumy%20Package%20-%20Infosys_V2.ipynb
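
For readers following along, one way to surface the top words, bigrams and trigrams is a simple frequency count. The sketch below uses only the standard library and is not taken from the notebook; the helper name `top_ngrams`, its parameters and the sample reviews are hypothetical.

```python
import re
from collections import Counter


def top_ngrams(texts, n=1, k=10, stop_words=frozenset()):
    """Count word n-grams across all texts and return the k most frequent.

    Stop words are dropped before n-grams are formed, so words that were
    separated by a stop word can end up adjacent.
    """
    counts = Counter()
    for text in texts:
        tokens = [
            t for t in re.findall(r"[a-z']+", text.lower()) if t not in stop_words
        ]
        counts.update(
            " ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)
        )
    return counts.most_common(k)


# Made-up review snippets, standing in for the Review Text / Pros / Cons columns.
reviews = [
    "good work life balance",
    "work life balance is good",
    "poor work culture",
]
top_words = top_ngrams(reviews, n=1, k=1)
top_trigrams = top_ngrams(reviews, n=3, k=1)
```

The frequent n-grams are only candidates: someone still has to decide which ones signal an important sentence (bonus) and which ones signal an unimportant one (stigma).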

sayantikabanik commented 2 years ago

LGTM, thanks @AnizD