dstein64 / highlight

A browser extension for automatically highlighting the important content on article pages.
MIT License

Add details to README.md about how exactly this extension works #13

Open: Korb opened this issue 3 months ago

Korb commented 3 months ago

At the moment, the README does not explain what "the important content" means, or what criteria the extension uses to find that content in the text of a web page.

dstein64 commented 3 months ago

I implemented this over 9 years ago and don't recall the details, so I browsed the code to review the algorithm.

A sentence's importance is calculated by assigning a score to each word in the sentence and summing those scores. A word's score is based on its frequency throughout the document, with more frequent words scoring higher. The score of a long sentence is then reduced, to offset the advantage it gets from simply containing more words.

https://github.com/dstein64/highlight/blob/3bf1319748af0760d901853b2719b3549cb6d542/src/content.js#L1067-L1186
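For readers who want a concrete picture, here is a minimal sketch of the scoring idea described above. It is not the code from `content.js`: the function names, the naive sentence splitting, the use of the raw word count as a word's score, and the square-root length penalty are all assumptions made for illustration. The linked source remains the reference for what the extension actually does.

```javascript
// Illustrative sketch only; names and formulas here are assumptions,
// not taken from the extension's content.js.

// Naively split text into sentences at '.', '!' or '?'.
function splitSentences(text) {
  const parts = text.match(/[^.!?]+[.!?]*/g) || [];
  return parts.map((s) => s.trim()).filter((s) => s.length > 0);
}

// Lowercase a sentence and extract its word tokens.
function tokenize(sentence) {
  return sentence.toLowerCase().match(/[a-z0-9']+/g) || [];
}

// Count how often each word occurs across the whole document.
function wordFrequencies(sentences) {
  const freq = new Map();
  for (const sentence of sentences) {
    for (const word of tokenize(sentence)) {
      freq.set(word, (freq.get(word) || 0) + 1);
    }
  }
  return freq;
}

// Sum the per-word scores, then reduce the total for long sentences.
// Assumption: a word's score is its document frequency, and the
// length penalty divides by the square root of the word count.
function scoreSentence(sentence, freq) {
  const words = tokenize(sentence);
  if (words.length === 0) return 0;
  let sum = 0;
  for (const word of words) {
    sum += freq.get(word) || 0;
  }
  return sum / Math.sqrt(words.length);
}

// Score every sentence and return them ordered from highest to lowest.
function rankSentences(text) {
  const sentences = splitSentences(text);
  const freq = wordFrequencies(sentences);
  return sentences
    .map((sentence) => ({ sentence, score: scoreSentence(sentence, freq) }))
    .sort((a, b) => b.score - a.score);
}
```

Calling `rankSentences(articleText)` returns the sentences with their scores, highest first; a highlighter built on this sketch could then mark the top few. Without some length penalty, a long sentence would outrank a short one simply by containing more words, which is the effect the reduction is meant to offset.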