sethblack / python-seo-analyzer

An SEO tool that analyzes the structure of a site, crawls the site, counts words in the body of the site, and warns of any technical SEO issues.

Keyword Analysis #70

Open mpbunch opened 3 years ago

mpbunch commented 3 years ago

Describe the bug

Words in the keyword analysis list often appear with their trailing characters missing. It doesn't happen every time, but frequently enough that I noticed it.

To Reproduce

Steps to reproduce the behavior:

  1. Run the analyzer from Python or the CLI
  2. Navigate to the keyword section of the HTML output
  3. Observe

sethblack commented 3 years ago

Heyo! Correct, this is an artifact of stemming and lemmatization of the keywords. I added a "dumb" stemmer lookup: a dictionary that maps each stemmed form back to the first word seen that produced it. After the analysis is complete, that first-occurrence word is the word you'll see in the keyword report. For example, you'll see the human-readable word "glasses" instead of some horrible internal mix of letters the stemmer came up with.
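
To make the idea concrete, here is a minimal sketch of such a first-occurrence lookup. It is not the project's actual code: `toy_stem` is a hypothetical, deliberately crude suffix stripper standing in for whatever real stemmer is used, and `count_keywords` is an illustrative name.

```python
from collections import Counter


def toy_stem(word):
    """Hypothetical stand-in for a real stemmer: strips a few common suffixes."""
    if word.endswith("ss"):
        return word  # leave words like "glass" untouched
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def count_keywords(words):
    """Count words by stem, but report each stem as the first word that produced it."""
    counts = Counter()
    first_seen = {}  # stem -> first original word mapped to that stem
    for word in words:
        stem = toy_stem(word)
        counts[stem] += 1
        first_seen.setdefault(stem, word)  # keeps only the first occurrence
    return {first_seen[stem]: n for stem, n in counts.items()}


report = count_keywords(["glasses", "glass", "crawling", "crawls", "crawl"])
print(report)  # {'glasses': 2, 'crawling': 3}
```

Counting happens on the stems (`glass`, `crawl`), while the report shows the first word that mapped to each stem. A report that printed the raw stems instead would show shortened words, which matches the missing trailing characters described above.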

I should definitely add some documentation explaining this so people aren't caught off-guard.

Getseowebsite commented 3 years ago

How do I navigate to the keyword section of the HTML output?