xdvom03 / klaus

Bayesian text classification of websites in a nested class system
Creative Commons Zero v1.0 Universal

Update documentation #48

Closed xdvom03 closed 3 years ago

xdvom03 commented 3 years ago

Particular flaws to fix:

This also serves to slow down the flood of half-made features and solidify some of the basics.

xdvom03 commented 3 years ago

A necessary condition is settling on a fixed algorithm wrt normalization. http://www.ra.ethz.ch/cdstore/www6/posters/725/web_search.html suggests normalizing documents to the same size, something I don't do. However, that normalization also does little against word repetition.
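
To make the trade-off concrete, here is a minimal sketch of size normalization, assuming plain whitespace tokenization and a bag-of-words count per document. The function name `normalized_counts` and the `target_size` parameter are hypothetical, chosen for illustration; this is not the code klaus currently uses.

```python
from collections import Counter

def normalized_counts(text, target_size=100.0):
    """Scale word counts so that every document sums to target_size.

    Hypothetical helper: whitespace tokenization and the default
    target_size are assumptions, not taken from the klaus codebase.
    """
    counts = Counter(text.lower().split())
    total = sum(counts.values())
    if total == 0:
        return {}
    scale = target_size / total
    return {word: n * scale for word, n in counts.items()}

# A short page and a long page with the same word mix end up identical.
short_doc = normalized_counts("cats purr")
long_doc = normalized_counts("cats purr " * 50)

# A page that repeats one word keeps that word's dominance after scaling.
repetitive_doc = normalized_counts(
    "cats cats cats cats cats cats cats cats purr dogs"
)
```

Scaling removes the effect of page length, but the repeated word still holds 80% of the weight in `repetitive_doc`, which is why this step alone does not address word repetition.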

xdvom03 commented 3 years ago

Whoops, many of the points raised here are still unresolved. Only the final point was resolved: some stability & fixes. The main point still stands.

xdvom03 commented 3 years ago

Got accidentally re-closed.