Open calvinmetcalf opened 11 years ago
The simplest approach, I guess, would be to split the list into sure-fire bullshit and everything else, and only mark words from the second list once the number of words detected from the first list reaches a particular threshold.
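A minimal sketch of that two-list idea, just to make it concrete. The word lists, the function name `findBullshit`, and the default threshold of 2 are all made up for illustration and are not taken from the project:

```javascript
// Hypothetical word lists; the project's real list is not reproduced here.
const SURE_FIRE = ['monetize', 'synergy', 'empowerment'];
const MAYBE = ['functionality', 'platform', 'solution'];

// Returns the words to mark. Words from MAYBE are only included
// once at least `threshold` sure-fire words appear in the text.
function findBullshit(text, threshold = 2) {
  const words = text.toLowerCase().match(/[a-z]+/g) || [];
  const sure = words.filter((w) => SURE_FIRE.includes(w));
  const maybe = words.filter((w) => MAYBE.includes(w));
  return sure.length >= threshold ? sure.concat(maybe) : sure;
}
```

With these assumed lists, 'platform' is left alone in a sentence with a single sure-fire word and only gets marked once a second one shows up.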
Any updates? Would love to finally see an improvement on this, @mourner @calvinmetcalf.
moved from #1
Could add a point value to each word, or just put words in groups with the same bullshit level, and modify the bs value based on proximity to other bullshit words. E.g. with a threshold of 1, 'monetize' might have 1.2 and always be bullshit, while 'functionality' has 0.8, so it's not bullshit on its own; but if it's 3 words away from 'empowerment' (also 0.8), it becomes bullshit: 0.8 + (0.8/3) ≈ 1.07.
The easy way would be to just have a couple of lists instead of the one list of bullshit words: the always-bullshit words valued at 1.2, the could-be-bullshit words at 0.8, etc.
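Rough sketch of the proximity scoring, using the example values above. The names `WEIGHTS` and `scoreWords` and the `maxDistance` window are assumptions for illustration, and summing the boost over every weighted neighbour (rather than only the nearest one) is just one possible reading of the idea:

```javascript
// Hypothetical weights, taken from the example values in this comment.
const WEIGHTS = new Map([
  ['monetize', 1.2],
  ['functionality', 0.8],
  ['empowerment', 0.8],
]);

// Scores each weighted word as its base value plus, for every other
// weighted word within maxDistance words, that word's value / distance.
function scoreWords(text, threshold = 1, maxDistance = 5) {
  const words = text.toLowerCase().match(/[a-z]+/g) || [];
  const results = [];
  words.forEach((word, i) => {
    if (!WEIGHTS.has(word)) return;
    let score = WEIGHTS.get(word);
    const start = Math.max(0, i - maxDistance);
    const end = Math.min(words.length - 1, i + maxDistance);
    for (let j = start; j <= end; j++) {
      if (j !== i && WEIGHTS.has(words[j])) {
        score += WEIGHTS.get(words[j]) / Math.abs(i - j);
      }
    }
    results.push({ word, score, bullshit: score >= threshold });
  });
  return results;
}
```

For `scoreWords('functionality that drives empowerment')` the two weighted words are 3 apart, so each scores 0.8 + 0.8/3 and crosses the threshold of 1, while 'functionality' on its own stays at 0.8 and is not marked.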