chseifert closed this issue 8 years ago
What's so funny about these keywords? ;) Unfortunately, checking for visibility is a real performance killer. I've integrated a version into c4 that seems acceptable in terms of computing time (up to ~200ms, only tested with a few pages) but is not perfectly accurate (it works fine for the provided Twitter example). The visibility check can be switched off by passing the corresponding options parameter to the paragraphDetection method. Some numbers to get an intuition:
- ~20ms without the visibility check
- ~200ms with the current implementation
- ~2000ms with an in-depth check (even this one does not capture opacity or similar)
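To illustrate the trade-off between a cheap and an in-depth visibility check, here is a minimal TypeScript sketch. It is not the c4 implementation: `StyledNode` is a simplified stand-in for a DOM element with already-computed style values, and `DetectionOptions`, `checkVisibility`, `isVisibleShallow`, `isVisibleDeep` and `filterVisible` are hypothetical names chosen for this example. Like the checks described above, it does not capture opacity.

```typescript
// Hypothetical, simplified stand-in for a DOM element: only the
// computed values that the visibility checks below look at.
interface StyledNode {
  display: string;     // e.g. "block", "none"
  visibility: string;  // e.g. "visible", "hidden"
  width: number;       // rendered width in pixels
  height: number;      // rendered height in pixels
  parent: StyledNode | null;
}

// Hypothetical options object; the real option name in c4 may differ.
interface DetectionOptions {
  checkVisibility: boolean;
}

// Cheap check: inspects only the node itself.
function isVisibleShallow(node: StyledNode): boolean {
  return (
    node.display !== "none" &&
    node.visibility !== "hidden" &&
    node.width > 0 &&
    node.height > 0
  );
}

// In-depth check: additionally walks up the ancestor chain, because an
// ancestor with display "none" hides the node even when the node's own
// display value looks fine. This walk is what makes the check costly.
function isVisibleDeep(node: StyledNode): boolean {
  if (!isVisibleShallow(node)) {
    return false;
  }
  for (let cur: StyledNode | null = node.parent; cur !== null; cur = cur.parent) {
    if (cur.display === "none") {
      return false;
    }
  }
  return true;
}

// Keeps all candidates when the check is switched off, otherwise only
// those that pass the in-depth check.
function filterVisible(nodes: StyledNode[], options: DetectionOptions): StyledNode[] {
  return options.checkVisibility ? nodes.filter(isVisibleDeep) : nodes;
}
```

Skipping the check returns the candidates untouched, which corresponds to the fastest of the three timings above. The ancestor walk costs time proportional to the depth of each candidate in the tree, which is one plausible reason an in-depth check is so much slower on large pages.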
@chseifert could you check if this issue is fixed and if yes close it?
Reactivated my twitter account and checked. Works :)
The following funny keywords are extracted on a Twitter page (https://twitter.com/eexcess) IF the user is LOGGED IN
The HTML code of the page (if logged in) is attached: twitter_html.txt