calculateBestNode claims no nodesWithText on facebook developer page

ageitgey / node-unfluff

Automatically extract body content (and other cool stuff) from an html document

Apache License 2.0

2.15k stars, 221 forks

If you look at the HTML source of https://developers.facebook.com/docs/facebook-login/access-tokens, all of the actual page text appears to be commented out (wrapped in HTML comments) instead of being normal text in the page. I'm guessing some client-side JavaScript runs on their page after the initial load and renders what you see on the screen.
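One rough way to check whether a page has this problem is to measure how much of the raw markup sits inside HTML comments. The sketch below is only an illustration: `commentedFraction` is a hypothetical helper, not part of unfluff, and the sample string is made up rather than taken from the Facebook page.

```javascript
// Hypothetical helper (not part of unfluff): returns the fraction of the
// document's characters that sit inside <!-- ... --> comments.
function commentedFraction(html) {
  const commentPattern = /<!--([\s\S]*?)-->/g;
  let commentedChars = 0;
  for (const match of html.matchAll(commentPattern)) {
    commentedChars += match[1].length; // count only the comment body
  }
  return html.length === 0 ? 0 : commentedChars / html.length;
}

// Made-up sample: the only paragraph is hidden inside a comment.
const sample = '<div><!-- <p>Access tokens are ...</p> --></div>';
console.log(commentedFraction(sample).toFixed(2));
```

A value close to 1 suggests that most of the content is hidden from a plain HTML parser, which would explain why no text nodes are found.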

So you would need to do your own custom processing to capture what the browser actually rendered after page load, since the initial HTML that comes back from their servers doesn't include the page text as regular markup. That's a special case specific to this website and is beyond anything that unfluff could support directly.
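As a minimal sketch of that kind of custom processing, one cheaper alternative to capturing the rendered page is to strip the comment delimiters before extraction. This assumes the commented-out markup is itself the content the client-side script reveals, which is a guess and may not hold for every page. `unwrapHtmlComments` is a hypothetical helper, not part of unfluff.

```javascript
// Hypothetical pre-processing step (an assumption, not unfluff behaviour):
// replace each <!-- ... --> comment with its inner markup so that a plain
// HTML parser can see the text.
function unwrapHtmlComments(html) {
  return html.replace(/<!--([\s\S]*?)-->/g, (whole, inner) => {
    // Leave IE conditional comments such as <!--[if IE]> ... <![endif]--> alone.
    if (/^\s*\[if\s/i.test(inner)) {
      return whole;
    }
    return inner;
  });
}

// Made-up sample input, not the real Facebook page.
const raw = '<div><!-- <p>Hello</p> --></div>';
const unwrapped = unwrapHtmlComments(raw);
console.log(unwrapped);

// The result could then be handed to the extractor, for example:
//   const extractor = require('unfluff');
//   const data = extractor(unwrapped);
```

If the page builds its content some other way (for example from JSON fetched after load), this will not help, and rendering the page in a real or headless browser is the more reliable route.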

calculateBestNode claims no nodesWithText on facebook developer page #44