scivision / linkchecker-markdown

Python asyncio + aiohttp Markdown *.md URL link checker: 10,000 files/second
MIT License

headless web browser improvements #1

Closed: scivision closed this issue 5 years ago

scivision commented 5 years ago

Too many anti-leech false positives these days. Even though Requests accepts cookies, the anti-leech protections are too clever. We need to try a web browser approach.

Perhaps Arsenic (https://github.com/HDE/arsenic), which is asyncio-based.

The speedup from asyncio is naturally quite large, but we need a full browser to avoid too many false positives.
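One way to keep most of the asyncio speedup while still using a browser might be a two-pass check: run every URL through the fast checker concurrently, then re-check only the failures with the slower browser-based checker. The sketch below is an assumption, not this project's implementation. `check_links`, `fast_check`, `browser_check`, and `max_browser` are hypothetical names, and the two `demo_*` coroutines are stand-ins that sleep instead of touching the network or driving a real browser.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List

# Hypothetical checker signature: takes a URL, returns True if the link looks alive.
Checker = Callable[[str], Awaitable[bool]]


async def check_links(
    urls: List[str],
    fast_check: Checker,
    browser_check: Checker,
    max_browser: int = 2,
) -> Dict[str, bool]:
    """Two-pass check: fast pass for everything, browser pass for failures only."""
    # Pass 1: run the cheap checker on all URLs concurrently.
    fast_results = await asyncio.gather(*(fast_check(u) for u in urls))
    results = dict(zip(urls, fast_results))

    # Pass 2: re-check only the failed URLs, with limited concurrency,
    # since a real browser session is much heavier than a plain HTTP request.
    sem = asyncio.Semaphore(max_browser)

    async def recheck(url: str) -> None:
        async with sem:
            results[url] = await browser_check(url)

    failed = [u for u, ok in results.items() if not ok]
    await asyncio.gather(*(recheck(u) for u in failed))
    return results


# Stand-in checkers for illustration only (assumptions, no network access).
async def demo_fast(url: str) -> bool:
    await asyncio.sleep(0.01)        # pretend this is a quick HTTP request
    return "blocked" not in url      # pretend anti-leech rejects these


async def demo_browser(url: str) -> bool:
    await asyncio.sleep(0.05)        # pretend this is a slow browser page load
    return "dead" not in url         # only truly dead links fail here


if __name__ == "__main__":
    demo_urls = [
        "https://a.example/ok",
        "https://b.example/blocked",
        "https://c.example/blocked-dead",
    ]
    print(asyncio.run(check_links(demo_urls, demo_fast, demo_browser)))
```

With this split, the browser cost is paid only for links the fast pass rejects, so a low false-positive rate in the fast pass keeps the second pass small.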

scivision commented 5 years ago

The current implementation is not too bad for false positives. Waiting to see if this is really needed.