j0k3r / graby

Graby helps you extract article content from web pages
MIT License

Missing Accept header causes problems on some sites #155

Closed: nijel closed this issue 5 years ago

nijel commented 6 years ago

Graby fails to fetch content from https://www.marigold.cz/item/projektovy-manazer-je-v-cesku-sproste-slovo-ke-skode-projektu. The reason is that the server responds with HTTP 403 when the Accept header is missing.

I discovered this in Wallabag, but I think Graby (or SafeCurl) might be the place to address it. All that is needed is to add an Accept: */* header to outgoing requests.
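To illustrate the failure mode, here is a minimal Python sketch. It is not Graby or SafeCurl code (both are PHP), and the names StrictHandler, fetch_status and demo are hypothetical. It assumes the remote server simply rejects any request without an Accept header, and it uses a local stand-in server rather than the real site, so it only mimics the behavior reported above.

```python
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class StrictHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for a server that rejects requests lacking Accept."""

    def do_GET(self):
        if self.headers.get("Accept") is None:
            # Assumed behavior: no Accept header means 403 Forbidden.
            self.send_response(403)
            self.send_header("Content-Length", "0")
            self.end_headers()
            return
        body = b"article content"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def fetch_status(url, headers=None):
    """Return the HTTP status for a GET request sent with the given headers."""
    # An empty ProxyHandler bypasses any proxy configured in the environment.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    request = urllib.request.Request(url, headers=headers or {})
    try:
        with opener.open(request) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code


def demo():
    """Fetch the same URL without and with an Accept header."""
    server = HTTPServer(("127.0.0.1", 0), StrictHandler)  # port 0: pick a free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    url = f"http://127.0.0.1:{server.server_port}/"
    try:
        without_accept = fetch_status(url)  # urllib sends no Accept by default
        with_accept = fetch_status(url, {"Accept": "*/*"})
    finally:
        server.shutdown()
        server.server_close()
    return without_accept, with_accept


if __name__ == "__main__":
    print(demo())
```

The second request differs from the first only by the added Accept: */* header, which is the whole of the proposed fix.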

I've hacked this into SafeCurl (see https://github.com/j0k3r/safecurl/pull/4), but I'm not really sure this is a good location for such a fix.

j0k3r commented 6 years ago

See https://github.com/j0k3r/safecurl/pull/5