Closed JeffAbrahamson closed 1 year ago
I've figured out what's happening on actu.fr:
request.get gets a 403 response and can't access the article, so the site may be actively blocking scraping.
I've found a workaround by modifying the request's user agent, see the PR :)
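
For anyone hitting the same 403, here is a rough sketch of the idea, not the code from the PR. It uses only the Python standard library (urllib and html.parser) instead of whatever HTTP client the project uses, and the User-Agent string, the function names and the parser class are all assumptions made up for illustration.

```python
from html.parser import HTMLParser
from urllib.request import Request, urlopen

# Assumption: any browser-like string. The value actually used in the PR is not shown here.
BROWSER_USER_AGENT = (
    "Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/115.0"
)


class OpenGraphParser(HTMLParser):
    """Collects <meta property="og:..." content="..."> tags into a dict."""

    def __init__(self):
        super().__init__()
        self.tags = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        prop = attributes.get("property") or ""
        content = attributes.get("content")
        if prop.startswith("og:") and content is not None:
            self.tags[prop] = content


def build_request(url, user_agent=BROWSER_USER_AGENT):
    """Build a request that sends a browser-like User-Agent header."""
    return Request(url, headers={"User-Agent": user_agent})


def extract_open_graph(html):
    """Return the Open Graph tags found in an HTML string."""
    parser = OpenGraphParser()
    parser.feed(html)
    return parser.tags


def fetch_open_graph(url):
    """Fetch a page with the custom User-Agent and return its Open Graph tags."""
    with urlopen(build_request(url), timeout=10) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        html = response.read().decode(charset, errors="replace")
    return extract_open_graph(html)
```

Calling fetch_open_graph(url) on the article URL would be the way to try it. Whether a given User-Agent is accepted depends on the site, so this may still return a 403.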
The press-mention DOM parser (/presse/admin/new) doesn't see the Open Graph tags at this page: https://actu.fr/pays-de-la-loire/nantes_44109/les-voitures-de-retour-sur-le-pont-saint-mihiel-a-nantes-le-symbole-d-une-metropole-qui-hesite_39297453.html