transport-nantes / tn_web

Website of the Mobilitains
https://www.mobilitains.fr/
GNU General Public License v3.0

press-mention og failure #1032

Closed: JeffAbrahamson closed this issue 1 year ago

JeffAbrahamson commented 1 year ago

The press-mention DOM parser (/presse/admin/new) doesn't see the Open Graph tags on this page:

https://actu.fr/pays-de-la-loire/nantes_44109/les-voitures-de-retour-sur-le-pont-saint-mihiel-a-nantes-le-symbole-d-une-metropole-qui-hesite_39297453.html

Shriukan33 commented 1 year ago

I've figured out what's happening on actu.fr:


request.get gets a 403 response and can't access the article, so the parser never receives the page's HTML. They may be actively fighting scraping.

I've found a workaround by modifying the request's User-Agent header; see the PR :)
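
For readers without the PR at hand, here is a minimal sketch of the general idea, not the code from the PR. It assumes the fix amounts to sending a browser-like User-Agent header and then reading the `og:` meta tags from the returned HTML. It uses only the Python standard library; the names `BROWSER_USER_AGENT`, `build_request`, `parse_open_graph`, and `fetch_open_graph` are hypothetical, and the User-Agent string is just an example value. Whether a given site accepts it is not guaranteed.

```python
from html.parser import HTMLParser
from urllib.request import Request, urlopen

# Hypothetical browser-like User-Agent string; an assumption, not the PR's value.
BROWSER_USER_AGENT = (
    "Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/115.0"
)


class OpenGraphParser(HTMLParser):
    """Collect <meta property="og:..." content="..."> tags into a dict."""

    def __init__(self):
        super().__init__()
        self.og_tags = {}

    def handle_starttag(self, tag, attrs):
        # Only <meta> elements can carry Open Graph properties.
        if tag != "meta":
            return
        attributes = dict(attrs)
        prop = attributes.get("property") or ""
        content = attributes.get("content")
        if prop.startswith("og:") and content is not None:
            self.og_tags[prop] = content


def build_request(url, user_agent=BROWSER_USER_AGENT):
    """Build a GET request that announces a browser-like User-Agent."""
    return Request(url, headers={"User-Agent": user_agent})


def parse_open_graph(html):
    """Return a dict of the og:* properties found in an HTML string."""
    parser = OpenGraphParser()
    parser.feed(html)
    return parser.og_tags


def fetch_open_graph(url):
    """Fetch a page with the custom User-Agent and extract its og:* tags."""
    with urlopen(build_request(url), timeout=10) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        html = response.read().decode(charset, errors="replace")
    return parse_open_graph(html)
```

Calling `fetch_open_graph(url)` on an article URL would return something like a dict keyed by `og:title`, `og:image`, and so on, provided the server answers with the page rather than a 403. Keeping `parse_open_graph` separate from the network call makes the parsing step easy to test against a saved HTML string.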