j0k3r / graby-site-config

Graby site config files

Update mediapart: add date and author #27

etiess closed 6 years ago

etiess commented 6 years ago

Test link with 2 authors: https://www.mediapart.fr/journal/france/131217/affaire-urvoas-ce-que-cache-thierry-solere

I'm updating mediapart here, as there's already a paywall. Where should this change go: here or fivefilters/ftr-site-config?

etiess commented 6 years ago

It works now for authors, but I don't know if I did it in a clean way (see https://github.com/j0k3r/graby-site-config/pull/27/commits/1e495030c912a25c2efc11c23eb2eeb5e061a07c)

etiess commented 6 years ago

The date matches, but I think the conversion to a real, usable date fails.

What should I do to make it usable?

j0k3r commented 6 years ago

Nothing for now. It's because PHP can't parse a date like "13 décembre 2017": https://3v4l.org/kJsBr

tcitworld commented 6 years ago

Actually, the date looks like this, so I guess it could be retrieved from the datetime attribute.

<time datetime="2017-12-13">13 décembre 2017</time>

j0k3r commented 6 years ago

@tcitworld true!

@etiess you can use this instead:

date: //div[contains(concat(' ',normalize-space(@class),' '),' author ')]//time/@datetime

etiess commented 6 years ago

done here: https://github.com/j0k3r/graby-site-config/pull/29 (I don't know how to modify this pull request once it's merged)
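
For readers following the thread, here is a rough Python sketch of why the `@datetime` selector works where the visible text does not. It is not how Graby itself is implemented (Graby is PHP and evaluates the XPath directly). The markup around the `<time>` element, the `SNIPPET` constant, and the helper names `extract_datetime` and `parse_iso` are all assumptions made for illustration; only the `<time>` element itself comes from the discussion above. The standard-library XML parser stands in for a real HTML parser, so the sample has to be well-formed.

```python
from datetime import date
from typing import Optional
import xml.etree.ElementTree as ET

# Hypothetical markup: only the <time> element is taken from the thread.
# The wrapping <div class="author ..."> is assumed so the selector has a target.
SNIPPET = """
<div class="content">
  <div class="author meta">
    <time datetime="2017-12-13">13 décembre 2017</time>
  </div>
</div>
""".strip()


def extract_datetime(markup: str) -> Optional[str]:
    """Mimic //div[... ' author ' ...]//time/@datetime with stdlib tools."""
    root = ET.fromstring(markup)
    for div in root.iter("div"):
        # Whole-token class match, like the concat/normalize-space XPath idiom.
        if "author" in div.get("class", "").split():
            time_el = div.find(".//time")
            if time_el is not None:
                return time_el.get("datetime")
    return None


def parse_iso(value: str) -> Optional[date]:
    """Parse an ISO 8601 calendar date, returning None if it is not one."""
    try:
        return date.fromisoformat(value)
    except ValueError:
        return None


raw = extract_datetime(SNIPPET)
print(raw)                            # 2017-12-13
print(parse_iso(raw))                 # 2017-12-13
print(parse_iso("13 décembre 2017"))  # None
```

The attribute value is already in a machine-readable form, so no locale-aware month-name parsing is needed, whereas the displayed French text is rejected by a locale-independent parser.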