Closed andremacola closed 1 month ago
A real example:
For the URL https://natelinha.uol.com.br/famosos/2024/10/13/esposa-de-rodrigo-faro-vera-viel-compartilha-rotina-no-hospital-apos-cirurgia-217809.php, date extraction returns an empty value because the HTML only provides this information in ld+json metadata.
This is fixed with this PR.
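To illustrate the case, here is a minimal Python sketch of reading `datePublished` from an ld+json block. It is not the code in this PR: the names `JsonLdCollector`, `published_date_from_jsonld` and `SAMPLE_HTML` are hypothetical, and the sample markup and its date value are made up for the example rather than copied from the page above.

```python
import json
from html.parser import HTMLParser


class JsonLdCollector(HTMLParser):
    """Collects the raw text of every <script type="application/ld+json"> block."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        # Script content may arrive in several chunks, so buffer it.
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buffer))
            self._in_jsonld = False


def published_date_from_jsonld(html):
    """Return the first datePublished found in any ld+json block, or None."""
    collector = JsonLdCollector()
    collector.feed(html)
    for raw in collector.blocks:
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip malformed blocks instead of failing the extraction
        items = data if isinstance(data, list) else [data]
        for item in items:
            if isinstance(item, dict) and item.get("datePublished"):
                return item["datePublished"]
    return None


# Hypothetical markup: the date exists only in ld+json, with no meta tags.
SAMPLE_HTML = """
<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "NewsArticle",
 "datePublished": "2024-10-13T10:00:00-03:00"}
</script>
</head><body></body></html>
"""

print(published_date_from_jsonld(SAMPLE_HTML))
```

The sketch only looks at top-level objects and top-level arrays; pages that nest articles under `@graph` would need an extra step.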
For the future: I'm thinking of adding more complex logic for ld+json extraction. Some sites (like the one above) include the real author name of the post there, rather than the @sitename found in og:author.
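A rough sketch of what that author logic could look like, in Python for illustration only. The helpers `author_names` and `pick_author` are hypothetical and not part of this PR. The sketch assumes the ld+json object has already been parsed into a dict, and that `author` is a string, an object with a `name` field, or a list of those.

```python
def author_names(jsonld_item):
    """Return author names found in a parsed ld+json object (hypothetical helper)."""
    author = jsonld_item.get("author")
    if author is None:
        return []
    candidates = author if isinstance(author, list) else [author]
    names = []
    for entry in candidates:
        if isinstance(entry, str):
            name = entry
        elif isinstance(entry, dict):
            name = entry.get("name", "")
        else:
            name = ""
        # Ignore empty or non-string names rather than guessing.
        if isinstance(name, str) and name.strip():
            names.append(name.strip())
    return names


def pick_author(jsonld_item, og_author):
    """Prefer a real author name from ld+json, falling back to og:author."""
    names = author_names(jsonld_item)
    if names:
        return ", ".join(names)
    return og_author or None
```

With this ordering, a page whose og:author is only `@sitename` still yields the person's name when ld+json provides one, and the og:author value is used only as a fallback.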
@andremacola thank you for your contribution. I will check and merge this soon.