brave / brave-browser

Brave browser for Android, iOS, Linux, macOS, Windows.
https://brave.com
Mozilla Public License 2.0
17.74k stars 2.31k forks source link

[Speedreader] reconsider date metadata #40369

Open rillian opened 2 months ago

rillian commented 2 months ago

Description

@bridiver complained about the time rust crate dependency we use to parse and display publication date metadata. Prior to #10965 we used the chrono crate instead, but migrated away for poor maintenance. Later it was reintroduced as a dependency of skus and feed-rs, but never migrated the speedreader code back. So we're using both calendar libraries unnecessarily. There are also some smaller libraries, like humantime and speeddate which support RFC 3339.

Brian also mentioned he'd never seen the relevant feature in action. I noticed we only parse dates from json-ld scripts. These are still used by a number of large publications, e.g. The New York Times. The output looks like this: Screenshot from 2024-08-06 10-04-15 But we don't handle the <meta name="article:published_time" content="..."> tags used by e.g. many wordpress blogs.

I see two resolutions here, both removing the time dependency.

Any thoughts about the importance of this feature?

Steps to reproduce

Here is an example of the date parsing.

rillian commented 2 months ago

Personally, I find the metadata panel quite helpful in evaluating articles, and having that in a standard place is an advantage reader mode offers. What do you think, @boocmp?

boocmp commented 2 months ago

I think we should keep the date, so the first option is preferable.

bridiver commented 2 months ago

I think we should keep the date, so the first option is preferable.

Today neither dep for time would satisfy our criteria for rust deps so I'm going to rule out option one. There is a third option though which is to expose the chromium cpp code.