time-rs / time

The most used Rust library for date and time handling.
https://time-rs.github.io
Apache License 2.0

parsing: option to [end] to terminate parsing even if there is further input #684

Open wezm opened 5 months ago

wezm commented 5 months ago

I use time in my rsspls project (thanks!). It's a tool that uses CSS selectors to extract parts of web pages and build an RSS feed from them. time is used for parsing dates that will become the published date of the RSS item. In https://github.com/wezm/rsspls/issues/46 the element in the HTML that contains the date actually has two dates in it like this:

<td tabindex="0" role="cell" class="periodo-pubblicazione date">31/05/2024<br>  15/06/2024</td>

Extracted as text, this is "31/05/2024 15/06/2024". We'd like to be able to parse just the first date. This is similar in nature to https://github.com/time-rs/time/issues/471, but my idea is to add a modifier to the end component that allows it to match even when not all of the input has been consumed. That would allow a format description like [day padding:zero]/[month padding:zero]/[year][end eof:false]

I'd be open to implementing this if it seems reasonable.

jhpratt commented 5 months ago

So… I definitely get where you're coming from. Any implementation of this would necessarily be a new method rather than a modifier on [end]. The reason is a tad involved, but I'll try to simplify as much as possible. After some layers of indirection for ergonomics, calls to parse end up calling Sealed::parse (in parsable.rs). This is where the behavior you're running into comes from: the value is parsed successfully, but the call fails because there is remaining input. Any [end] modifier is long gone by the time this situation is encountered.

Right now, the only way to approach this is to go through the Parsed struct directly. For example,

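use time::{macros::format_description, parsing::Parsed, Date};

let mut parsed = Parsed::new();
// `input` is the &str to parse; `remaining` holds the unconsumed bytes.
let remaining = parsed.parse_items(input.as_bytes(), format_description!("[day]/[month]/[year]"))?;
let value: Date = parsed.try_into()?;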

This is typed offhand, and naturally relies on some assumptions. You're experienced with Rust, so I trust you're able to figure that much out. Even within time, this is the approach that would need to be taken. I'm not necessarily opposed to having something more ergonomic, but I don't think it's trivial either (a new method isn't ideal).