r1chardj0n3s / parse

Parse strings using a specification based on the Python format() syntax.
http://pypi.python.org/pypi/parse
MIT License

[request]millisecond support #127

Closed NaokiSato102 closed 10 months ago

NaokiSato102 commented 3 years ago

Unless I'm mistaken, this library does not support milliseconds. This library is very useful, but the lack of millisecond support is a letdown. Please support milliseconds to make it easier to use.

NaokiSato102 commented 3 years ago

If the milliseconds are separated from the seconds by a dot, they are parsed correctly, but not if they are separated by a colon.

jenisys commented 3 years ago

A clear description with an example, which format specifier you are using, what you observe and what you expect instead, would improve this issue.

NaokiSato102 commented 3 years ago

Oh sorry. That's right.

Example of how it does work

parse.parse("{:tt}", "2:08:32.01")  # Format is "hours:minutes:seconds.milliseconds"

Example of how it doesn't work

parse.parse("{:tt}", "2:08:32:01")  # Format is "hours:minutes:seconds:milliseconds"

Expected output

<Result (datetime.time(2, 8, 32, 10000),) {}>

We want to be able to separate all fields with a colon. The problem is that with a four-field sequence of hours, minutes, seconds, and milliseconds, it is easy to tell which field is which, but with a three-field sequence, you can't distinguish between "hours, minutes, and seconds" and "minutes, seconds, and milliseconds".

One solution is to mimic how dateutil.parser handles ambiguous dates such as 1/12/2000, where the option "dayfirst=True" says that the day is written first. In other words, for the three-field format, how about adding an option like "millisecond_preference=True"?
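To make the ambiguity concrete, here is a minimal sketch in plain Python, independent of parse. The function name `parse_colon_time` and the `millisecond_preference` flag are hypothetical, modeled on the option suggested above; neither exists in parse. The last field is treated as fractional digits of a second, so "01" becomes 10000 microseconds, matching the expected output.

```python
from datetime import time


def parse_colon_time(text: str, millisecond_preference: bool = False) -> time:
    """Parse a colon-separated time (hypothetical helper, not part of parse).

    Four fields are always hours:minutes:seconds:fraction. Three fields are
    ambiguous, so the flag decides between hours:minutes:seconds (default)
    and minutes:seconds:fraction.
    """
    fields = text.split(":")
    if len(fields) == 4:
        hours, minutes, seconds, fraction = fields
    elif len(fields) == 3 and millisecond_preference:
        hours = "0"
        minutes, seconds, fraction = fields
    elif len(fields) == 3:
        hours, minutes, seconds = fields
        fraction = "0"
    else:
        raise ValueError(f"expected 3 or 4 colon-separated fields: {text!r}")

    # Right-pad the fraction to six digits, so "01" means 0.01 s = 10000 microseconds.
    microseconds = int(fraction.ljust(6, "0")[:6])
    return time(int(hours), int(minutes), int(seconds), microseconds)
```

With the flag off, "8:32:01" is read as 8 hours, 32 minutes, 1 second; with `millisecond_preference=True` the same string is read as 8 minutes, 32 seconds, and a fraction of 0.01 seconds.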

Sorry if the English meaning is hard to understand. I'm writing using DeepL translation as my English is not good.

r1chardj0n3s commented 3 years ago

I apologise that my library handles localisation so poorly :-(

I honestly am not sure how to address this issue though. I really don't like the idea of adding in an option flag like you suggest, as I feel like it is a slippery slope of complication in the API. However, I'm not sure what alternative there is. Perhaps parse() is trying to do too much here, in attempting to parse datetimes as well, especially in the face of localisation? Heck, Python has whole other libraries devoted to the complicated problem of parsing datetimes!

Do you think it might be reasonable to just request that either:

  1. you parse the time string out of the text and perform a secondary analysis with a proper datetime parsing library, or
  2. you write a custom type conversion that handles your specific time pattern and parses it correctly?
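Both suggestions can be sketched with the standard library alone. The names `colon_time`, `extract_time`, and `TIME_PATTERN` are assumptions made for illustration, and the pattern only covers the four-field "hours:minutes:seconds:fraction" layout from this issue.

```python
import re
from datetime import datetime, time
from typing import Optional

# Assumed layout: hours:minutes:seconds:fraction, all separated by colons.
TIME_PATTERN = r"\d{1,2}:\d{1,2}:\d{1,2}:\d{1,6}"


def colon_time(text: str) -> time:
    """Convert e.g. "2:08:32:01" to a datetime.time.

    strptime's %f accepts one to six digits and zero-pads on the right,
    so "01" becomes 10000 microseconds.
    """
    return datetime.strptime(text, "%H:%M:%S:%f").time()


# Suggestion 2: parse looks for an optional `pattern` attribute on a custom
# type converter and uses it as the regular expression for that field.
colon_time.pattern = TIME_PATTERN


def extract_time(line: str) -> Optional[time]:
    """Suggestion 1: pull the time string out of the text, then parse it separately."""
    match = re.search(TIME_PATTERN, line)
    return colon_time(match.group(0)) if match else None
```

For suggestion 2, `colon_time` should then be usable as a custom type, for example `parse.parse("{:ColonTime}", "2:08:32:01", extra_types={"ColonTime": colon_time})`. For suggestion 1, `extract_time("finished at 2:08:32:01")` does the secondary parsing without involving parse at all.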

NaokiSato102 commented 3 years ago

It all depends on your design philosophy, but I think it's better to do 2 first, and then juxtapose 1 and 2 when the implementation is complete. I agree that adding more and more options is a way to increase the complexity.

I don't know the potential demand for this feature, and I can't build an algorithm that complex, so I can only offer my opinion (if I could, I would have sent a pull request in the first place! :’‐( ) But for now, I'd like you to try to implement it in the way described above. In your spare time.

wimglenn commented 10 months ago

Since parse v1.20.1 you can use the strptime directives:

>>> parse("{:%H:%M:%S:%f}", "2:08:32:01")
<Result (datetime.time(2, 8, 32, 10000),) {}>