r1chardj0n3s / parse

Parse strings using a specification based on the Python format() syntax.
http://pypi.python.org/pypi/parse
MIT License

Parsing datetime information across several formats #197

Open observingClouds opened 1 day ago

observingClouds commented 1 day ago

Thank you so much for this super helpful package!

I was wondering if there is a way to parse a single datetime from several places within a string, i.e. the inverse of "Today is {d:%d} in {d:%B} of {d:%Y}".format(d=datetime.datetime(2020,10,5)).

import datetime
import parse
pattern = "Today is {d:%Y%m%d}"
string = pattern.format(d=datetime.datetime(2023,11,21))

parse.parse(pattern, string)
#<Result () {'d': datetime.date(2023, 11, 21)}>  # works as expected

pattern = "We are writing the year {d:%Y} and it is the {d:%d}th on a cold {d:%B} day"
string = pattern.format(d=datetime.datetime(2023,11,21))
parse.parse(pattern, string)
# causes RepeatedNameError: field type '%d' for field "d" does not match previous seen type '%Y'
Traceback
```bash
RepeatedNameError                         Traceback (most recent call last)
Cell In[46], line 2
      1 pattern = "We are writing the year {d:%Y} and it is the {d:%d}th on a cold {d:%B} day"
----> 2 parse.parse(pattern, string)

File ~/mambaforge/envs/pyproj_env/lib/python3.13/site-packages/parse.py:959, in parse(format, string, extra_types, evaluate_result, case_sensitive)
    933 def parse(format, string, extra_types=None, evaluate_result=True, case_sensitive=False):
    934     """Using "format" attempt to pull values from "string".
    935
    936     The format must match the string contents exactly. If the value
   (...)
    957     In the case there is no match parse() will return None.
    958     """
--> 959     p = Parser(format, extra_types=extra_types, case_sensitive=case_sensitive)
    960     return p.parse(string, evaluate_result=evaluate_result)

File ~/mambaforge/envs/pyproj_env/lib/python3.13/site-packages/parse.py:432, in Parser.__init__(self, format, extra_types, case_sensitive)
    430 self._group_index = 0
    431 self._type_conversions = {}
--> 432 self._expression = self._generate_expression()
    433 self.__search_re = None
    434 self.__match_re = None

File ~/mambaforge/envs/pyproj_env/lib/python3.13/site-packages/parse.py:613, in Parser._generate_expression(self)
    610     e.append(r"\}")
    611 elif part[0] == "{" and part[-1] == "}":
    612     # this will be a braces-delimited field to handle
--> 613     e.append(self._handle_field(part))
    614 else:
    615     # just some text to match
    616     e.append(REGEX_SAFETY.sub(self._regex_replace, part))

File ~/mambaforge/envs/pyproj_env/lib/python3.13/site-packages/parse.py:662, in Parser._handle_field(self, field)
    660 if name in self._name_to_group_map:
    661     if self._name_types[name] != format:
--> 662         raise RepeatedNameError(
    663             'field type %r for field "%s" '
    664             "does not match previous seen type %r"
    665             % (format, name, self._name_types[name])
    666         )
    667     group = self._name_to_group_map[name]
    668     # match previously-seen value

RepeatedNameError: field type '%d' for field "d" does not match previous seen type '%Y'
```
wimglenn commented 21 hours ago

It is not currently supported. Seems like it should be possible, but looks non-trivial to implement.

The limitation is not specific to datetimes.

>>> fmt = "{x} {x:02d}"
>>> fmt.format(x=3)
'3 03'
>>> parse.parse(fmt, '3 03')
...
RepeatedNameError: field type '02d' for field "x" does not match previous seen type ''
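
Until something like this is supported, one possible workaround for the datetime case is to capture each piece separately and let `datetime.strptime` combine them. The following is only a rough sketch using the standard library (`re` and `datetime`), not parse's own machinery. The helper name `parse_split_datetime` and the `DIRECTIVE_REGEX` table are hypothetical, only three directives are covered, and `%B` assumes English month names in the default locale.

```python
import datetime
import re

# Hypothetical table: regex fragment for each supported strftime directive.
# Only the three directives used in this thread are covered.
DIRECTIVE_REGEX = {"%Y": r"\d{4}", "%d": r"\d{1,2}", "%B": r"[A-Za-z]+"}


def parse_split_datetime(pattern, string, name="d"):
    """Recover one datetime from fields like {d:%Y} ... {d:%d} ... {d:%B}.

    Returns None if the string does not match the pattern.
    """
    # Finds fields such as {d:%Y}; the directive is captured in group 1.
    field_re = re.compile(r"\{%s:(%%[A-Za-z])\}" % re.escape(name))
    directives = []
    regex_parts = []
    pos = 0
    for m in field_re.finditer(pattern):
        # Literal text before the field is matched verbatim.
        regex_parts.append(re.escape(pattern[pos:m.start()]))
        directive = m.group(1)
        regex_parts.append("(%s)" % DIRECTIVE_REGEX[directive])
        directives.append(directive)
        pos = m.end()
    regex_parts.append(re.escape(pattern[pos:]))

    match = re.fullmatch("".join(regex_parts), string)
    if match is None:
        return None
    # Join the captured pieces and the directives in the same order,
    # then let strptime assemble a single datetime from them.
    return datetime.datetime.strptime(" ".join(match.groups()), " ".join(directives))


pattern = "We are writing the year {d:%Y} and it is the {d:%d}th on a cold {d:%B} day"
string = pattern.format(d=datetime.datetime(2023, 11, 21))
result = parse_split_datetime(pattern, string)
print(result)
```

With parse itself, giving each piece its own field name (for example `{year:d}` and `{day:d}`) avoids the RepeatedNameError, at the cost of assembling the date manually afterwards.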