Closed: yfa-vagelis closed this issue 1 year ago
Python's datetime library is designed to work with valid dates in the Gregorian calendar system. Using it to parse invalid dates or dates from other calendar systems can raise unexpected errors and is not recommended.
If you're dealing with dates in a different calendar system, it's better to use a dedicated library for that system. There are many third-party libraries available for different calendar systems, such as Persian or Chinese calendars. Implementing a custom calendar system is usually not a good idea, as it can be complex and error-prone.
If you're receiving data from a source that doesn't validate dates, it's important to handle this appropriately. Converting invalid dates (like '30 Feb') to valid ones (like '2 Mar') might seem like a solution, but it can lead to more problems. For example, the result depends on the year: in a leap year, '30 Feb' would roll over to '1 Mar' rather than '2 Mar'.
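To make the leap-year problem concrete, here is a small sketch. The `roll_over` helper is hypothetical (it is not part of datetime); it assumes "rolling over" means counting the day number forward from the first of the month.

```python
from datetime import date, timedelta


def roll_over(year: int, month: int, day: int) -> date:
    """Hypothetical helper: treat the day as an offset from the 1st of the month."""
    return date(year, month, 1) + timedelta(days=day - 1)


# The same invalid input lands on different dates depending on the year.
print(roll_over(2023, 2, 30))  # 2023-03-02
print(roll_over(2024, 2, 30))  # 2024-03-01 (2024 is a leap year)
```

So a silent conversion gives year-dependent results, which is easy to miss when the source data was wrong in the first place.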
If you need to parse strings that might contain invalid dates, regardless of the reason, a good approach would be to use regular expressions. This allows you to extract specific patterns from strings without considering whether they form a valid date or not.
Here's an example of how you can do this:
import re
date = '30 Feb 2023'
# This pattern matches two digits (day),
# one or more whitespace characters, three letters (month),
# one or more whitespace characters, and four digits (year).
pattern = r'(?P<d>\d{2})\s+(?P<b>[A-Za-z]{3})\s+(?P<Y>\d{4})'
match = re.match(pattern, date.strip())
if match:
    day = match.group('d')
    month = match.group('b')
    year = match.group('Y')
    print(f'day: {day} month: {month} year: {year}')
else:
    print('no match found')
Note that this code assumes that the date is always in the format 'DD MMM YYYY'. If the format can vary, you'll need to adjust the regular expression accordingly.
For more information on regular expressions, refer to the Python re module documentation.
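If the format does vary, one way to adjust the expression might look like the sketch below. The variations it accepts (one- or two-digit days, abbreviated or full English month names, and space, '-' or '/' as separators) are assumptions for illustration, and `FLEXIBLE_DATE` and `extract_fields` are hypothetical names.

```python
import re

# Assumed variations: 1-2 digit day, 3-9 letter month name,
# and whitespace, '-' or '/' between the fields.
FLEXIBLE_DATE = re.compile(
    r'(?P<d>\d{1,2})[\s/-]+(?P<b>[A-Za-z]{3,9})[\s/-]+(?P<Y>\d{4})'
)


def extract_fields(text):
    """Return (day, month, year) as strings, or None if the text does not match."""
    match = FLEXIBLE_DATE.fullmatch(text.strip())
    if match is None:
        return None
    return match.group('d'), match.group('b'), match.group('Y')


print(extract_fields('30 Feb 2023'))      # ('30', 'Feb', '2023')
print(extract_fields('3-February-2023'))  # ('3', 'February', '2023')
print(extract_fields('2023-02-30'))       # None
```

Using `fullmatch` instead of `match` rejects trailing text, which is usually what you want when the whole string is supposed to be a date.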
@majid-vaghari Hello and thanks for your reply!
Upon further investigation, I discovered that the parsing is carried out within the _strptime._strptime function, which generates a datetime object to compute weekday and julian if needed. Consequently, I'm still encountering the ValueError: day is out of range for month.
So, my question is, why isn't there a function available that exclusively performs parsing and provides the parsed fields without performing validation? Such a function would enable users to utilize the results as needed.
Edit: I was thinking about making a custom date parser using regex, as you suggested, but I wanted to avoid it since there is already a solution out there.
I understand your perspective and concerns. Python's datetime module is implemented in C and is designed to work with valid dates in the Gregorian calendar system, as seen in the source code. As such, it validates the parsed fields to ensure they form a valid date.
There are several reasons why the functionality you're suggesting is not part of Python's datetime module:
Should the datetime module not validate dates, users would have to handle validation every time they used it, leading to potential inconsistencies and errors.
While it's technically possible to read the source code and use undocumented functions in Python and C to bypass the validation, doing so is not recommended. This approach is messy, prone to errors, and can lead to maintenance issues as these functions are not officially supported and could change in future versions of Python. Plus, it's not very practical and there's really no need to make such functionality available in Python's datetime.
While I understand that building a custom parser using regular expressions may not be the ideal solution, it may be the most viable one given your unique requirements. It offers the flexibility to parse the date fields without validation and handle invalid dates according to your specific needs.
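As a rough sketch of what such a custom parser could look like: it returns the parsed fields as integers together with a flag saying whether they form a real calendar date, and leaves the decision about invalid dates to the caller. The names `ParsedDate`, `MONTHS` and `parse_fields` are hypothetical, and the sketch assumes English three-letter month abbreviations in the 'DD MMM YYYY' layout.

```python
import calendar
import re
from typing import NamedTuple, Optional

# Assumption: English three-letter month abbreviations, case-insensitive.
MONTHS = {
    name: number
    for number, name in enumerate(
        ['jan', 'feb', 'mar', 'apr', 'may', 'jun',
         'jul', 'aug', 'sep', 'oct', 'nov', 'dec'],
        start=1,
    )
}

PATTERN = re.compile(r'(\d{1,2})\s+([A-Za-z]{3})\s+(\d{4})')


class ParsedDate(NamedTuple):
    day: int
    month: int
    year: int
    is_valid: bool  # True only if the fields form a real calendar date


def parse_fields(text: str) -> Optional[ParsedDate]:
    """Parse 'DD MMM YYYY' without rejecting impossible day numbers."""
    match = PATTERN.fullmatch(text.strip())
    if match is None:
        return None
    month = MONTHS.get(match.group(2).lower())
    if month is None:
        return None
    day, year = int(match.group(1)), int(match.group(3))
    # calendar.monthrange returns (weekday of the 1st, number of days in the month).
    days_in_month = calendar.monthrange(year, month)[1]
    return ParsedDate(day, month, year, 1 <= day <= days_in_month)


print(parse_fields('30 Feb 2023'))
# ParsedDate(day=30, month=2, year=2023, is_valid=False)
```

Keeping the validity check separate from the parsing means the caller can log, reject, or repair the invalid rows explicitly instead of hitting an exception in the middle of parsing.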
Alternatively, you could explore third-party date parsing libraries that might better cater to your needs. Please bear in mind that dealing with invalid dates always requires careful consideration to avoid data inconsistencies and errors.
> So, my question is, why isn't there a function available that exclusively performs parsing and provides the parsed fields without performing validation? Such a function would enable users to utilize the results as needed.
There is probably one out there somewhere. Custom needs like this are usually met by third-party packages.
@sunmy2019 Indeed, I'll have to choose a custom solution after all.
Thank you both anyway!
I don't know if this has already been discussed, but I want to parse some date strings that are not always valid (e.g. 30 Feb 2023).
When I try to use datetime.strptime, I get an error, which is raised not because there is a problem with parsing the string, but because the parsed fields (i.e. day, month, year) cannot form a valid datetime object. Is it possible to get the parsed fields anyway?
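For reference, a minimal reproduction of the behaviour described above might look like this. The format string '%d %b %Y' is an assumption based on the example date, and `%b` depends on the current locale (an English or C locale is assumed here).

```python
from datetime import datetime

FORMAT = '%d %b %Y'  # assumed format for strings like '30 Feb 2023'

# A valid date parses as expected.
valid = datetime.strptime('28 Feb 2023', FORMAT)
print(valid)  # 2023-02-28 00:00:00

# The string matches the format, but the fields do not form a real date,
# so strptime raises ValueError instead of returning the parsed fields.
try:
    parsed = datetime.strptime('30 Feb 2023', FORMAT)
except ValueError as exc:
    parsed = None
    print(f'ValueError: {exc}')
```

The exact wording of the error message can differ between Python versions, so it is safer to catch the ValueError than to match on its text.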