Closed Dortaj closed 1 month ago
Hi @Dortaj, thanks for raising this!
The `holidays` package doesn't provide any numpy-related functionality directly. It seems the issue is at a higher level of abstraction. If you provide some code, we might be able to help you better. Perhaps you would get better results if you used `date` instead of `datetime` objects (just my guess).
Meanwhile, here is an AI-generated response based on the issue text:
The issue arises because `np.isin()` is designed to work with arrays where direct equality comparisons are valid, such as numbers or strings. However, `np.isin()` does not work correctly with complex data types like `datetime` objects, especially when comparing against a set or collection that involves more complex membership logic, like the `holidays` object.
Expected Behavior
The expectation is that `np.isin()` should behave similarly to the `in` operator applied element-wise to check membership within a `holidays` object. This would allow you to check if any datetime objects in the numpy array are considered holidays.
Actual Behavior
`np.isin()` fails to recognize datetime objects as being in the `holidays` object, even when some of them are indeed holidays. The function returns `False` for all elements, indicating none of them are holidays, which is incorrect.
Steps to Reproduce
```python
import numpy as np
import datetime
import holidays

# Create an np.array of datetime objects
date_array = np.array([
    datetime.datetime(2024, 12, 25),  # Christmas, should be a holiday
    datetime.datetime(2024, 12, 31),  # New Year's Eve, might be a holiday
    datetime.datetime(2024, 11, 1)    # A regular day, not a holiday
])

# Create a holidays object
us_holidays = holidays.UnitedStates(years=[2024])

# Attempt to use np.isin to check for holidays
result = np.isin(date_array, us_holidays)
print(result)  # Expecting [True, False, False], but will likely get [False, False, False]
```
Problem Explanation
`np.isin()` is likely failing because it's performing a straightforward equality check, which doesn't account for the way `datetime` objects need to be compared to the entries in the `holidays` object.
Workaround
You can achieve the desired behavior by checking membership element by element:
```python
# Element-wise approach using a list comprehension
result = np.array([date in us_holidays for date in date_array])
print(result)  # This should return [True, False, False]
```
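
The element-wise workaround can be wrapped in a small helper. This is only a sketch: `holiday_mask` is a hypothetical name, not part of `holidays` or numpy, and a plain set of dates stands in for the `holidays` object so the example runs without that package. The helper drops the time part of any `datetime` so that it also works with containers that only hold plain dates.

```python
import datetime

import numpy as np


def holiday_mask(values, calendar):
    """Return a boolean array marking which values are in the calendar."""
    def to_date(value):
        # Drop the time part so the value can match plain date entries.
        if isinstance(value, datetime.datetime):
            return value.date()
        return value

    return np.array([to_date(v) in calendar for v in values], dtype=bool)


# Assumption: a set of dates stands in for the holidays object.
calendar = {datetime.date(2024, 12, 25)}
date_array = np.array([
    datetime.datetime(2024, 12, 25, 9, 30),
    datetime.datetime(2024, 12, 31),
    datetime.date(2024, 11, 1),
])

mask = holiday_mask(date_array, calendar)
print(mask)
```

Passing a real `holidays` object as `calendar` should work the same way, since the helper only relies on the `in` operator.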
Environment Details
- OS: MacOS
- Python version: 3.12
- holidays version: 0.54
This issue is less about a bug in numpy and more about a limitation in `np.isin()` with certain complex data types like `datetime`. The workaround using a list comprehension or another element-wise approach should give you the correct results.
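
The `date` versus `datetime` guess at the top of this comment can be checked with the standard library alone. In the sketch below, a plain dict keyed by `datetime.date` stands in for the `holidays` object (an assumption made so the example has no third-party dependencies). It shows that a `datetime` never compares equal to a `date`, so any comparison based on plain equality will miss it.

```python
import datetime

# Assumption: a dict keyed by plain dates stands in for the holidays object.
calendar = {datetime.date(2024, 12, 25): "Christmas Day"}

as_datetime = datetime.datetime(2024, 12, 25)
as_date = as_datetime.date()

# A datetime does not compare equal to a date, even at midnight.
equal_to_key = as_datetime == datetime.date(2024, 12, 25)

# Plain dict lookup therefore misses the datetime but finds the date.
datetime_found = as_datetime in calendar
date_found = as_date in calendar

print(equal_to_key, datetime_found, date_found)  # False False True
```

Converting with `.date()` before comparing is one way to sidestep the mismatch.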
According to the docs, `element` and `test_elements` are converted to arrays if they are not already.
For correct conversion and comparison, you should use something like:
```python
import datetime

import holidays
import numpy as np

date_array = np.array([
    datetime.date(2024, 12, 25),  # Christmas, should be a holiday
    datetime.date(2024, 12, 31),  # New Year's Eve, might be a holiday
    datetime.date(2024, 11, 1)    # A regular day, not a holiday
])
us_holidays = holidays.UnitedStates(years=[2024])
# Pass the holiday dates as a list so np.isin sees one element per date
result = np.isin(date_array, list(us_holidays.keys()))
print(result)
```
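
The array conversion mentioned above can be observed directly. The following sketch uses a plain dict keyed by dates as a hypothetical stand-in for the `holidays` object, so it needs only numpy. It shows that `np.asarray` wraps the whole mapping in a single zero-dimensional object array, which is why `np.isin` matches nothing, while a list of the keys is compared date by date.

```python
import datetime

import numpy as np

# Assumption: a dict keyed by dates stands in for the holidays object.
calendar = {
    datetime.date(2024, 12, 25): "Christmas Day",
    datetime.date(2024, 7, 4): "Independence Day",
}

date_array = np.array([
    datetime.date(2024, 12, 25),
    datetime.date(2024, 12, 31),
    datetime.date(2024, 11, 1),
])

# The mapping becomes one 0-d object array, so np.isin compares
# each date against the dict itself rather than against its keys.
wrapped = np.asarray(calendar)
mask_from_mapping = np.isin(date_array, calendar)

# Passing the keys as a list gives np.isin one element per date.
mask_from_keys = np.isin(date_array, list(calendar.keys()))

print(wrapped.shape, wrapped.dtype)
print(mask_from_mapping)
print(mask_from_keys)
```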
Bug Report
If we have an np.array of datetimes and want to check whether they exist in a holidays object, we cannot use np.isin. Applying `in` to each element separately works fine, which means we have to resort to other tricks.
Expected Behavior
np.isin should behave the same as applying `in` to each element of a given np.array of datetimes.
Actual Behavior
Currently, np.isin does not recognize anything: it reports that none of the elements exist in holidays, even those that do. If we use `in` on each of these elements individually, they are detected correctly.
Steps to Reproduce the Problem
Environment