Ouranosinc / xscen

A climate change scenario-building analysis framework.
https://xscen.readthedocs.io/
Apache License 2.0

Incorrect behaviour of subset_file_coverage when multiple files are in range #226

Closed RondeauG closed 1 year ago

RondeauG commented 1 year ago

Setup Information

Description

Since the recent change, catalog.subset_file_coverage fails when multiple files are located within the interval, because guessed_length results in multiple values instead of just one:

guessed_length = pd.IntervalIndex.from_arrays(
    intervals[files_in_range].map(
        lambda x: max(x.left, period_interval.left)
    ),
    intervals[files_in_range].map(
        lambda x: min(x.right, period_interval.right)
    ),
).length

Which gives: ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

Steps To Reproduce

Search for multiple years of CRCM5 data (which is saved monthly).

Additional context

Applying sum(guessed_length, datetime.timedelta()) will make the line guessed_length / period_length < coverage work as intended.
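For illustration, here is a minimal standalone sketch of the failure and of the proposed fix. It does not use xscen itself: the monthly intervals, the requested period, and the coverage threshold are made-up stand-ins for what the catalog would provide, and only the variable names follow the snippet above. The exact wording of the error may vary between pandas and NumPy versions.

```python
import datetime

import pandas as pd

# Hypothetical stand-in for monthly files covering the year 2000.
edges = pd.date_range("2000-01-01", periods=13, freq="MS")
intervals = pd.IntervalIndex.from_arrays(edges[:-1], edges[1:])

# Hypothetical requested period and coverage threshold.
period_interval = pd.Interval(pd.Timestamp("2000-03-01"), pd.Timestamp("2000-09-01"))
period_length = period_interval.length
coverage = 0.99

# Boolean mask of the files that overlap the requested period.
files_in_range = intervals.overlaps(period_interval)

# Same expression as in the issue: one clipped length per file in range.
guessed_length = pd.IntervalIndex.from_arrays(
    intervals[files_in_range].map(lambda x: max(x.left, period_interval.left)),
    intervals[files_in_range].map(lambda x: min(x.right, period_interval.right)),
).length

# With several files in range, the comparison is element-wise, so using it
# in an `if` raises ValueError.
try:
    if guessed_length / period_length < coverage:
        pass
    raised = False
except ValueError:
    raised = True

# Proposed fix: add up the per-file lengths before comparing.
total_length = sum(guessed_length, datetime.timedelta())
insufficient = total_length / period_length < coverage
```

With this toy data, six monthly files fall inside the period and together they cover it entirely, so the summed ratio is a single scalar that can be compared against the threshold.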

Found by @vindelico

Contribution

aulemahal commented 1 year ago

Can we get an MWE of the bug with a traceback?

~It does not happen on my side.~

EDIT: My bad, the search method will pass coverage=0, thus the code in the issue is simply not executed. I overlooked that when implementing this!