Open tomvothecoder opened 5 hours ago
@xCDAT/core-developers Any input would be appreciated here. Thanks.
I think you can go with option 1 (drop the assertion). Even though I raised a concern about removing this, after reviewing the code carefully I don't think it is needed. Do you remember why it was put there in the first place?
I don't think I helped write this section of code, but I sometimes add these kinds of tests to make sure NaNs aren't messing things up. I don't think that should be an issue in this instance.
> Do you remember why it was put there in the first place?
I based my initial implementation of `_get_weights()` on the Xarray notebook here, which includes that validation code. It was good to have while implementing the logic in xCDAT, but I don't think it is necessary in production.
Is your feature request related to a problem?
In PR #689, in the `TemporalAccessor._get_weights()` method, I removed the validation that checks that the sum of weights for each time group adds up to 1 (related code). This was done because:
- [...] `-O` flag), not good practice. I copied this validation code from an Xarray notebook (link) without realizing the implications.
- `np.testing.assert_allclose()` degrades performance. I expect the degradation to scale with the size of the data; maybe it loads data into memory with `.values`?
Describe the solution you'd like
What is an assertion and when to use it
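For context on the `-O` concern, here is a minimal sketch of how Python treats a plain `assert` statement at different optimization levels. It uses the built-in `compile()` with its `optimize` argument so no second interpreter has to be launched. The variable `total`, the helper `assertion_fires`, and the message text are made up for illustration and are not xCDAT code. It only demonstrates the behavior of the `assert` statement itself.

```python
# Hypothetical one-line check written as a plain `assert` statement.
SOURCE = "assert total == 1.0, 'weights do not sum to 1'"


def assertion_fires(optimize: int) -> bool:
    """Compile SOURCE at the given optimization level and report whether
    the assert raises when `total` is deliberately wrong."""
    # optimize=0 keeps assert statements; optimize=1 matches `python -O`,
    # which removes them at compile time.
    code = compile(SOURCE, "<weights-check>", "exec", optimize=optimize)
    try:
        exec(code, {"total": 0.9})
    except AssertionError:
        return True
    return False


print(assertion_fires(0))  # the check is enforced
print(assertion_fires(1))  # the check is silently skipped
```

A check that has to survive in production would therefore need to be an explicit `if ...: raise ...` rather than a bare `assert`.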
Option 1: Don't keep this assertion
I think the sum of weights for each group should always be 1.0 (100%) based on our implementation logic. If the assertion ever failed, it would indicate a coding error on our part rather than bad data (although we could raise an exception if the failure really is due to bad time bounds). As far as I can tell, our implementation logic is right and nobody has run into `_get_weights()` throwing an `AssertionError` from the validation code.
The `_get_weights()` method works by:
1. Calculating the length of each time interval (`time_lengths`)
2. Grouping the time lengths (`grouped_time_lengths`)
3. Dividing each time length by the sum of its group (`grouped_time_length / grouped_time_lengths.sum()`)
https://github.com/xCDAT/xcdat/blob/e9e73dd98e20b77e690fd3b4cf6df947a9f58e60/xcdat/temporal.py#L1213-L1262
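The three steps above can be sketched in plain Python to show why the per-group sums come out to 1. This is not the xCDAT implementation (which works on xarray objects); the time bounds, the grouping by year, and every variable other than `time_lengths` are assumptions chosen for illustration.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical monthly time bounds (start, end): Jan-Mar 2000 and Jan-Feb 2001.
time_bounds = [
    (datetime(2000, 1, 1), datetime(2000, 2, 1)),
    (datetime(2000, 2, 1), datetime(2000, 3, 1)),
    (datetime(2000, 3, 1), datetime(2000, 4, 1)),
    (datetime(2001, 1, 1), datetime(2001, 2, 1)),
    (datetime(2001, 2, 1), datetime(2001, 3, 1)),
]

# Step 1: length of each time interval, in days.
time_lengths = [(end - start).total_seconds() / 86400 for start, end in time_bounds]

# Step 2: group the lengths (here, by the year of each interval's start).
group_keys = [start.year for start, _ in time_bounds]
group_totals = defaultdict(float)
for key, length in zip(group_keys, time_lengths):
    group_totals[key] += length

# Step 3: divide each length by the total of its group.
weights = [length / group_totals[key] for key, length in zip(group_keys, time_lengths)]

# Sum the weights per group to see what the removed validation was checking.
weight_sums = defaultdict(float)
for key, weight in zip(group_keys, weights):
    weight_sums[key] += weight

print(dict(weight_sums))
```

With these made-up bounds, the 2000 group spans 91 days and the 2001 group 59 days, and each group's weights sum to 1 up to floating-point rounding, because every weight is divided by its own group total.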
Option 2: Keep this assertion
The time bounds might be incorrect for some reason (e.g., missing data), which could produce incorrect weights for certain groups. I'm not sure here, so I will try experimenting with bad time bounds.
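One way to frame that experiment is sketched below in plain Python. The helpers `normalize` and `check_weights` and the interval lengths are hypothetical and not part of xCDAT. Reductions in xarray may treat NaN differently from Python's built-in `sum`, so this does not show what `_get_weights()` itself would do. The check raises a `RuntimeError` explicitly instead of relying on an assertion.

```python
import math


def normalize(lengths):
    """Divide each interval length by the group total."""
    total = sum(lengths)
    return [length / total for length in lengths]


def check_weights(weights, rel_tol=1e-7):
    """Raise RuntimeError (not AssertionError) if the weights do not sum to 1."""
    total = sum(weights)
    if not math.isclose(total, 1.0, rel_tol=rel_tol):
        raise RuntimeError(f"Weights sum to {total}, expected 1.0")


# A truncated interval changes the individual weights, but the sum is still 1.
gappy = normalize([31.0, 10.0, 31.0])
check_weights(gappy)  # passes silently

# A NaN length propagates through the built-in sum, so the check fails.
broken = normalize([31.0, float("nan"), 31.0])
try:
    check_weights(broken)
    nan_detected = False
except RuntimeError:
    nan_detected = True

print(nan_detected)
```

In this sketch, a merely wrong interval length cannot break the sum-to-one property, since the weights are normalized by their own total. Only values like NaN make the check fail.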
If we want to keep this assertion:
- [...] `np.testing.assert_allclose()` in production since it raises an `AssertionError` that can be turned off by the user
- [...] `np.testing.assert_allclose()`
- [...] `RuntimeError` if numpy assertion error is raised
Describe alternatives you've considered
No response
Additional context
After this ticket is addressed, we can proceed with releasing v0.7.2.