TimeEval / GutenTAG

GutenTAG is an extensible tool to generate time series datasets with and without anomalies; integrated with TimeEval.
MIT License

Type Mismatch Error When Using Integer Data with Custom Input #30

Closed (B-Deforce closed this issue 1 year ago)

B-Deforce commented 1 year ago

Hello,

I ran into an issue when using the custom_input feature with my own dataset, which contains integer values. When I tried to inject an anomaly into my data, I received a numpy.core._exceptions._UFuncOutputCastingError. The error occurs because the apply_variations function in the Consolidator class adds float64 values (from bo.noise, bo.trend_series, and bo.offset) in place to my int64 time series data, and NumPy refuses to cast the float64 result back to int64.

Here is the error message:

numpy.core._exceptions._UFuncOutputCastingError: Cannot cast ufunc 'add' output from dtype('float64') to dtype('int64') with casting rule 'same_kind'

This error occurred at the following line in the consolidator.py file:

self.timeseries[:, c] += bo.noise + bo.trend_series + bo.offset
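
The failure can be reproduced outside GutenTAG with plain NumPy. The snippet below is only a minimal sketch: `add_in_place` is a hypothetical helper written for this illustration, and the arrays are made-up stand-ins for `self.timeseries` and the float-valued variations. It relies on the fact that NumPy's ufunc casting errors derive from `TypeError`.

```python
import numpy as np

# Stand-in for an integer-valued custom input with two channels (made-up data).
timeseries = np.arange(10, dtype=np.int64).reshape(5, 2)
# Stand-in for bo.noise + bo.trend_series + bo.offset, which are float64.
variation = np.full(5, 0.5, dtype=np.float64)


def add_in_place(ts, values, channel=0):
    """Hypothetical helper: add values to one channel in place.

    Returns the exception class name if NumPy rejects the operation,
    otherwise None.
    """
    try:
        ts[:, channel] += values
    except TypeError as exc:  # NumPy's ufunc casting errors subclass TypeError
        return type(exc).__name__
    return None


# int64 target: the float64 result cannot be written back under 'same_kind'.
int_outcome = add_in_place(timeseries, variation)
# float64 target: the same in-place addition succeeds.
float_series = timeseries.astype(np.float64)
float_outcome = add_in_place(float_series, variation)

print(int_outcome, float_outcome)
```

The integer array is left unchanged by the failed addition, while the float copy receives the variation as expected.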

The documentation for the CustomInput class does not explicitly state that the input data must be of type float, but I assume that GutenTAG is generally designed to work with floating-point time series data.

Suggested Fix:

To prevent this error, I suggest adding a simple check in the CustomInput class to ensure that the input data is of type float. If the data is of type integer, we can automatically convert it to float (maybe with a warning to inform the user).

# Convert integer columns (any width, signed or unsigned) to float.
if any(dtype.kind in "iu" for dtype in df.dtypes):
    df = df.astype(float)
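
A slightly fuller sketch of the same idea, including the warning mentioned above, might look like the following. `ensure_float` is a hypothetical name and is not part of GutenTAG's API; where such a check would actually live inside CustomInput is left open here.

```python
import warnings

import numpy as np
import pandas as pd


def ensure_float(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper (not GutenTAG API): cast integer columns to float64.

    Columns that are not integer-typed are left untouched. A UserWarning
    tells the caller that a conversion happened.
    """
    int_cols = df.select_dtypes(include=[np.integer]).columns
    if len(int_cols) == 0:
        return df
    warnings.warn(
        f"Casting integer columns {list(int_cols)} to float64.",
        UserWarning,
        stacklevel=2,
    )
    # astype returns a new DataFrame, so the caller's input is not modified.
    return df.astype({col: np.float64 for col in int_cols})


# Made-up example data: one integer column and one float column.
df = pd.DataFrame({"value": [1, 2, 3], "other": [0.5, 1.5, 2.5]})

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    converted = ensure_float(df)

print(converted.dtypes)
```

Selecting columns by dtype rather than checking only the first column keeps the conversion correct for multivariate inputs with mixed types.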

CodeLionX commented 1 year ago

Hi Boje,

Thanks for spotting this bug and suggesting a fix. You are right that we were short-sighted about the input types we support, and your suggestion makes sense!