biolab / orange3-timeseries

🍊 :chart_with_upwards_trend: Orange add-on for analyzing, visualizing, manipulating, and forecasting time series data.
Other
62 stars 41 forks source link

Moving Transform: duplicated variable names #181

Closed wvdvegte closed 2 years ago

wvdvegte commented 2 years ago
Timeseries version

0.3.12

Orange version

3.30.2, Mac OS 11.6.1 (20G224) on Intel

Expected behavior

Moving Transform will work with a given time series dataset independent of preceding widgets

Actual behavior

Moving Transform crashes if appended to a larger workflow. If I save the data that it has to be applied to in a separate file, and create a new workflow with Moving Average acting on that file, it works as expected.

Steps to reproduce the behavior

Load the file "moving transform in larger workflow test.ows", load the file "coffee vending machine.xlsx" in its file widget. Apply the workflow: the Moving Transform widget will crash. Save the data that enters Moving Transform in a separate file (attached as "saved data before moving average.csv"), open the file "moving transform test.ows" and use the csv file as its input. Moving Transform will work as expected, with exactly the same input data.

Additional info (worksheets, data, screenshots, ...)

Zip file attached

Crash report: Error encountered in widget Moving Transform: Traceback (most recent call last): File "/Applications/Orange3.app/Contents/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/orangewidget/gui.py", line 1981, in do_commit commit() File "/Applications/Orange3.app/Contents/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/orangecontrib/timeseries/widgets/owmovingtransform.py", line 277, in commit ts = moving_transform(data, self.table_model, self.non_overlapping and self.fixed_wlen) File "/Applications/Orange3.app/Contents/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/orangecontrib/timeseries/functions.py", line 546, in moving_transform ts = Timeseries.from_numpy(Domain(data.domain.attributes + tuple(attrs), File "/Applications/Orange3.app/Contents/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/Orange/data/domain.py", line 153, in init raise Exception('All variables in the domain should have' Exception: All variables in the domain should have unique names.

Archive.zip

janezd commented 2 years ago

Note to whomever works on this: the problem are non-unique names of variables.

The widget actually calls a function timeseries.widgets.utils.available_name, which adds numbers to non-unique names. This should be changed to using the "standard" Orange.data.util.get_unique_names.

However, neither of these functions works if the proposed names themselves are non-unique. The problem in the attached workflow is that the user intended to create three variables with the same name, so any of above functions would (I think) assign them the same number.

Here's a to-do

wvdvegte commented 2 years ago

It was me who created that workflow, and I wasn't aware that it is sending variables with non-unique names to Moving Transform. Actually, Moving Transform receives the following variables:

- drink_vol_cumul
- cold_vol_cumul
- hot_vol_cumul
- Drip_tray_full
- Maintenance_alert_active
- Drip_tray_full (diff; order=1)
- Maintenance_alert_active (diff; order=1)
- drink_vol_cumul (diff; order=1)
- cold_vol_cumul (diff; order=1)
- hot_vol_cumul (diff; order=1)

These are all different, but some of them only differ in the suffix that was automatically added by Difference. Could it be that variable names are truncated somewhere in the process, and therefore are seen as identical?

janezd commented 2 years ago

No, Moving Average was not receiving non-unique variables -- it is impossible to create data with non-unique names. The workflow crashed because Moving Average tried to do it.

As I loaded the scheme, the Moving Transform widget was set to create four (or three?) moving averages of one and the same variable. This would -- due to an error in the code -- create four variables with the same name.

It won't be difficult to fix, but the crucial part of the fix will be in Orange's core, not the add-on.