biolab / orange3-timeseries

🍊 :chart_with_upwards_trend: Orange add-on for analyzing, visualizing, manipulating, and forecasting time series data.
Other
62 stars 41 forks source link

Moving Transform: Reimplementation #197

Closed janezd closed 2 years ago

janezd commented 2 years ago
Issue

Widget is outdated, buggy, occasionally suboptimal.

Description of changes

No longer relevant

Fixed elsewhere

Other improvements

To do

janezd commented 2 years ago

Would anybody be interested in checking this widget out? E.g. @ajdapretnar? It still needs testing, unit tests and documentation, but I'd like to have some feedback about its functionality first.

(As for settings migrations - this new widget merges two old ones, and also changes the way settings work, so there'll probably be no migration.)

ajdapretnar commented 2 years ago

I tested quite thoroughly, I think:

Not to be just critical. I think the widget is a big improvement over the previous one. I love the layout. We should think about renaming the widget, though, now that it includes more than a single transformation. In pandas, aggregation is, well, aggregation, moving transform is rolling or smoothing. Moving transform currently doesn't imply aggregation. Perhaps Transform Timeseries? Suggestions welcome.

janezd commented 2 years ago

Thanks for thorough testing and apologies for trivial mistakes that I should have discovered myself. (As excuse, I wanted an overview of functionality before writing unit tests, which would find at least some of those. Still, great thanks for testing!)

Here are some comments about thinks I would keep as they are.

if I increase the window beyond the size of the data, widget crashes with: IndexError: too many indices for array: array is 1-dimensional, but 2 were indexed. We should probably set max spin box value to N?

Setting the spin is more dangerous; it essentially introduces a context setting because the spin needs to be reset for new data. Instead, the widget will output a zero-length table and show a warning. This (minus the warning) already happens for some functions.

I probably don't understand what "Consecutive blocks" does. I thought it like a sliding window, but without overlaps. So like [[1, 2, 3, 4, 5], [6, 7, 8, 9, 10], [11, 12, 13, 14, 15]]. But regardless of what I do with block width, I only get a single instance on the output.

Can you give me a more concrete instructions for reproducing this? It works for me.

Aggregate time periods outputs "Defined count" by default, without it being selected in the aggregation functions. The UI should correspond to the output, i.e. "Defined count" should be marked in the list and the user should be able to deselect it.

This is not "Defined count" but "Count". It's the number of instances in a time slot. Checkboxes work for individual attributes, so this option does not belong there. It could be a checkbox in the left part, below the "Aggregate time periods" radio. But unless you strongly object against this column, I'd keep it. It doesn't hurt and it will often be useful.

Since transform options are discrete (radio buttons), wouldn't it be nicer to simply disable the aggregations that don't apply to Consecutive blocks?

Unfortunately no. These checkboxes allow you to control the currently selected attributes. If the user changes transform options, the previously chosen aggregations would still be in effect. The only option would be to disable checkboxes and remove those aggregations (i.e. what you see in the listview). This would be much simpler than current solution, but could lead to situations in which a user would set something up, decide to try different options and lose her settings. That's why aggregations stay but we show a warning.

If I disable checkboxes, I only prevent the user from deselecting options that have to be deselected to avoid the warning.

This problem made me think whether I should split this into three widgets ... but I decided against because they'd be basically the same.

ajdapretnar commented 2 years ago

Instead, the widget will output a zero-length table and show a warning.

Agree.

But unless you strongly object against this column, I'd keep it.

I am not saying it is useless, but it is counterintuitive to me to have something on the output that I didn't select in the settings. I don't strongly object, but it might be annoying for the user who wishes to keep exactly what she sets (thus requiring an additional Select Columns to get rid of it).

That's why aggregations stay but we show a warning.

Ok.

As for the Consecutive blocks, the issue is with the spin box. It looks like it takes the wrong spin box value, that is from the Sliding window not its own. I think this is the culprit: https://github.com/janezd/orange3-timeseries/blob/cb755dc94051912a496c88f04ec1cc75984ed14b/orangecontrib/timeseries/widgets/owmovingtransform.py#L478

codecov-commenter commented 2 years ago

Codecov Report

Merging #197 (b65306b) into master (bad9404) will decrease coverage by 3.38%. The diff coverage is 41.07%.

@@            Coverage Diff             @@
##           master     #197      +/-   ##
==========================================
- Coverage   73.20%   69.81%   -3.39%     
==========================================
  Files           7        7              
  Lines         821      858      +37     
  Branches      161      163       +2     
==========================================
- Hits          601      599       -2     
- Misses        168      210      +42     
+ Partials       52       49       -3     
Impacted Files Coverage Δ
orangecontrib/timeseries/functions.py 68.29% <41.07%> (-8.80%) :arrow_down:

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update bad9404...b65306b. Read the comment docs.