Quasars / orange-spectroscopy

Other
52 stars 58 forks source link

[ENH] OWTileFile: Multi-input for Preprocessor chaining #638

Closed stuart-cls closed 1 year ago

stuart-cls commented 1 year ago

Implements #634, depends on #637 for OWIntegrate testing.

A few notes/items for discussion:

Feedback welcome!

codecov-commenter commented 1 year ago

Codecov Report

Merging #638 (8ff3f80) into master (f3250d8) will decrease coverage by 0.06%. The diff coverage is 73.33%.

@@            Coverage Diff             @@
##           master     #638      +/-   ##
==========================================
- Coverage   89.31%   89.24%   -0.07%     
==========================================
  Files          71       71              
  Lines       11817    11829      +12     
==========================================
+ Hits        10554    10557       +3     
- Misses       1263     1272       +9     
markotoplak commented 1 year ago
  * Coming back to [Integrate: Add "data" output which always contains spectra + results in meta #439](https://github.com/Quasars/orange-spectroscopy/pull/439), we had talked about just getting rid of this checkbox and have the Preprocessor output simply always do `move_metas=True` + `metas=True` (which is the equivalent of multi-method `metas=False` from Preprocess Spectra)

I am nor sure if we should really do that still. I think that what PCA widget does with its two semi-equivalent outputs is confusing. So the problem that #439 would aim to solve is to avoid a select columns (if results were in the metas) or a possible merge widget, if they were not, right?

Do I understand the following correctly: when using the Tile Reader the not-in-metas output is actually preferred because it saves memory when generating the final table? And that is the reason why we still have it?

* I found that the WidgetPreview doesn't support multi-input: a bug? I worked around it with some awkward syntax.

Yes, I also remember using the same syntax once. No-one thought of a nicer way, I think.

* I didn't update the code-copy from OWFile, if this is worth doing I will do in a separate PR

Which copy code do you mean?

stuart-cls commented 1 year ago

So the problem that https://github.com/Quasars/orange-spectroscopy/pull/439 would aim to solve is to avoid a select columns (if results were in the metas) or a possible merge widget, if they were not, right?

Yes, I think that was the idea. Honestly I can't remember the workflows where I was using this very well anymore, so I'm okay to drop it and revisit if we think of a good reason later.

Do I understand the following correctly: when using the Tile Reader the not-in-metas output is actually preferred because it saves memory when generating the final table? And that is the reason why we still have it?

Yes, that is the entire point of the Tile Reader. The integrate one is tricky because while the Integrate preprocessor supports multiple integration regions (this is what comes out of the Preprocess Spectra:Preprocessors output), the OWIntegrate widget does not (in order to support sequential independent integration regions). And so we have the PreprocessorListMoveMetas work-around. Which is all fine, I was just pointing out that the text representation in the info box in TileFile is now not quite accurate.

I should also say that I often use the not-in-metas output of the Integrate widget anyway, I find it conceptually simpler to think of it as a dimensionality-reduction widget. If I need the original spectra, I just use the nice Select by ID widget.

Which copy code do you mean?

OWTileFile is just a sub-classed OWFile. But OWFile is not allowed to be sub-classed, so I just copied the code from OWFile. It's not pretty and I think is giving some warnings because it now lags behind the current OWFile implementation.

markotoplak commented 1 year ago

I should also say that I often use the not-in-metas output of the Integrate widget anyway, I find it conceptually simpler to think of it as a dimensionality-reduction widget. If I need the original spectra, I just use the nice Select by ID widget.

I agree that this type of use is conceptually the cleanest. But might be hard to grasp. :)

markotoplak commented 1 year ago

I like this. Also, the included workflow makes this easier for users. Good job!

Now we should also make Test & Score (and company) behave the same way with regard to multi inputs.

I am merging this as is, but I have one suggestion for the future. It was not that clear to me that I should have reloaded the file when I change preprocessors. After a while, I did notice a warning within the preprocessor list. Perhaps it would be better to use widget warnings/info for that?

stuart-cls commented 1 year ago

Great!

I was sure I did have a widget warning for that :D I'm fixing it along with another bug I found, PR shortly