FluxML / MLJFlux.jl

Wrapping deep learning models from the package Flux.jl for use in the MLJ.jl toolbox
http://fluxml.ai/MLJFlux.jl/
MIT License
145 stars 17 forks source link

Sorting out issue with calling predict on matrix #219

Closed pat-alt closed 1 year ago

pat-alt commented 1 year ago

Closes #218

pat-alt commented 1 year ago

I've updated the scitype declarations for regression (here and here) and for classification (here). For testing, I thought I'd just rerun basic tests with a matrix input (see here). Unfortunately, I can't immediately make sense of the error I'm getting.

ablaom commented 1 year ago

Thanks for persisting with this.

This error usually means we're trying to grab the schema of an object that doesn't have one. I believe the culprit is here. We could add a second method dispatching on matrices to address. There are similar issues for shape in regressor.jl.

pat-alt commented 1 year ago

Thanks! I believe that's fixed now, though I did still get some warnings like this one:

Warning: The number and/or types of data arguments do not match what the specified model
│ supports. Suppress this type check by specifying `scitype_check_level=0`.
│ 
│ Run `@doc MLJFlux.NeuralNetworkRegressor` to learn more about your model's requirements.
│ 
│ Commonly, but non exclusively, supervised models are constructed using the syntax
│ `machine(model, X, y)` or `machine(model, X, y, w)` while most other models are
│ constructed with `machine(model, X)`.  Here `X` are features, `y` a target, and `w`
│ sample or class weights.
│ 
│ In general, data in `machine(model, data...)` is expected to satisfy
│ 
│     scitype(data) <: MLJ.fit_data_scitype(model)
│ 
│ In the present case:
│ 
│ scitype(data) = Tuple{Table{AbstractVector{Continuous}}, AbstractVector{Continuous}}
│ 
│ fit_data_scitype(model) = Tuple{Union{Table{<:AbstractVector{<:Continuous}}, AbstractMatrix{Continuous}}, AbstractVector{<:Finite}}

Edit: As for docstrings, I've updated for the classifier and regressors as follows:

- `X` is either a `Matrix` or any table of input features (eg, a `DataFrame`) whose columns are of scitype
  `Continuous`; check column scitypes with `schema(X)`. If `X` is a `Matrix`, it is assumed to have columns corresponding to features and rows corresponding to observations.
ablaom commented 1 year ago

Thanks! I believe that's fixed now, though I did still get some warnings like this one:

According to the warning you are providing the regressor with a Continuous target, rather than a Finite one (categorical vector).

ablaom commented 1 year ago

Sorry, rather the scitype declaration for the regressor is wrong. You need Continuous in the target_scitype where you appear to have Finite.

codecov-commenter commented 1 year ago

Codecov Report

Patch coverage: 100.00% and project coverage change: +0.14 :tada:

Comparison is base (452c09d) 92.73% compared to head (49d11af) 92.88%.

:mega: This organization is not using Codecov’s GitHub App Integration. We recommend you install it so Codecov can continue to function properly for your repositories. Learn more

Additional details and impacted files ```diff @@ Coverage Diff @@ ## dev #219 +/- ## ========================================== + Coverage 92.73% 92.88% +0.14% ========================================== Files 11 11 Lines 303 309 +6 ========================================== + Hits 281 287 +6 Misses 22 22 ``` | [Impacted Files](https://codecov.io/gh/FluxML/MLJFlux.jl/pull/219?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Flux) | Coverage Δ | | |---|---|---| | [src/classifier.jl](https://codecov.io/gh/FluxML/MLJFlux.jl/pull/219?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Flux#diff-c3JjL2NsYXNzaWZpZXIuamw=) | `100.00% <100.00%> (ø)` | | | [src/core.jl](https://codecov.io/gh/FluxML/MLJFlux.jl/pull/219?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Flux#diff-c3JjL2NvcmUuamw=) | `94.87% <100.00%> (+0.13%)` | :arrow_up: | | [src/regressor.jl](https://codecov.io/gh/FluxML/MLJFlux.jl/pull/219?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Flux#diff-c3JjL3JlZ3Jlc3Nvci5qbA==) | `100.00% <100.00%> (ø)` | | | [src/types.jl](https://codecov.io/gh/FluxML/MLJFlux.jl/pull/219?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Flux#diff-c3JjL3R5cGVzLmps) | `100.00% <100.00%> (ø)` | | Help us with your feedback. Take ten seconds to tell us [how you rate us](https://about.codecov.io/nps?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Flux). Have a feature suggestion? [Share it here.](https://app.codecov.io/gh/feedback/?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Flux)

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Do you have feedback about the report comment? Let us know in this issue.

pat-alt commented 1 year ago

Sorry, rather the scitype declaration for the regressor is wrong. You need Continuous in the target_scitype where you appear to have Finite.

You're right, thanks. Fixed that now, but I still get some broken tests (e.g. here. Is this expected?

ablaom commented 1 year ago

Fixed that now, but I still get some broken tests (e.g. here. Is this expected?

Yes. Some of the broken tests are because you are looking at CPU tests, where some tests are excluded and get marked as broken.

The remaining broken tests are explained here https://github.com/FluxML/MLJFlux.jl/issues/87 .

So no further action on your part needed here. I will finish my review shortly, thanks.

pat-alt commented 1 year ago

Thanks, good idea! I've made the necessary adjustments and updated tests accordingly. Had to make one additional small modification to the collate function. I've also updated the docstring.