Open caitisgreat opened 3 years ago
The additions seem reasonable. The request is to add scoreColumnName and predictedLabelColumnName to the CrossValidate API. These would control the names of the output columns when one of the cross-validation folds' models is run.
Workaround
You can use the CopyColumns transform (docs) to give the output Score and PredictedLabel columns new/unique names. The runtime/memory cost is negligible.
Most often I use a Concatenate transform for this purpose (example), though it has the side effect of upgrading the column to a vector type.
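The CopyColumns workaround above can be sketched roughly as follows. This is a minimal illustration, not the code from the issue: the Row class, the synthetic in-memory data, the choice of the SdcaMaximumEntropy trainer, and the column names TopicScore and TopicPredictedLabel are all assumptions made for the example. Because CopyColumns keeps the source columns, CrossValidate can still evaluate against the default Score and PredictedLabel columns, while the copies remain available under unique names for a later pipeline stage.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.ML;
using Microsoft.ML.Data;

// Hypothetical input row; real data would have its own schema.
public class Row
{
    public string Label { get; set; }

    [VectorType(2)]
    public float[] Features { get; set; }
}

public static class RenamedOutputsDemo
{
    // Runs 5-fold cross-validation and returns the column names found in the
    // first fold's scored hold-out set.
    public static List<string> Run()
    {
        var mlContext = new MLContext(seed: 0);

        // Synthetic three-class data, for illustration only.
        var rows = Enumerable.Range(0, 60).Select(i => new Row
        {
            Label = (i % 3).ToString(),
            Features = new float[] { i % 3, (i % 3) * 2f }
        });
        IDataView data = mlContext.Data.LoadFromEnumerable(rows);

        var pipeline = mlContext.Transforms.Conversion.MapValueToKey("Label")
            .Append(mlContext.MulticlassClassification.Trainers.SdcaMaximumEntropy())
            // Copy the trainer's default outputs to stage-specific names
            // (hypothetical names). The originals are kept, so CrossValidate's
            // built-in evaluation still finds Score and PredictedLabel.
            .Append(mlContext.Transforms.CopyColumns("TopicScore", "Score"))
            .Append(mlContext.Transforms.CopyColumns("TopicPredictedLabel", "PredictedLabel"));

        var folds = mlContext.MulticlassClassification.CrossValidate(
            data, pipeline, numberOfFolds: 5);

        return folds[0].ScoredHoldOutSet.Schema.Select(c => c.Name).ToList();
    }

    public static void Main()
    {
        foreach (var name in Run())
            Console.WriteLine(name);
    }
}
```

A second stage (for example, a binary sentiment trainer) appended afterwards would write its own Score and PredictedLabel columns, while the first stage's outputs stay reachable through the copied names.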
Side note: unless you want the metrics from the cross validation run, you're better off just fitting your model on the full dataset.
Side task for ML.NET: We may want to add a full training run within the cross-validation.
Currently, for 5-fold cross-validation, we run five folds of 80/20 train/validate; each returned model is trained on only 80% of the data. A user then has to pick one of these five models to use. We could train one extra model on 100% of the data and return it as the final model. This would let a user both get the metrics from CV and have a better-fit model.
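Until something like that exists in the API, the same effect can be approximated by hand: cross-validate for the metrics, then fit the same pipeline once more on the full dataset. The sketch below assumes a multiclass task; the Example class, the synthetic data, and the CrossValidationWithFinalFit.Run helper are hypothetical names introduced for illustration and are not part of ML.NET.

```csharp
using System;
using System.Linq;
using Microsoft.ML;
using Microsoft.ML.Data;

// Hypothetical input row used only to make the sketch self-contained.
public class Example
{
    public string Label { get; set; }

    [VectorType(2)]
    public float[] Features { get; set; }
}

public static class CrossValidationWithFinalFit
{
    // Hypothetical helper (not an ML.NET API): cross-validate to estimate
    // quality, then refit the same pipeline on all rows for the final model.
    public static (double MeanMicroAccuracy, ITransformer FinalModel) Run(
        MLContext mlContext,
        IDataView data,
        IEstimator<ITransformer> pipeline,
        int numberOfFolds = 5)
    {
        // Each fold's model sees only (k-1)/k of the data; keep just the metrics.
        var folds = mlContext.MulticlassClassification.CrossValidate(
            data, pipeline, numberOfFolds: numberOfFolds);
        double meanMicroAccuracy = folds.Average(f => f.Metrics.MicroAccuracy);

        // One extra training run on 100% of the data.
        ITransformer finalModel = pipeline.Fit(data);

        return (meanMicroAccuracy, finalModel);
    }

    public static void Main()
    {
        var mlContext = new MLContext(seed: 0);

        // Synthetic three-class data, for illustration only.
        var rows = Enumerable.Range(0, 60).Select(i => new Example
        {
            Label = (i % 3).ToString(),
            Features = new float[] { i % 3, (i % 3) * 2f }
        });
        IDataView data = mlContext.Data.LoadFromEnumerable(rows);

        var pipeline = mlContext.Transforms.Conversion.MapValueToKey("Label")
            .Append(mlContext.MulticlassClassification.Trainers.SdcaMaximumEntropy());

        var (meanMicroAccuracy, finalModel) = Run(mlContext, data, pipeline);
        Console.WriteLine($"Mean micro-accuracy over folds: {meanMicroAccuracy:F3}");
        Console.WriteLine($"Final model type: {finalModel.GetType().Name}");
    }
}
```

The reported metric still comes only from the held-out folds, so it remains an estimate for models trained on 80% of the data; the final model is simply not evaluated on data it has not seen.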
System information
.NET SDK (reflecting any global.json):
  Version: 5.0.100
  Commit: 5044b93829
Runtime Environment:
  OS Name: Windows
  OS Version: 10.0.18363
  OS Platform: Windows
  RID: win10-x64
  Base Path: C:\Program Files\dotnet\sdk\5.0.100\
Host (useful for support):
  Version: 5.0.0
  Commit: cf258a14b7
.NET SDKs installed:
  3.1.200 [C:\Program Files\dotnet\sdk]
  5.0.100 [C:\Program Files\dotnet\sdk]
.NET runtimes installed:
  Microsoft.AspNetCore.All 2.1.16 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.All]
  Microsoft.AspNetCore.App 2.1.16 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
  Microsoft.AspNetCore.App 3.1.2 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
  Microsoft.AspNetCore.App 5.0.0 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
  Microsoft.NETCore.App 2.1.16 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
  Microsoft.NETCore.App 3.1.2 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
  Microsoft.NETCore.App 5.0.0 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
  Microsoft.WindowsDesktop.App 3.1.2 [C:\Program Files\dotnet\shared\Microsoft.WindowsDesktop.App]
  Microsoft.WindowsDesktop.App 5.0.0 [C:\Program Files\dotnet\shared\Microsoft.WindowsDesktop.App]
ML.NET Package Version: v1.5.2
Request
Would it be possible to include the same parameterized column names from the Evaluate method (Multiclass/Binary Classifiers) in the CrossValidate method? I'm currently doing a lot of column manipulation to keep the outputs of each stage distinct in a sequenced ML pipeline (multiclass classification, then sentiment analysis).
Source code / logs