Closed: famschopman closed this issue 4 years ago.
I'm interested in a solution to this also. It seems like a good way to reduce the number of features if you can identify which features are important.
@daholste: Do you think this simply needs to be cast to the right type, one which exposes .LastTransformer as a property?
Possibly related comic: https://blog.toggl.com/build-horse-programming/
First and foremost, I love that comic, @justinormont
+1, the C# segment of the comic feels apropos. If you inspect the model in the debugger GUI, you should be able to navigate to the last transformer. By casting the C# objects to the types you see in the debugger, you can write lines of C# code that correspond to that navigation in the GUI.
Of course, this is terribly hacky. Off-hand, I'm not aware of an officially supported, less hacky way to do this. It could be a great area of focus for future development.
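For instance, something along these lines (a rough sketch; the concrete types are assumptions and should be replaced with whatever the debugger actually reports for your model):
// Mirror the debugger navigation with casts; every concrete type here is an assumption.
ITransformer model = experimentResult.BestRun.Model;
var chain = (TransformerChain<ITransformer>)model;   // the top-level type reported by the debugger
var lastTransformer = chain.LastTransformer;         // one level down, matching the navigation in the GUI
// Drilling further (e.g. into the model parameters) means one more cast per level, again using the types the debugger shows.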
The following cast lets me access the LastTransformer; however, I cannot use it for PFI until I provide a better type for predictor. While debugging, I can see it is of type Microsoft.ML.Data.RegressionPredictionTransformer<Microsoft.ML.IPredictorProducing
// Setup code similar to famschopman's
RegressionExperiment experiment = mlContext.Auto().CreateRegressionExperiment(experimentSettings);
var experimentResults = experiment.Execute(split.TrainSet, split.TestSet);
var predictor = ((TransformerChain<ITransformer>)experimentResults.BestRun.Model).LastTransformer;

// This will not compile (transformedData is the test set run through the trained model).
var permutationMetrics = mlContext.Regression.PermutationFeatureImportance(predictor, transformedData, permutationCount: 30);
The following compile error is produced.
The type arguments for method 'PermutationFeatureImportanceExtensions.PermutationFeatureImportance<TModel>(RegressionCatalog, ISingleFeaturePredictionTransformer<TModel>, IDataView, string, bool, int?, int)' cannot be inferred from the usage. Try specifying the type arguments explicitly.
See my analysis on https://github.com/dotnet/machinelearning/issues/3976 as well. These two issues feel like they are the same thing.
The only thing that was needed to make this build and run was to add the (TransformerChain<ITransformer>) cast to BestRun.Model (as recommended in https://github.com/dotnet/machinelearning/issues/3972#issuecomment-521288508), and then add another cast, to (ISingleFeaturePredictionTransformer<object>), for the linearPredictor. That would have been enough to let you run PFI:
// Get the best run from the AutoML experiment and cast its model to a TransformerChain.
RunDetail<BinaryClassificationMetrics> bestRun = experimentResult.BestRun;
TransformerChain<ITransformer> trainedModel = (TransformerChain<ITransformer>)bestRun.Model;
var predictions = trainedModel.Transform(testData);

// Cast the last transformer to the generic single-feature interface so PFI can accept it.
var linearPredictor = (ISingleFeaturePredictionTransformer<object>)trainedModel.LastTransformer;
var permutationMetrics = mlContext.BinaryClassification.PermutationFeatureImportance(
    linearPredictor, predictions, permutationCount: 30);
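Once that runs, the returned statistics can be used to rank features. A minimal sketch of consuming the result (this uses System.Linq; featureColumns is a hypothetical string[] holding the names of the columns concatenated into the Features vector, and AreaUnderRocCurve is one of the metric statistics reported for binary classification):
// Rank features by how much permuting them changed AUC; the biggest drop comes first (most important).
// 'featureColumns' is a hypothetical array of the column names that were concatenated into Features.
var ranked = permutationMetrics
    .Select((metrics, index) => (Feature: featureColumns[index], AucChange: metrics.AreaUnderRocCurve.Mean))
    .OrderBy(x => x.AucChange);

foreach (var (feature, aucChange) in ranked)
    Console.WriteLine($"{feature}: {aucChange:F4}");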
PS: There was a bug (#4517) when running PFI specifically with binary classification models, so even after getting this running, if AutoML had returned a non-calibrated binary model, running PFI would have thrown an exception. That bug was fixed in #4587, which was included in ML.NET 1.5.0-preview2 and 1.5.0.
Regarding "See my analysis on #3976 as well. These two issues feel like they are the same thing":
The problem described there got fixed in #4262 and #4292. Still, that problem wasn't really causing this one, since the solution I mentioned above would have worked even then. The problem you refer to is not being able to cast a model loaded from disk to its actual type (e.g. BinaryPredictionTransformer<ParameterMixingCalibratedModelParameters<IPredictorProducing<float>, ICalibrator>>). After that fix, users can now cast to the actual type, but they could always cast to (ISingleFeaturePredictionTransformer<object>), which is more appropriate when using AutoML.NET, since users won't know in advance the actual type of the model returned by the experiment. So the point is that it was always possible to use PFI with AutoML by using the (ISingleFeaturePredictionTransformer<object>) cast I described above.
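For the regression case earlier in this thread, the same casts would look roughly like this (a sketch, assuming the regression PermutationFeatureImportance overload accepts the (ISingleFeaturePredictionTransformer<object>) cast the same way the binary classification one does):
// Regression variant of the same approach (sketch only).
var trainedModel = (TransformerChain<ITransformer>)experimentResults.BestRun.Model;
// Run the test data through the full pipeline so PFI sees the transformed feature columns.
var transformedData = trainedModel.Transform(split.TestSet);
// Casting to the generic single-feature interface lets the compiler infer TModel.
var predictor = (ISingleFeaturePredictionTransformer<object>)trainedModel.LastTransformer;
var permutationMetrics = mlContext.Regression.PermutationFeatureImportance(
    predictor, transformedData, permutationCount: 30);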
I'm playing with AutoML and so far having much fun with it.
I have a trained model and am now trying to retrieve the feature weights. None of the objects returned expose the LastTransformer that I need in order to do this.
Code snippet:
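(The original snippet is not preserved here; the following is a hypothetical reconstruction of the kind of setup being described. ModelInput, the file path, and the experiment settings are placeholders, not the poster's actual code.)
// Hypothetical reconstruction; column types, file path, and settings are placeholders.
var mlContext = new MLContext();
var data = mlContext.Data.LoadFromTextFile<ModelInput>("data.csv", hasHeader: true, separatorChar: ',');
var split = mlContext.Data.TrainTestSplit(data, testFraction: 0.2);

var experimentSettings = new RegressionExperimentSettings { MaxExperimentTimeInSeconds = 300 };
var experiment = mlContext.Auto().CreateRegressionExperiment(experimentSettings);
var experimentResults = experiment.Execute(split.TrainSet, split.TestSet);

// BestRun.Model is typed as ITransformer, which is where the LastTransformer question comes from.
var trainedModel = experimentResults.BestRun.Model;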
Then I want to get the PFI information and I get stuck. There appears to be no way to get the LastTransformer object from the trainedModel.
Hope someone can help me with some guidance.