So we have the concepts of IEstimator and ITransformer; see https://github.com/dotnet/machinelearning/blob/master/docs/code/MlNetHighLevelConcepts.md. In the general case you create a pipeline by chaining IEstimators together, then you fit them, and that produces a TransformerChain. Since all your estimators are trivial (they don't require a pass through the data to build internal state), you can chain the transformers together without any fitting:
var model = new ImageLoaderTransformer(mlContext, imageFolder: imageFolderPath, columns: ("ImagePath", "ImageReal"))
.Append(new ImageResizerTransformer(mlContext, "ImageReal", "ImageReal", ImagePreprocessSettings.imageHeight, ImagePreprocessSettings.imageWidth))
.Append(new ImagePixelExtractorTransformer(mlContext, new[] { new ImagePixelExtractorTransformer.ColumnInfo("ImageReal", TensorFlowModelSettings.InputTensorName, interleave: ImagePreprocessSettings.channelsLast, offset: ImagePreprocessSettings.mean) }))
.Append(new TensorFlowTransformer(mlContext, modelFilePath, new[] { TensorFlowModelSettings.InputTensorName }, new[] { TensorFlowModelSettings.OuputTensorName }));
But you can only do that trick if all of your pipeline's estimators are independent of the data. (One of the easiest ways to tell is to check which class an estimator returns from Fit: if that class has a public constructor, it doesn't require training.)
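To make that heuristic concrete, here is a minimal sketch of the contrast (trainingData and the "Label" column are hypothetical, and the one-hot encoder just stands in for any estimator that learns from data):
// Data-dependent case: one-hot encoding has to scan the data to build its
// dictionary of category values, so the fitted transformer has no public
// constructor and the only way to obtain it is to call Fit on a real IDataView.
var oneHotEstimator = mlContext.Transforms.Categorical.OneHotEncoding("Label");
ITransformer oneHot = oneHotEstimator.Fit(trainingData);

// Data-independent case: ImageResizerTransformer and the other transformers in
// the chain above expose public constructors in 0.10, so Fit (and the data) can
// be skipped entirely, exactly as in the snippet above.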
Sorry for the delay in responding; I hope this answers your question.
Yep! I couldn't get my code to format correctly to post the final solution here, but it's essentially what you have above with the input/output column locations switched. I also had to update to 0.10.0.
I also had to call model.CreatePredictionEngine<TrainTestData, PredictionProbability>(mlContext); to get the equivalent predict function.
Thanks!
One thing worth mentioning: we are currently in the process of hiding as much of the surface area as possible, and in 0.11 you won't be able to do this trick, since the transformer constructors will stop being public. We're aware that it's weird to call estimators on top of an empty file, and we will address it; there is a reference to a proposal for how to do that above. We're just in a state where, if we expose too much, it can be a nightmare to support later, so we're trying to hide pretty much everything from the user.
I'm not fully following. So what will it become? I find this to be fairly intuitive.
var model = new ImageLoaderTransformer(mlContext, imageFolder: imageFolderPath, columns: ("ImageReal", "ImagePath"))
.Append(new ImageResizerTransformer(mlContext, "ImageReal", ImagePreprocessSettings.imageWidth, ImagePreprocessSettings.imageHeight, "ImageReal"))
.Append(new ImagePixelExtractorTransformer(mlContext, new[] { new ImagePixelExtractorTransformer.ColumnInfo(TensorFlowModelSettings.InputTensorName, "ImageReal", interleave: ImagePreprocessSettings.channelsLast, offset: ImagePreprocessSettings.mean) }))
.Append(new TensorFlowTransformer(mlContext, modelFilePath, new[] { TensorFlowModelSettings.OuputTensorName }, new[] { TensorFlowModelSettings.InputTensorName }));
return model.CreatePredictionEngine<TrainTestData, PredictionProbability>(mlContext);
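For completeness, the engine returned above is then used in the usual way. A rough sketch follows, where engine is the value returned by the method above, and the ImagePath property and the shape of PredictionProbability are guesses based on the column names in this thread:
// Predict runs the whole chain (load -> resize -> extract pixels -> TensorFlow)
// on a single example. ImagePath is an assumed property of TrainTestData.
var prediction = engine.Predict(new TrainTestData { ImagePath = "path/to/some-image.jpg" });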
Re-opening to continue the discussion :)
https://github.com/dotnet/machinelearning/issues/1798#issuecomment-458845661 So we have this issue/discussion where we decided to hide all ctors for transformers and estimators.
From what I understand (I wasn't part of the discussion), we have the mlContext object, and instead of the user digging through namespaces, documentation, and a search engine, we want them to just type mlContext.Transforms and see all the available estimators.
Somehow we also decided that any other way to construct estimators should be hidden, and the same goes for trivial (no training required) transformers. I wasn't part of that discussion, so I don't know the reasoning behind it, but I'm sure it exists.
We have this issue: https://github.com/dotnet/machinelearning/issues/2354, but I'm not sure it will save you from the fake Fit call. At least I don't see that in the issue.
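For reference, once the constructors are hidden, the fake Fit workaround looks roughly like this with the catalog-style API. This is a sketch against the 1.x surface, reusing the variable, column, and tensor names from the snippets above; the pixel-extraction options (interleave, offset) are left at their defaults here.
var pipeline = mlContext.Transforms.LoadImages("ImageReal", imageFolderPath, "ImagePath")
    .Append(mlContext.Transforms.ResizeImages("ImageReal", ImagePreprocessSettings.imageWidth, ImagePreprocessSettings.imageHeight, "ImageReal"))
    .Append(mlContext.Transforms.ExtractPixels(TensorFlowModelSettings.InputTensorName, "ImageReal"))
    .Append(mlContext.Model.LoadTensorFlowModel(modelFilePath)
        .ScoreTensorFlowModel(TensorFlowModelSettings.OuputTensorName, TensorFlowModelSettings.InputTensorName));

// None of these transforms learn anything from the data, so an empty IDataView
// is enough to satisfy the Fit signature -- this is the "fake Fit" call.
var emptyData = mlContext.Data.LoadFromEnumerable(new List<TrainTestData>());
var model = pipeline.Fit(emptyData);
var engine = mlContext.Model.CreatePredictionEngine<TrainTestData, PredictionProbability>(model);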
Any updates on this?
@Ivanidzo4ka, have you heard/seen anything?
What is it supposed to be like in 1.1.0?
@JakeRadMSFT and @nfnpmc I am not sure I follow your latest questions, but it appears the previous questions have been answered. I am closing the issue for now. Please reopen with details if you need more info.
System information
Issue
What did you do? See code below. I'm loading a pre-trained TensorFlow model and I was working from the existing examples.
What happened? I didn't understand why the example was passing training/test data to get the prediction function (see code and comment below).
What did you expect? It seems like I should just be able to create a pipeline with the pre-processing steps and ScoreTensorFlowModel and then just get the predict function. To test this theory I tried using MultiFileSource(null) and everything works fine. If it's not needed ... can you recommend different code? If it is needed ... it seems kind of odd.
Source code / logs