dotnet / machinelearning

ML.NET is an open source and cross-platform machine learning framework for .NET.
https://dot.net/ml
MIT License
9.05k stars 1.89k forks source link

Name of MakePredictionFunction is confusing (to me). #1761

Closed montebhoover closed 5 years ago

montebhoover commented 6 years ago

I was following the start example in the Readme, and when I came to the line with MakePredictionFunction, I got a bit confused and had to go over it a few times before I could continue:

Now from the model we can make inferences (predictions):

var predictionFunc = model.MakePredictionFunction<SentimentData, SentimentPrediction>(mlContext);
var prediction = predictionFunc.Predict(new SentimentData
{
    SentimentText = "Today is a great day!"
});
Console.WriteLine("prediction: " + prediction.Prediction);

I expected the return value of MakePredictionFunction() to a function-type thing, but instead it was an object. This was hard for me not because of pedantic correctness, but rather from a usability standpoint as a beginner trying to keep my concepts straight (estimators, transformers, models, etc.) this took me a step back and I had to go over the section several times to make sure I hadn't missed something.

Is there a better name that would be less confusing?

CESARDELATORRE commented 6 years ago

I agree, I provided that feedback some time ago. It is confusing for a C# or object oriented developer. MakePredictionFunction() is returning an object, not a function... ;)

That's why in some samples we were naming the variable/object as predictionEngine instead of predictionFunction.

I think it is called that way because in generic machine learning might be the term, but this is good feedback to have. 👍

@TomFinley - Thoughts?

TomFinley commented 6 years ago

I agree that "function" is an overloaded term that will confuse .NET developers.

I might actually prefer to "solve" this problem by deleting PredictionFunction entirely, and instead just using engine everywhere. (With appropriate renamings and whatnot to the initialization methods and extension methods for creation from ITransformer.) To coin a phase, where there's code, there's a problem. No code, no problem.

@Zruty0 appeared to introduce this class PredictionFunction in #882 to, I believe, have an alternate to prediction engine that worked over ITransformer as opposed to the old mechanism that worked over the now deprecated IDataTransform. Though, why a different type was necessary for this is not clear.

IIRC (though it's tough to be sure after 2.5 months), I think he and I both believed this new thing (PredictionFunction would be closely-related-to-but-distinct-from the PredictionEngine, and perhaps by the time @Zruty0 realized that in fact no new code was necessary to write at all, we were already so used to the idea that there would be a new class that we were too silly to reconsider, even when it was just a straight passthrough? 😄 I'm not sure.

In #973 I changed it so that both used IRowToRowMapper, so there is no longer even any functional difference even during initialization.

markusweimer commented 6 years ago

Two notes:

TomFinley commented 6 years ago

Hi @markusweimer ,

  • Should PredictionFunction implement Func?

We would want PredictionEngine, not PredictionFunction, but if you're asking if the engine can simply be a Func, not really, since there are actually two overloads to predict, one which returns the output type, and one which takes a reference to the output type and fills in the buffer -- the latter is of course how we do buffer sharing.

There is one other problem which we haven't quite tackled yet, but we should probably do pretty soon, and that is that IRow and the set of things that depend upon IRow must themselves implement IDisposable (they don't yet, but they really have to!). This is a subitem in #1532 -- one of the implications of the notes there is that this thing itself must implement IDisposable, since the intermediate objects can contain objects that should be handled this way. (E.g., tensorflow objects and whatnot, other native objects.)

  • Should the method be called CreatePredictionFunction to fit in with the prior art of using Create as the prefix for methods that create new objects?

Yes, Create, that's fine. But of course not prediction function.

Zruty0 commented 5 years ago

Actually, as heretical as it may sound, maybe we could now call it a 'predictor'?

TomFinley commented 5 years ago

Actually, as heretical as it may sound, maybe we could now call it a 'predictor'?

Maybe. IPredictor is a misnomer as has been adequately covered in many, many places, so once the name is freed up to the agreed name, IIRC IModelParameters or whatever it was, it could just be a predictor.