dotnet / machinelearning

ML.NET is an open source and cross-platform machine learning framework for .NET.
https://dot.net/ml
MIT License

ML.NET to use appsettings.json for seed and concurrency level #1202

Open Zruty0 opened 5 years ago

Zruty0 commented 5 years ago

When LocalEnvironment is created using the default constructor, it should look in appsettings.json for the random seed and concurrency level.

(Original issue description below)

In case an ML.NET model is already trained and persisted, we can still control to some degree the behavior of the model at prediction time. For example, we could:

etc. We could consider having an ML.NET-dedicated section in app.config for the application to control these, or we could use system environment variables, or we could have an 'ML.NET config' text file that the MLContext is initialized from.

This came up in the process of the API discussion, so creating the issue for future consideration.

justinormont commented 5 years ago

Would this be something we store within the model or next to it? Generally, I prefer having models be as self-contained as possible in a zip file (plus the runner application). Another route could be auto-creating a folder for the user to ensure they know which parts are needed for the model.

Two additional config file niceties: resource file paths & download locations (like a large word embedding file)

Zruty0 commented 5 years ago

I think the 'config' I have in mind is more like a running-environment config that can change from machine to machine, but rarely from model to model. So no, this is separate from and orthogonal to the model.

CESARDELATORRE commented 5 years ago

Why do you need a "native" config file for the ML.NET app environment? I think a good approach would be to use the regular .NET config file, appsettings.json, and then pass the values it provides into the API.

For instance, I made the GitHub Issues labeler here to use the appsettings.json where you can provide the info to get access to GitHub: https://github.com/dotnet/machinelearning-samples/blob/master/samples/csharp/end-to-end-apps/github-labeler/GitHubLabeler/Program.cs

In a similar fashion, you could specify environment settings in appsettings.json to be used by the ML.NET API.

But re-inventing another config file type just for ML.NET might be confusing. .NET Apps should use the regular .NET config files that can also be overridden by OS ENVIRONMENT VARIABLES, which is very common when running Docker Containers, etc.

Regarding the API, I believe Windows ML has something comparable for CPU/GPU selection, where you can say which device to use (CPU, GPU, etc.): https://docs.microsoft.com/en-us/windows/ai/integrate-model#choose-a-device

It does not use a config file, though; the device is passed directly as a parameter in the API: https://docs.microsoft.com/en-us/uwp/api/windows.ai.machinelearning.learningmodeldevicekind

    LearningModel _model;
    LearningModelSession _session;

    try
    {
        // Load and create the model
        var modelFile = 
            await StorageFile.GetFileFromApplicationUriAsync(new Uri($"ms-appx:///Assets/{_modelFileName}"));
        _model = await LearningModel.LoadFromStorageFileAsync(modelFile);

        // Select the device to evaluate on
        LearningModelDevice device = null;
        if (_useGPU)
        {
            // Use a GPU or other DirectX device to evaluate the model.
            device = new LearningModelDevice(LearningModelDeviceKind.DirectX);
        }
        else
        {
            // Use the CPU to evaluate the model.
            device = new LearningModelDevice(LearningModelDeviceKind.Cpu);
        }

        // Create the evaluation session with the model and device.
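        _session = new LearningModelSession(_model, device);
    }
    catch (Exception)
    {
        // Loading or session creation failed: drop the model and rethrow to the caller.
        _model = null;
        throw;
    }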

It might be interesting to have a similar approach, so that .NET developers who know WinML would find it familiar in ML.NET, or vice versa.

Zruty0 commented 5 years ago

Does ASP.NET also subscribe to the same appsettings.json mechanism? Or does it have a 'sidekick config file'?

CESARDELATORRE commented 5 years ago

The same kind of appsettings.json file can be used for regular .NET Core console apps and for ASP.NET Core applications. It is "the same" appsettings.json, but the values in it will naturally differ because web apps need different settings.

But appsettings.json is only used by .NET Core applications, not by traditional .NET Framework and ASP.NET apps, which use the Web.config or App.config XML files.

The cool thing about using appsettings.json with the configuration class, as in Configuration["MyValue"], is that the same values can be overridden transparently by environment variables, either in a process running directly on Windows or Linux or in Docker containers, which is pretty common when you have multiple environments (dev - integration - production).

The point is that the config-file approach depends on the type of application the developer is creating. I think our API should take its values from whatever configuration source is standard for that .NET application type (.NET Core vs. .NET Framework).
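As a rough illustration of the override behaviour described above, here is a minimal sketch for a .NET Core console app. It assumes the Microsoft.Extensions.Configuration, .Json, .FileExtensions and .EnvironmentVariables NuGet packages are referenced. The key names `MLNET:Seed` and `MLNET:Concurrency` and the class `MlSettingsDemo` are made up for this example; ML.NET does not define them.

```csharp
using System;
using System.IO;
using Microsoft.Extensions.Configuration;

public static class MlSettingsDemo
{
    // Hypothetical appsettings.json shape assumed by this sketch:
    //   { "MLNET": { "Seed": 42, "Concurrency": 2 } }
    // An environment variable named MLNET__Seed (double underscore)
    // maps to the same key, MLNET:Seed.

    // Builds a configuration in which environment variables override appsettings.json.
    public static IConfiguration Load(string basePath)
    {
        return new ConfigurationBuilder()
            .SetBasePath(basePath)
            .AddJsonFile("appsettings.json", optional: true)
            .AddEnvironmentVariables() // registered last, so it wins on conflicts
            .Build();
    }

    // Returns the integer stored under the key, or null when it is missing or malformed.
    public static int? ReadInt(IConfiguration config, string key)
    {
        return int.TryParse(config[key], out int value) ? value : (int?)null;
    }

    public static void Main()
    {
        IConfiguration config = Load(Directory.GetCurrentDirectory());
        int? seed = ReadInt(config, "MLNET:Seed");
        int? concurrency = ReadInt(config, "MLNET:Concurrency");

        string seedText = seed.HasValue ? seed.Value.ToString() : "(unset)";
        string concurrencyText = concurrency.HasValue ? concurrency.Value.ToString() : "(unset)";
        Console.WriteLine("seed=" + seedText + ", concurrency=" + concurrencyText);
    }
}
```

Running the app with `MLNET__Seed=7` set in the environment should report a seed of 7 even when the JSON file says 42, because the environment-variable provider is added after the JSON provider.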

Zruty0 commented 5 years ago

OK, that makes sense. I like appsettings.json better anyway. For a short-term work item, we could make our LocalEnvironment check appsettings.json for random seed and concurrency level, if it's not set at construction time.
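To make the "if it's not set at construction time" rule concrete, here is a small sketch of the precedence: an explicit constructor argument wins, then the configured value, then a default. It is not the LocalEnvironment implementation. The types `EnvironmentSettings` and `EnvironmentSettingsResolver`, the key names, and the default concurrency of 0 are assumptions for illustration, and a plain dictionary stands in for whatever configuration source is used.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical holder for the two values discussed in this issue.
public sealed class EnvironmentSettings
{
    public int? Seed { get; }
    public int Concurrency { get; }

    public EnvironmentSettings(int? seed, int concurrency)
    {
        Seed = seed;
        Concurrency = concurrency;
    }
}

public static class EnvironmentSettingsResolver
{
    // Assumed key names; nothing in ML.NET defines these.
    public const string SeedKey = "MLNET:Seed";
    public const string ConcurrencyKey = "MLNET:Concurrency";

    // Explicit arguments take priority; config values fill in whatever was not set.
    public static EnvironmentSettings Resolve(
        int? explicitSeed,
        int? explicitConcurrency,
        IReadOnlyDictionary<string, string> config)
    {
        int? seed = explicitSeed ?? ReadInt(config, SeedKey);
        // 0 is this sketch's default when nothing is specified anywhere.
        int concurrency = explicitConcurrency ?? ReadInt(config, ConcurrencyKey) ?? 0;
        return new EnvironmentSettings(seed, concurrency);
    }

    // Returns null when the key is absent or its value is not an integer.
    private static int? ReadInt(IReadOnlyDictionary<string, string> config, string key)
    {
        return config.TryGetValue(key, out string text) && int.TryParse(text, out int value)
            ? value
            : (int?)null;
    }
}

public static class Program
{
    public static void Main()
    {
        var config = new Dictionary<string, string>
        {
            [EnvironmentSettingsResolver.SeedKey] = "42",
            [EnvironmentSettingsResolver.ConcurrencyKey] = "4",
        };

        // Nothing set explicitly: both values come from config.
        EnvironmentSettings fromConfig = EnvironmentSettingsResolver.Resolve(null, null, config);
        Console.WriteLine(fromConfig.Seed + " " + fromConfig.Concurrency);

        // Explicit seed wins; concurrency still falls back to config.
        EnvironmentSettings mixed = EnvironmentSettingsResolver.Resolve(7, null, config);
        Console.WriteLine(mixed.Seed + " " + mixed.Concurrency);
    }
}
```

The resolved seed could then be handed to the environment at construction time, for example `new MLContext(seed: settings.Seed)`.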

SARAVANA1501 commented 3 years ago

I would like to contribute this. I will do a little research and post if I have any questions.

frank-dong-ms-zz commented 3 years ago

@SARAVANA1501 Sure, thanks for contributing to ML.NET!

SARAVANA1501 commented 3 years ago

A few things I would like to understand