rca22 / LightGBM.Net

.Net wrapper around the LightGBM library
MIT License

To run it in fully Managed C#: Explanation on how to use the code. This information should be added to the repository #7

Open 78Spinoza opened 2 months ago

78Spinoza commented 2 months ago

Hello there. First, many thanks for the source code.
You can train your model in Python, R, or anything else and save it in the native format. An explanation of how to then run it in fully managed C# should be added to the repository. Below is what I did to make it work, along with some small issues that needed to be resolved to get the best performance.

Using LightGBM with C# in Fully Managed Code

1. Install the LightGBMNet.Train Package

2. Load a Native LightGBM Model

3. Validate the Model

4. Transform the Model to Fully Managed Code

5. Validate the Managed Model

6. Save and Load the Managed Model

7. Use the Managed Model in Production

By following these steps, you can effectively use LightGBM with C# in a fully managed environment, ensuring compatibility and performance across different platforms.

mjmckp commented 2 months ago

Thanks @78Spinoza, however these instructions are not how the library is intended to be used. The unit tests show how the native GBDTs can be trained, converted to managed objects, and evaluated. I do not recommend using the output of Booster.GetModel, as the ensemble object does not have all the necessary transformations on the output required for binary/multiclass model evaluation.
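To illustrate the point about output transformations: a tree ensemble on its own returns a raw score (the sum of leaf values), and a further step is needed to turn that into class probabilities. The sketch below is not code from this library. It assumes LightGBM's standard binary objective (logistic sigmoid, with the `sigmoid` scale parameter at its default of 1.0) and the softmax multiclass objective; the function names are hypothetical and chosen only for illustration.

```python
import math


def binary_probability(raw_score, sigmoid=1.0):
    """Map a raw binary-objective score to P(class 1) via a logistic sigmoid.

    `sigmoid` mirrors LightGBM's scale parameter of the same name
    (assumed here to be the default, 1.0).
    """
    z = sigmoid * raw_score
    # Branch on the sign so math.exp never receives a large positive argument.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def multiclass_probabilities(raw_scores):
    """Map one raw score per class to probabilities via a stable softmax."""
    m = max(raw_scores)  # subtract the max to avoid overflow in exp
    exps = [math.exp(s - m) for s in raw_scores]
    total = sum(exps)
    return [e / total for e in exps]


# A raw score of 0 sits on the decision boundary for the binary objective.
print(binary_probability(0.0))
# Three per-class raw scores become three probabilities that sum to 1.
print(multiclass_probabilities([0.2, -1.3, 2.0]))
```

For a regression model the raw score is already the prediction, which is why skipping this step goes unnoticed there but gives wrong outputs for binary and multiclass models.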

78Spinoza commented 2 months ago

I checked the unit tests. I would like to train the model in Python, since there are many visualization and hyperparameter tuning tools available there. How can I use a trained model without Booster.GetModel? I know that it only works for regression, and that internally LightGBM treats classification and the other objectives as regression, but I understand what you are saying. Can you please provide a simpler example, like the one I gave above?

mjmckp commented 2 months ago

@78Spinoza I'll have a look at how best to do this and let you know.

mjmckp commented 1 month ago

@78Spinoza The correct way to load externally trained models from file is shown in the unit test LoadExternalModels. This works for all model types (regression, binary, multiclass, and ranking) and will ensure the managed model produces exactly the same output as the native model.

Also, to save the managed model to file, use PredictorPersist.Save, and to load the model from file use PredictorPersist.Load. Examples of this are in TrainerTest.cs.

78Spinoza commented 1 month ago

But I want to load the model for inference, not for training. Do you mean like this? var regression = RegressionTrainer.PredictorsFromFile(Path.Combine(path, "models", "regression_model.txt"));

mjmckp commented 1 month ago

@78Spinoza In the example above, "regression_model.txt" is a saved trained model (not training data), as generated by the native lightgbm library, and may be used for inference. The training may have been done elsewhere, e.g., in a Python script.

78Spinoza commented 1 month ago

Yes, many thanks, but the method RegressionTrainer.PredictorsFromFile is in the LightGBMNet.Train namespace, so we need to load the DLL wrapper, no? That means we need unmanaged code for inference.

mjmckp commented 1 month ago

Ok, I think I understand what you are trying to do: load an externally trained model directly into managed code without requiring any references to the native library. I've refactored the code to allow this (see the latest commits) and uploaded a new NuGet package (1.0.22) that includes the change. See the unit test LoadExternalModelsManagedOnly for an example.