dotnet / machinelearning

ML.NET is an open source and cross-platform machine learning framework for .NET.
https://dot.net/ml
MIT License

How to make predictions on custom data with a loaded model? Problem with Model.CreatePredictionEngine<TSrc, TDst>: is it possible to avoid strongly typed input/output? #6280

Closed rzechu closed 2 years ago

rzechu commented 2 years ago

System information

Issue

Model.CreatePredictionEngine<TSrc, TDst> requires strongly typed input and output classes. Is there any workaround? I'd like to use prediction models on custom forms with custom fields. I can load the model, and I know the fields used for prediction and the label, but it's not possible without a developer writing custom methods for each form type and recompiling the application.

dakersnar commented 2 years ago

Currently it's not possible to do a dynamic prediction engine. This is something that is high on our priority list.

There is a workaround if you are ok with not using the prediction engine. You would basically need to manually do what the prediction engine does under the hood. Are you interested in this approach? I can elaborate as needed.

ghost commented 2 years ago

This issue has been marked needs-author-action and may be missing some important information.

LittleLittleCloud commented 2 years ago

What I usually do as an alternative to using PredictionEngine with the loaded model is to call ITransformer.Transform directly, which returns an IDataView containing the model's predictions.

For example, if your model was trained with a binary classification trainer and the predicted label column is named PredictedLabel, you can do this:

ITransformer model;
IDataView input;
var result = model.Transform(input);

// Read the predicted label column as boolean values.
IEnumerable<bool> predictedColumn = result.GetColumn<bool>("PredictedLabel");

rzechu commented 2 years ago

> Currently it's not possible to do a dynamic prediction engine. This is something that is high on our priority list.
>
> There is a workaround if you are ok with not using the prediction engine. You would basically need to manually do what the prediction engine does under the hood. Are you interested in this approach? I can elaborate as needed.

Yes please. I am interested in any solution that allows us to dynamically use previously generated models.

dakersnar commented 2 years ago

Does the method proposed by @LittleLittleCloud fulfill your needs?

rzechu commented 2 years ago

Hmm, I have tried it and it looks like it works, but I still have to think about saving/regenerating the schema for the input.
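One possible starting point for the schema part: as far as I can tell, mlContext.Model.Load also hands back the input schema that was saved together with the model, so the expected column names and types can be read at run time instead of being hard-coded. A minimal sketch, assuming the Microsoft.ML NuGet package is referenced and using a hypothetical model file named model.zip:

```csharp
using System;
using Microsoft.ML;

var mlContext = new MLContext();

// "model.zip" is a hypothetical path; replace it with the saved model file.
ITransformer model = mlContext.Model.Load("model.zip", out DataViewSchema inputSchema);

// List the columns the model expects as input.
// Note: inputSchema may be null if the model was saved without a schema.
foreach (DataViewSchema.Column column in inputSchema)
{
    Console.WriteLine($"{column.Name}: {column.Type}");
}
```

The printed names and types could then drive whatever builds the input (for example a generated SQL query), rather than a compiled input class.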

I still have to deliver data in the same schema as the one saved inside the model. I am using something like a "fake" SQL statement (no tables involved, the query is just built with a StringBuilder) to get the data/columns dynamically:

if (ColumnsToIgnore == null)
   ColumnsToIgnore = new List<string>();

// This query can easily be generated with a StringBuilder to fit other scenarios;
// you only need to know the model's schema columns and data types.
string sqlQuery = @"SELECT 
'DataToPredict' as Column1,
0 as Label";

// Read the result-set schema and map it to DatabaseLoader columns (helpers below).
var schema = Tools.GetDataBaseSchemaColumns(sqlQuery, connectionString);
// Assumption: 'columns' was not defined in the original snippet; it presumably comes from the mapping helper.
var columns = Tools.MapDataTableToDataBaseLoaderColumns(schema);

Microsoft.ML.Data.DatabaseLoader loader = mlContext.Data.CreateDatabaseLoader(columns.ToArray());
Microsoft.ML.Data.DatabaseSource dbSource = new Microsoft.ML.Data.DatabaseSource(System.Data.SqlClient.SqlClientFactory.Instance, connectionString, sqlQuery);
var input = loader.Load(dbSource);
var result = loadedModel.Model.Transform(input);
var predictedColumn = result.GetColumn<int>("PredictedLabel");
foreach (var item in predictedColumn)
{   
   Console.WriteLine(item); //looks like this works
}

Console.WriteLine(predictedColumn[0]); // Error CS0021: Cannot apply indexing with [] to an expression of type 'IEnumerable<int>'
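
The CS0021 error is expected, because IEnumerable<T> has no indexer. A minimal sketch of two common ways around it, using a hypothetical stand-in sequence in place of the real PredictedLabel column: take an element with LINQ, or materialize the sequence once with ToList() and index the list.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for result.GetColumn<int>("PredictedLabel").
IEnumerable<int> predictedColumn = new[] { 1, 0, 1 };

// Option 1: take a single element without indexing.
int first = predictedColumn.First();

// Option 2: materialize once, then index freely.
List<int> predictions = predictedColumn.ToList();
int second = predictions[1];

Console.WriteLine(first);   // prints 1
Console.WriteLine(second);  // prints 0
```

With the real column, ToList() is probably the better fit when several values are needed, since it enumerates the data once instead of once per lookup.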

        public static DataTable GetDataBaseSchemaColumns(string sqlQuery, string ConnectionString)
        {
            using (SqlConnection connection = new SqlConnection(ConnectionString))
            using (SqlCommand command = connection.CreateCommand())
            {
                command.CommandText = sqlQuery;
                connection.Open();

                // SchemaOnly returns column metadata without fetching any rows.
                using (SqlDataReader reader = command.ExecuteReader(CommandBehavior.SchemaOnly))
                {
                    return reader.GetSchemaTable();
                }
            }
        }

        public static IList<DatabaseLoader.Column> MapDataTableToDataBaseLoaderColumns(DataTable DataSchema)
        {
            // Map every schema row to a loader column and skip binary columns.
            var mappedColumns = DataSchema.Rows.Cast<DataRow>()
                    .Select(row => MapDataTypeToSqlDataType(row))
                    .Where(w => w.Type != DbType.Binary)
                    .ToList();
            return mappedColumns;
        }

        private static Microsoft.ML.Data.DatabaseLoader.Column MapDataTypeToSqlDataType(DataRow DataRow)
        {
            Microsoft.ML.Data.DatabaseLoader.Column mlDataLoaderColumn = new Microsoft.ML.Data.DatabaseLoader.Column();
            mlDataLoaderColumn.Name = DataRow.Field<string>("ColumnName");
            //mlDataLoaderColumn.Source = DataRow.Field<string>("ColumnName");

            // Map the .NET type reported by the schema table to a DbType.
            var dataType = DataRow.Field<Type>("DataType");
            switch (dataType.Name)
            {
                case "String":
                    { mlDataLoaderColumn.Type = DbType.String; break; }
                case "Int32":
                    { mlDataLoaderColumn.Type = DbType.Int32; break; }
                case "Decimal":
                case "Double":
                    // Both are loaded as single-precision floats.
                    { mlDataLoaderColumn.Type = DbType.Single; break; }
                case "DateTime":
                    { mlDataLoaderColumn.Type = DbType.DateTime; break; }
                case "Byte[]":
                    { mlDataLoaderColumn.Type = DbType.Binary; break; }
                default:
                    throw new Exception($"{dataType} can't be mapped to DbType");
            }

            return mlDataLoaderColumn;
        }

LittleLittleCloud commented 2 years ago

> I still have to deliver data in same schema as saved inside model

@rzechu I don't quite understand. Do you want your model to be able to predict on dynamic data instead of data with a fixed schema?

LittleLittleCloud commented 2 years ago

@rzechu, I'm going to close this since there has been no response in 7 days. Feel free to reopen it if you have any questions.