dotnet / machinelearning-modelbuilder

Simple UI tool to build custom machine learning models.
Creative Commons Attribution 4.0 International

Data classification: The generated Console App and Web API projects fail to run. #2418

Closed v-Hailishi closed 1 year ago

v-Hailishi commented 1 year ago

System Information (please complete the following information):

Windows OS: Windows-11-Enterprise-22H2

ML.NET Model Builder 2022: 16.14.0.2261401 (Main Build)

Microsoft Visual Studio Enterprise: 2022 (17.4.1)

.NET: 6.0

Describe the bug

Data Source: https://testpass.blob.core.windows.net/test-pass-data/wikipedia-detox-250-line-data.tsv, loaded either as a file or from SQL Server ("SQL Password").

To Reproduce

Steps to reproduce the behavior:

  1. Select Create a new project from the Visual Studio 2022 start window.
  2. Choose the C# Console App (.NET Core 6.0) project template.
  3. Add Model Builder by right-clicking the project.
  4. Select the Data classification scenario.
  5. On the Data page, select the data source and choose the Sentiment column as the label.
  6. On the Train page, click the "Start training" button to complete the training.
  7. Click the "Add to solution" button on the Consume page to generate the Console App project.
  8. Run the generated Console App project.

Expected behavior

The generated Console App project should run successfully.

Screenshots

File: image

SQL Password: image

Additional context:

v-Hailishi commented 1 year ago

The bug also reproduces in the Text classification scenario: the generated Console App project fails to run, while the generated Web API project runs successfully. image

zewditu commented 1 year ago

Regressed by this PR https://github.com/dotnet/machinelearning-tools/pull/1624.

v-Hailishi commented 1 year ago

The bug still reproduces on the latest main build: 16.14.0.2261601.

File: image image

SQL Password: image image

beccamc commented 1 year ago

Thanks for finding this @v-Hailishi!

Dev team - possible root cause is that this is binary classification?

v-Hailishi commented 1 year ago

Verified on the latest main build 16.14.0.2262002: the bug reproduces in the Data classification, Value prediction, Recommendation, Forecasting, and Text classification scenarios.

zewditu commented 1 year ago

For this image, binary is not the issue. It happens for text classification types such as data classification (binary and multiclass) and text classification, but it is not consistent.

zewditu commented 1 year ago

It is not reproducible from the latest main. image

image

zewditu commented 1 year ago

The code-behind generation missed installing System.Data.SqlClient. image

zewditu commented 1 year ago

@v-Hailishi FYI: to validate the code in the generated project, you should use a newly created mbconfig file. If you use an old mbconfig file, you might get some errors.

JakeRadMSFT commented 1 year ago

We're going to change our code to convert to strings for now in AutoML pipelines and CodeGen. We'll also look into supporting other types in ML.NET.

@michaelgsharp has something that we've briefly tested and it works.

JakeRadMSFT commented 1 year ago

For AutoML we can try the following change:

https://github.com/dotnet/machinelearning/blob/a758217121c85cf5af9a2ea3f759feae2020d7b5/src/Microsoft.ML.AutoML/SweepableEstimator/Estimators/MapValueToKey.cs#L11

Change to something like this: .Append(mlContext.Transforms.Conversion.MapValueToKey(param.OutputColumnName, param.InputColumnName, addKeyValueAnnotationsAsText: true))
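To make the suggested flag concrete, here is a minimal, self-contained sketch of `MapValueToKey` called with `addKeyValueAnnotationsAsText: true` on a boolean label column. It is an illustration only, not the actual AutoML or CodeGen change: the `Row` class, the `LabelKeyDemo` class, the `LabelKey` column name, and the two sample rows are hypothetical, and the sketch assumes the `Microsoft.ML` NuGet package is referenced.

```csharp
using System;
using System.Linq;
using Microsoft.ML;
using Microsoft.ML.Data;

// Hypothetical input row with a boolean label (not taken from the issue's dataset).
public class Row
{
    public bool Label { get; set; }
    public string Text { get; set; }
}

public static class LabelKeyDemo
{
    // Maps the boolean label to a key column and returns the key values,
    // which are stored as text because of addKeyValueAnnotationsAsText.
    public static string[] GetKeyValueStrings()
    {
        var mlContext = new MLContext(seed: 0);

        var rows = new[]
        {
            new Row { Label = true, Text = "great place" },
            new Row { Label = false, Text = "would not return" },
        };
        IDataView data = mlContext.Data.LoadFromEnumerable(rows);

        // Same call shape as the change suggested above; column names are assumptions.
        var estimator = mlContext.Transforms.Conversion.MapValueToKey(
            outputColumnName: "LabelKey",
            inputColumnName: nameof(Row.Label),
            addKeyValueAnnotationsAsText: true);

        IDataView transformed = estimator.Fit(data).Transform(data);

        // Read the key-value annotation as text rather than as bool.
        VBuffer<ReadOnlyMemory<char>> keyValues = default;
        transformed.Schema["LabelKey"].GetKeyValues(ref keyValues);

        return keyValues.DenseValues().Select(v => v.ToString()).ToArray();
    }

    public static void Main()
    {
        foreach (string key in GetKeyValueStrings())
        {
            Console.WriteLine(key);
        }
    }
}
```

With the flag set, the key values attached to the output column are text regardless of the input column's type, so downstream code that expects string labels does not need a separate path for other label types.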

v-Hailishi commented 1 year ago

Verified on the latest main build 16.14.2.2306001: the generated Console App and Web API projects still fail to run in the Data classification, Value prediction, Recommendation, Forecasting, and Text classification scenarios. The details follow:

1.1 File1 (https://testpass.blob.core.windows.net/test-pass-data/wikipedia-detox-250-line-data.tsv) Console App: image. Web API: image.

1.2 File2 (https://testpass.blob.core.windows.net/test-pass-data/issues.tsv.txt) Console App: Run successful (image). Web API: image.

1.3 Microsoft SQL Server: Binary (SQL Password) Console App: image. Web API: image.

1.4 Microsoft SQL Server Database File (https://github.com/dotnet/machinelearning-samples/raw/main/samples/modelbuilder/MulticlassClassification_RestaurantViolations/RestaurantScores.zip) Console App: Run successful (image). Web API: image.

Console App: Run successful (image). Web API: File: image. SQL: image.

Console App: Run successful (image). Web API: File: image. SQL: image.

File: image. SQL: image.

Console App: image. Web API: File: image. SQL: image.

v-Hailishi commented 1 year ago

Verified on the latest main build 16.14.2.2306101: the generated Console App and Web API projects still fail to run in the Data classification, Value prediction, Recommendation, Forecasting, Image Classification-Local, and Text classification scenarios. The details follow:

1.1 File1 (https://testpass.blob.core.windows.net/test-pass-data/wikipedia-detox-250-line-data.tsv) Console App: Run successful (image). Web API: image.

1.2 File2 (https://testpass.blob.core.windows.net/test-pass-data/issues.tsv.txt) Console App: image. Web API: image.

1.3 Microsoft SQL Server: Binary (SQL Password) Console App: image. Web API: image.

1.4 Microsoft SQL Server Database File (https://github.com/dotnet/machinelearning-samples/raw/main/samples/modelbuilder/MulticlassClassification_RestaurantViolations/RestaurantScores.zip) Console App: image. Web API: image.

Web API: File: image. SQL: image.

File: image. SQL: image.

Console App: image.

Console App: Run successful (image). Web API: File: image. SQL: image.

v-Hailishi commented 1 year ago

Verified on the latest main build 16.14.2.2306701: the generated Console App and Web API projects still fail to run in the Data classification, Value prediction, Recommendation, Forecasting, Image Classification-Local, Object detection, and Text classification scenarios. The details follow:

1.1 File (https://testpass.blob.core.windows.net/test-pass-data/wikipedia-detox-250-line-data.tsv) Console App: Run successful. Web API: Failed (runs successfully after removing the unnecessary using directive). image

1.2 Microsoft SQL Server: Binary (SQL Password) Console App: Failed (image). Web API: Failed (image). After removing the unnecessary using directive, it still fails to run (image).

1.3 Microsoft SQL Server Database File (https://github.com/dotnet/machinelearning-samples/raw/main/samples/modelbuilder/MulticlassClassification_RestaurantViolations/RestaurantScores.zip) Console App: Run successful. Web API: Failed (image). After removing the unnecessary using directive, it still fails to run (image).

Web API: File: runs successfully after removing the unnecessary using directive (image). SQL: image. After removing the unnecessary using directive, it still fails to run (image).

File: Console App: image. Web API: image.

SQL: Console App: image. Web API: image.

Web API: runs successfully after removing the unnecessary using directive (image).

Web API: File: Failed (runs successfully after removing the unnecessary using directive) (image). SQL: image. After removing the unnecessary using directive, it still fails to run (image).

v-Hailishi commented 1 year ago

Verified on the latest main build 16.14.2.2306903: the generated Console App and Web API projects still fail to run in the Data classification, Value prediction, Recommendation, Forecasting, and Text classification scenarios. The details follow:

1.1 File (https://testpass.blob.core.windows.net/test-pass-data/wikipedia-detox-250-line-data.tsv) Console App: Run successful. Web API: Run successful.

1.2 Microsoft SQL Server: Binary (SQL Password) Console App: Failed (image). Web API: Failed (image).

1.3 Microsoft SQL Server Database File (https://github.com/dotnet/machinelearning-samples/raw/main/samples/modelbuilder/MulticlassClassification_RestaurantViolations/RestaurantScores.zip) Console App: Run successful. Web API: Failed (runs successfully after choosing "Use local version '4.8.3'") (image).

Web API: File: Run successful. SQL: Failed (runs successfully after choosing "Use local version '4.8.3'") (image).

File: Console App: Failed (runs successfully after choosing "using System.IO;") (image). Web API: Failed (image).

SQL: Console App: Failed (runs successfully after choosing "using Microsoft.ML.Data;", "using System.IO;", and "Use local version '4.8.3'") (image). Web API: Failed (image).

v-Hailishi commented 1 year ago

Verified on the latest main build 16.14.3.2307701: the generated Console App and Web API projects still fail to run in the Data classification scenario.

zewditu commented 1 year ago

@v-Hailishi What do you mean by SQL password? Is that a dataset? Are you able to share your dataset? Thanks.

v-Hailishi commented 1 year ago

@zewditu "SQL password" just means that I use SQL Server Authentication to log on to the server. image

Besides, in the database, I use the dataset "https://testpass.blob.core.windows.net/test-pass-data/wikipedia-detox-250-line-data.tsv".

image

beccamc commented 1 year ago

@zewditu This isn't regarding SQL. You can repro this by using a boolean type. I used the yelp dataset and set the prediction column to type boolean.

image

v-Hailishi commented 1 year ago

The bug has been fixed on the latest main build 17.14.3.2310605. image