Open rzechu opened 1 year ago
@luisquintanilla do you know if we have any work planned around the SQL stuff in the near future? I know it's been something that has been brought up several times.
Something that sticks out here is that the old AutoML APIs are being used. I would recommend using the new APIs. Here is a guide on that.
Re: auto-concatenating features, the Featurizer can help you do that in the new API. The Featurizer works best when paired with the InferColumns method. Today that doesn't natively work with SQL, but here's a sample that shows how you can get it to. I've also created an issue to enable SQL for InferColumns (#6515).
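For readers following along, here is a minimal sketch of the idea in the comment above. It is not the linked sample. It assumes that, since InferColumns cannot read SQL yet, the column roles are supplied by hand through a ColumnInformation object and passed to the Featurizer. The names SqlFeaturizerSketch, BuildColumnInformation and BuildPipeline are hypothetical, and the mapping from DbType to column role is an illustrative assumption (string columns could equally be treated as categorical).

```csharp
using System.Collections.Generic;
using System.Data;
using Microsoft.ML;
using Microsoft.ML.AutoML;

public static class SqlFeaturizerSketch
{
    // Hypothetical helper: sorts runtime-selected columns into the role
    // lists that ColumnInformation exposes, standing in for InferColumns.
    public static ColumnInformation BuildColumnInformation(
        string labelColumn,
        IEnumerable<(string Name, DbType Type)> columns)
    {
        var info = new ColumnInformation { LabelColumnName = labelColumn };

        foreach (var (name, type) in columns)
        {
            if (name == labelColumn)
            {
                continue; // the label is not a feature
            }

            switch (type)
            {
                case DbType.Single:
                case DbType.Double:
                case DbType.Int16:
                case DbType.Int32:
                case DbType.Int64:
                    // Assumption: non-Single numeric columns may still need
                    // converting to Single before training; verify this.
                    info.NumericColumnNames.Add(name);
                    break;
                case DbType.String:
                    info.TextColumnNames.Add(name);
                    break;
                default:
                    // Types not handled by this sketch are left out.
                    info.IgnoredColumnNames.Add(name);
                    break;
            }
        }

        return info;
    }

    // The Featurizer builds the "Features" column from the declared roles,
    // then the sweepable multiclass trainers are appended.
    public static SweepablePipeline BuildPipeline(
        MLContext ctx, IDataView data, ColumnInformation info)
    {
        return ctx.Auto()
            .Featurizer(data, columnInformation: info)
            .Append(ctx.Auto().MultiClassification(labelColumnName: info.LabelColumnName));
    }
}
```

The IDataView passed to BuildPipeline would be the one returned by the DatabaseLoader. Experiment setup is left out of the sketch, as is converting the label to a key type, which multiclass trainers generally expect.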
OK, thank you for the response. I saw the samples and docs, but 95% of the examples use the text loader with CSV files or predefined classes. I have to select database columns and their data types dynamically at runtime. That's why I chose to build the SQL dynamically and detect the data types. It works well in 90+% of scenarios, with no need for the Featurizer, concatenation, Fit, etc. But there are some minor cases where the API returns errors. I have tried the 2.0 preview API. Different error, but still an error (regarding vector columns). I will investigate it further.
I have tried 2.0 preview API. Different error but still error (regarding vector columns)
As you run into issues, please file them here so we can investigate. Thanks.
Similar (SQL input data) but other type columns
System Information (please complete the following information):
Describe the bug
I am trying to use Microsoft.ML.AutoML.MulticlassClassificationExperiment with preloaded SQL data.
Stacktrace
Not working input
On the other hand, I have no problem with AutoML training with this dataset (single + string) as input.
Working input
To Reproduce
Steps to reproduce the behavior:
var loader = MLContext.Data.CreateDatabaseLoader(columns.ToArray());
var dbSource = new DatabaseSource(SqlClientFactory.Instance, connectionString, sqlQuery);
var iDataView = loader.Load(dbSource);
experiment.Execute(trainData: iDataView, labelColumnName: "LabelColumn", progressHandler: progressHandler);
Expected behavior
I understand there's a problem with combining string and int columns in the input data... (the same happens if the int columns come first and the string columns later). Why can't AutoML automatically concatenate all fields?