Closed dliedke closed 3 years ago
I have essentially the same issue. Visual Studio Version 16.8.3 (don't know how to determine model builder version) Scenario: Value prediction (regression) Environment: Local CPU Data: ...Input: SQL Server ...Table: I select a view that returns 4 columns, all CAST as REAL in the view definition on SQL Server. After about a second I get the same message. My log file shows:
2021-01-11 17:08:05.0007 TRACE Disposing AutoMLService Client (Microsoft.ML.ModelBuilder.AutoMLServiceFactory+AutoMLServiceProxy.Dispose)
2021-01-11 17:08:11.2606 TRACE Disposing AutoMLService Client (Microsoft.ML.ModelBuilder.AutoMLServiceFactory+AutoMLServiceProxy.Dispose)
2021-01-11 17:08:12.3988 DEBUG System.FormatException: Input string was not in a correct format.
at System.Number.StringToNumber(String str, NumberStyles options, NumberBuffer& number, NumberFormatInfo info, Boolean parseDecimal)
at System.Number.ParseInt32(String s, NumberStyles style, NumberFormatInfo info)
at System.Convert.ToInt32(String value)
at Microsoft.ML.ModelBuilder.DataSources.SQLDataSource.GetTableSize(String tableName)
at Microsoft.ML.ModelBuilder.ToolWindows.DataTabDataContext.
Note that if I select a DB table, it seems to be OK (at least to this point).
Ping @beccamc
Can confirm this is number formatting bug related to regions that uses comma as a decimal separator.
You mentioned region bugs was fixed, but this happens in latest v16.3.0.2056001.
To reproduce bug and the workaround:
2021-02-01 14:24:17.2759 TRACE Disposing AutoMLService Client (Microsoft.ML.ModelBuilder.AutoMLServiceFactory+AutoMLServiceProxy.Dispose)
2021-02-01 14:24:20.5782 DEBUG System.FormatException: Input string was not in a correct format.
at System.Number.StringToNumber(String str, NumberStyles options, NumberBuffer& number, NumberFormatInfo info, Boolean parseDecimal)
at System.Number.ParseInt32(String s, NumberStyles style, NumberFormatInfo info)
at System.Convert.ToInt32(String value)
at Microsoft.ML.ModelBuilder.DataSources.SQLDataSource.GetTableSize(String tableName)
at Microsoft.ML.ModelBuilder.ToolWindows.DataTabDataContext.<GetFileDataPropertiesFromSQLAsync>d__160.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at Microsoft.ML.ModelBuilder.ToolWindows.DataTabDataContext.<<TriggerGetFileDataPropertiesFromSql>b__161_0>d.MoveNext() (Microsoft.ML.ModelBuilder.Utils.Logger.Debug)
2021-02-01 14:24:28.4323 TRACE Disposing AutoMLService Client (Microsoft.ML.ModelBuilder.AutoMLServiceFactory+AutoMLServiceProxy.Dispose)
2021-02-01 14:24:28.7864 DEBUG Error while processing your data: Input array is longer than the number of columns in this table.
Most likely your data contains multiple instances of your delimiter. Check your dataset for text fields containing ",". To fix, you can enclose these fields with quotes.
at System.Data.DataTable.NewRecordFromArray(Object[] value)
at System.Data.DataRowCollection.Add(Object[] values)
at Microsoft.ML.ModelBuilder.ToolWindows.DataTabDataContext.<CreateDataTableAsync>d__174.MoveNext() (Microsoft.ML.ModelBuilder.Utils.Logger.Debug)
2021-02-01 14:25:24.1303 INFO | Trainer RSquared Absolute-loss Squared-loss RMS-loss Duration #Iteration | (Microsoft.ML.ModelBuilder.Utils.Logger.Info)
2021-02-01 14:25:47.4654 TRACE Disposing AutoMLService Client (Microsoft.ML.ModelBuilder.AutoMLServiceFactory+AutoMLServiceProxy.Dispose)
2021-02-01 14:25:47.4654 DEBUG Specified cast is not valid.
at System.Data.SqlClient.SqlBuffer.get_Single()
at Microsoft.ML.Data.DatabaseLoader.Cursor.<>c__DisplayClass37_0.<CreateSingleGetterDelegate>b__0(Single& value)
at Microsoft.ML.AutoML.DatasetDimensionsUtil.HasMissingNumericSingleValue(IDataView data, Column column)
at Microsoft.ML.AutoML.DatasetDimensionsApi.CalcColumnDimensions(MLContext context, IDataView data, Column[] purposes)
at Microsoft.ML.AutoML.DatasetColumnInfoUtil.GetDatasetColumnInfo(MLContext context, IDataView data, ColumnInformation columnInfo)
at Microsoft.ML.AutoML.ExperimentBase`2.ExecuteTrainValidate(IDataView trainData, ColumnInformation columnInfo, IDataView validationData, IEstimator`1 preFeaturizer, IProgress`1 progressHandler)
at Microsoft.ML.AutoML.ExperimentBase`2.Execute(IDataView trainData, ColumnInformation columnInformation, IEstimator`1 preFeaturizer, IProgress`1 progressHandler)
at Microsoft.ML.ModelBuilder.AutoMLService.Experiments.AutoMLExperiment`3.<>c__DisplayClass21_0.<ExecuteAsync>b__5() in /_/src/Microsoft.ML.ModelBuilder.AutoMLService/Experiments/AutoMLExperiment.cs:line 80
at System.Threading.Tasks.Task`1.InnerInvoke()
at System.Threading.Tasks.Task.Execute()
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at Microsoft.ML.ModelBuilder.AutoMLService.Experiments.AutoMLExperiment`3.<ExecuteAsync>d__21.MoveNext() in /_/src/Microsoft.ML.ModelBuilder.AutoMLService/Experiments/AutoMLExperiment.cs:line 107
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at Microsoft.ML.ModelBuilder.AutoMLEngine.<StartTrainingAsync>d__26.MoveNext() in /_/src/Microsoft.ML.ModelBuilder.AutoMLService/AutoMLEngineService/AutoMLEngine.cs:line 150 (Microsoft.ML.ModelBuilder.Utils.Logger.Debug)
@Sire Thank you for the detailed repro! I will take a look
Just an idea for the fix, the code in method Microsoft.ML.ModelBuilder.DataSources.SQLDataSource() is failing for the view because "data" column returns null as the view does not have a size in the database.
public long GetTableSize(string tableName)
{
string cmdText = $"EXEC sp_spaceused {tableName}";
DataSet dataSet = new DataSet();
using (SqlConnection sqlConnection = new SqlConnection(DataProtection.DecryptString(EncryptedConnectionString)))
{
sqlConnection.Open();
using (SqlCommand sqlCommand = new SqlCommand(cmdText, sqlConnection))
{
sqlCommand.ExecuteNonQuery();
new SqlDataAdapter(sqlCommand).Fill(dataSet);
}
sqlConnection.Close();
}
return Convert.ToInt32(dataSet.Tables[0].Rows[0]["data"].ToString().Replace("KB", string.Empty).Trim()) * 1000;
}
One crazy idea that works is copying the data to real temporary table, getting the size and dropping it:
SELECT *
INTO tablename
FROM [VIEW]
EXEC sp_spaceused 'tablename'
DROP TABLE tablename
Probably there are better ways to measure the query result size, but this is one that I found that works.
Thanks! Daniel
@dliedke Thanks for the detailed fix! 😄 A "fix" for this will be in our March release (I noticed the problem and started catching the exception and returning a default value... not as helpful as your fix but I'll make a note to look at improving it!) I'll update this issue when the fix ships and you can tell me if it works.
As a note, I was only able to reproduce this with an Azure SQL DB. Are you using Azure or something local?
Hi Becca! Welcome! I used a local SQL database. Very nice to know the fix will be released soon. Thank you!
Hi everyone, we've shipped a new version of Model Builder. Can you check to see if this is fixed? https://marketplace.visualstudio.com/items?itemName=MLNET.07
Hi team, I rested here and worked fine. Thanks a lot for the fix!
System Information (please complete the following information):
Describe the bug
To Reproduce Steps to reproduce the behavior: 1) Scenario Text Classification 2) Data SQL Server 3) Select a view in Table dropdown
2020-12-05 09:06:57.7597 TRACE Disposing AutoMLService Client (Microsoft.ML.ModelBuilder.AutoMLServiceFactory+AutoMLServiceProxy.Dispose) 2020-12-05 09:07:07.6815 DEBUG System.FormatException: Input string was not in a correct format. at System.Number.StringToNumber(String str, NumberStyles options, NumberBuffer& number, NumberFormatInfo info, Boolean parseDecimal) at System.Number.ParseInt32(String s, NumberStyles style, NumberFormatInfo info) at System.Convert.ToInt32(String value) at Microsoft.ML.ModelBuilder.DataSources.SQLDataSource.GetTableSize(String tableName) at Microsoft.ML.ModelBuilder.ToolWindows.DataTabDataContext.d160.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at Microsoft.ML.ModelBuilder.ToolWindows.DataTabDataContext.<b 161_0>d.MoveNext() (Microsoft.ML.ModelBuilder.Utils.Logger.Debug)
Expected behavior Load view data correctly and display in the UI.
Screenshots If applicable, add screenshots to help explain your problem.
Additional context It is was working in Model Builder Version 16.1.2027905 and previous versions without issues.