Open lucazav opened 4 years ago
I encounter the same issue when I pull in data from an Azure SQL database or a CSV file from Data Lake. Some of the factor columns come through as lists. For now I have to unlist them every time, during training as well as scoring, and do the conversions manually within the R script.
I tried converting them in the dataset within Studio, but we cannot load an existing dataset in RStudio within Azure, as that is an ongoing bug.
@lucazav - Did you find a workaround for this? It is causing issues for me during scoring as well.
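For anyone doing the same manual unlisting, here is a minimal base R sketch of that workaround. The helper name `flatten_list_columns` is hypothetical and not part of any Azure package. It assumes each affected list column holds at most one value per row, and it leaves any other column untouched. Note that `unlist()` drops classes such as Date, so a column that should be a Date would still need an explicit conversion afterwards.

```r
# Hypothetical helper (not part of the Azure ML SDK): turn list columns whose
# elements each hold zero or one value back into ordinary atomic vectors.
flatten_list_columns <- function(df) {
  for (name in names(df)) {
    col <- df[[name]]
    is_flat_list <- is.list(col) && !is.data.frame(col) && all(lengths(col) <= 1)
    if (is_flat_list) {
      # Empty or NULL elements become NA so the column keeps one value per row.
      col[lengths(col) == 0] <- NA
      df[[name]] <- unlist(col, use.names = FALSE)
    }
  }
  df
}

# Example with a made-up data frame that mimics the problem.
cars <- data.frame(id = 1:3)
cars$fuel <- list("diesel", NULL, "electric")
str(flatten_list_columns(cars))
```

Calling the helper once right after loading, in both the training and the scoring script, keeps the conversion in one place instead of repeating it per column.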
I am having the same issue right now when loading a Parquet dataset into a dataframe: one of the columns is loaded as a list when it is expected to be a Date.
I have also noticed that the bug does not occur consistently across environments. On my compute instance the dataframe loads with the correct data types, but when I submit to a compute cluster the data types are loaded incorrectly.
I'm importing the Kaggle cars dataset from Azure Blob Storage.
Looking at the inferred data types, I can see a strange list data type for the variable "Engine Fuel Type":
I also tried to use the following code:
But I'm getting the same result.
Is it a bug? If not, how can I avoid a list for that variable?