Closed victoriatolls closed 5 years ago
Can you download the dataset file again and try again just in case the file was corrupted?
Still getting the same error even with the new file downloaded
Can you try to save it as wikipedia-detox-250-line-data.tsv instead of .csv extension?
.tsv worked
Thank you!
Great! .csv is for comma-separated files, not tab-separated files. I am closing the issue.
Thanks.
Was this in Visual Studio (Model Builder)? Or the mlnet CLI?
The CLI ignores the file extension (only looks at the data itself). I'm unsure how Model Builder handles it.
Perhaps Model Builder could just iterate over the TSV/CSV option?
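To illustrate the "iterate over the TSV/CSV option" idea, here is a minimal Python sketch. It is not how Model Builder or the mlnet CLI actually work; the function name `guess_delimiter` and the candidate list are assumptions made up for this example. It tries each candidate separator and accepts the first one that splits every row into the same number of columns (more than one).

```python
import csv
import io

def guess_delimiter(text, candidates=(",", "\t", ";")):
    """Return the first candidate that yields a consistent column count > 1, else None.

    Hypothetical helper for illustration only; it ignores the file extension
    and looks only at the data.
    """
    for delim in candidates:
        rows = list(csv.reader(io.StringIO(text, newline=""), delimiter=delim))
        rows = [r for r in rows if r]  # skip blank lines
        if not rows:
            continue
        widths = {len(r) for r in rows}
        # Consistent means every row has the same width, and more than one column
        if len(widths) == 1 and next(iter(widths)) > 1:
            return delim
    return None

# Tab-separated sample whose text column contains commas
sample = "Sentiment\tSentimentText\n1\tHello, world\n0\tFine, thanks\n"
print(repr(guess_delimiter(sample)))
```

With the tab-separated sample, the comma candidate is rejected because the rows split into different numbers of columns, so the tab is chosen regardless of what extension the file was saved with.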
I am uploading a .tsv file, but it is still throwing the same error. Can you please help me here?
I'm using SQL Server and having the same issue.
Same here.
EDIT: The issue was a column that needed to be stripped of newlines, tabs, and HTML tags.
REPLACE(REPLACE(REPLACE('<my text>', CHAR(9), ''), CHAR(10), ''), CHAR(13), '')
@graposo1 Do you have any idea how can I debug this issue for my dataset?
@habib-developer and @graposo1 - this is a closed issue, and normally people aren't tracking responses to closed issues.
Can one of you open a new issue with steps to recreate the issue you are seeing? If possible, uploading a .zip file of your project, or putting it on a GitHub repo, is the fastest way to get someone to help you. Since you are using SQL Server, please include scripts to create the database table, and data that will reproduce the issue.
Do you have any idea how can I debug this issue for my dataset?
Debugging ML.NET is fairly simple. Here are the steps to take:
- Uncheck Enable Just My Code in VS, Tools -> Options -> Debugging -> General.
- Enable the Microsoft Symbol Server in VS, Tools -> Options -> Debugging -> Symbols.
- When prompted to download the source code using Source Link, allow it to be downloaded.
Now you can debug into ML.NET code, let the exception be thrown, and you can see which line and what the variables were at that time.
@eerhardt I'm using ML.NET Model Builder to build my ML model.
@eerhardt Will do in the future. I was able to fix this issue by removing newlines from my data. I was searching for similar issues with CSVs and it seems to be a common mistake.
Thx.
The issue is with the CSV file: newlines cause this problem, even in well-known datasets from Kaggle. Try removing the newlines and any bad formatting from the CSV file, then run it again. I used Notepad++ to manually check the dataset to solve this issue.
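For anyone who would rather script the cleanup than check the file by hand, here is a rough Python sketch of the same idea. The names `clean_field` and `clean_csv` are made up for this example, and it assumes the embedded newlines sit inside properly quoted fields so that the standard `csv` module can still parse the rows. Unlike the SQL `REPLACE` above, it swaps tabs and newlines for a single space instead of deleting them, so adjacent words do not run together.

```python
import csv
import io
import re

def clean_field(value):
    # Collapse tabs, carriage returns, line feeds and other whitespace runs into one space
    return re.sub(r"\s+", " ", value).strip()

def clean_csv(text, delimiter=","):
    """Return the CSV text with whitespace inside every field normalized.

    Assumes embedded newlines are inside quoted fields (hypothetical helper).
    """
    reader = csv.reader(io.StringIO(text, newline=""), delimiter=delimiter)
    out = io.StringIO(newline="")
    writer = csv.writer(out, delimiter=delimiter, lineterminator="\n")
    for row in reader:
        writer.writerow([clean_field(field) for field in row])
    return out.getvalue()

# Two records: one with a line break inside a quoted field, one with a tab
dirty = 'Id,Comment\n1,"first line\nsecond line"\n2,"tab\there"\n'
print(clean_csv(dirty))
```

After cleaning, each record sits on exactly one physical line, which is what a line-by-line column splitter needs.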
Problem encountered on https://dotnet.microsoft.com/learn/machinelearning-ai/ml-dotnet-get-started-tutorial/data Operating System: windows
I just downloaded ML.NET and was working through the tutorial but have encountered a problem. The data file provided (when downloaded and saved as .csv) returns an 'Unable to split file provided into multiple consistent columns' error in the training phase.
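One way to narrow down an error like this on your own data is to count the columns in each record and list the ones that disagree with the majority. The Python sketch below is only a diagnostic aid, not the check ML.NET itself performs; `find_inconsistent_rows` is a hypothetical name, and the delimiter has to be supplied by the caller.

```python
import csv
import io
from collections import Counter

def find_inconsistent_rows(text, delimiter=","):
    """Return (record_number, column_count) for records whose width differs from the most common one."""
    rows = list(csv.reader(io.StringIO(text, newline=""), delimiter=delimiter))
    counts = Counter(len(r) for r in rows if r)  # ignore blank lines
    if not counts:
        return []
    expected = counts.most_common(1)[0][0]
    return [
        (number, len(r))
        for number, r in enumerate(rows, start=1)
        if r and len(r) != expected
    ]

# The third record is missing a column
data = "a,b,c\n1,2,3\n4,5\n6,7,8\n"
print(find_inconsistent_rows(data))
```

Record numbers are counted as parsed records, so they can differ from physical line numbers when a quoted field spans several lines.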