dotnet / machinelearning-modelbuilder

Simple UI tool to build custom machine learning models.
Creative Commons Attribution 4.0 International

Support more file separator formats #107

Closed rustd closed 4 years ago

rustd commented 5 years ago

Customer reported on the forum.

Model Builder does not correctly handle data types in CSV files when the Windows regional settings (decimal separator, etc.) differ from en-US. We need to fix this and support more column and decimal separators so that Model Builder works with a wider range of data files, for example a semicolon (;) as the column separator and a comma (,) as the decimal separator.
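To make the failure mode concrete, here is a minimal Python sketch. It is not Model Builder or ML.NET code; the sample data and the `parse_number` helper are hypothetical, and the sketch assumes the values contain no thousands separators. It shows that a file using ; between columns and , as the decimal mark parses cleanly only when both separators are handled explicitly.

```python
import csv
import io

# Hypothetical sample: semicolon-separated columns, comma as decimal mark.
SAMPLE = "price;weight\n1,5;2,25\n3,0;4,75\n"


def parse_number(text, decimal_sep=","):
    """Convert a numeric string that uses `decimal_sep` as its decimal mark.

    Assumption: the input contains no thousands separators.
    """
    return float(text.replace(decimal_sep, "."))


def load_rows(text, column_sep=";", decimal_sep=","):
    """Read delimited text and convert every field to float."""
    reader = csv.DictReader(io.StringIO(text), delimiter=column_sep)
    return [
        {name: parse_number(value, decimal_sep) for name, value in row.items()}
        for row in reader
    ]


rows = load_rows(SAMPLE)

# A naive en-US style conversion of the same field fails outright.
try:
    float("1,5")
    naive_parse_ok = True
except ValueError:
    naive_parse_ok = False
```

With both separators passed explicitly, `rows` holds the numeric values; the naive `float("1,5")` call raises `ValueError`, which mirrors the kind of type mismatch reported here.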

greazer commented 5 years ago

@rustd, Is this an ML.Net AutoML issue?

rustd commented 5 years ago

No. The framework supports it and the tool needs to support this.

greazer commented 5 years ago

mlnet auto-train does infer the column separator, so the only real problem we have here is that our data preview is not doing the same inference. Once we get that right, the call to the CLI should just work.

rustd commented 5 years ago

Thomas mentioned that you are already working on a change in the PR for txt files.

LittleLittleCloud commented 5 years ago

That's what the CLI uses; maybe we can follow the same approach: https://docs.microsoft.com/en-us/dotnet/api/microsoft.ml.automl.columninferenceresults?view=automl-dotnet

LittleLittleCloud commented 5 years ago

See this PR

LittleLittleCloud commented 5 years ago

#138

rustd commented 5 years ago

@LittleLittleCloud can you please specify which separators we support?

LittleLittleCloud commented 5 years ago

Sure, we currently support space, comma, tab, and semicolon.
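As a rough illustration of how a data preview could infer the separator from these four candidates, here is a small Python sketch. It is a simple heuristic written for this thread, not the inference that ML.NET AutoML or the CLI actually performs; the `infer_separator` function and its `max_lines` parameter are hypothetical. It picks the candidate that splits every sampled line into the same number of columns, preferring the one that yields the most columns.

```python
import csv

# The four separators listed above: comma, semicolon, tab, space.
CANDIDATES = [",", ";", "\t", " "]


def infer_separator(text, max_lines=50):
    """Guess the column separator of delimited text.

    Heuristic (an assumption, not ML.NET's algorithm): a good separator
    splits every sampled line into the same number of fields, and that
    number is greater than one. Returns None if no candidate qualifies.
    """
    lines = [line for line in text.splitlines() if line.strip()][:max_lines]
    best_sep, best_cols = None, 1
    for sep in CANDIDATES:
        # Collect the distinct field counts this separator produces.
        counts = {len(row) for row in csv.reader(lines, delimiter=sep)}
        if len(counts) == 1:
            cols = next(iter(counts))
            if cols > best_cols:
                best_sep, best_cols = sep, cols
    return best_sep
```

For a file such as "a;b;c" followed by "1,5;2,0;3,1", the comma is rejected because it splits the header and the data row into different numbers of fields, so the semicolon is chosen. A production implementation would also need to handle quoting rules, ragged rows, and ties between candidates.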

rustd commented 5 years ago

@LittleLittleCloud can you please demo this at today's standup?