Open christopherfowers opened 2 years ago
I would like to add that using column names instead of indices does not change the outcome. Removing the header line from the CSV and then switching --has-header to false only changes the names applied to the generated model input. Those values are still expected in the input and still used by the model for predictions.
Hi @christopherfowers thanks for this issue. Since it's related to ML.NET tools, I'm moving it to the dotnet/machinelearning-modelbuilder repo.
I am having the same issue as @christopherfowers while trying to train a regression model through the CLI on Windows. Please can you advise on the solution to this problem? Thanks :)
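Until the flag behaves as expected, one possible workaround is to strip the unwanted columns from the CSV before handing it to the CLI, so the ignored data never reaches training. The following is only a minimal sketch under that assumption, not a fix for the tool itself: `drop_columns` is a hypothetical helper, the sample rows are made up for illustration, and column indices are 0-based as in this issue.

```python
import csv
import io


def drop_columns(src, dst, ignore):
    """Copy CSV rows from src to dst, omitting the 0-indexed columns in ignore."""
    ignore = set(ignore)
    reader = csv.reader(src)
    writer = csv.writer(dst, lineterminator="\n")
    for row in reader:
        # Keep only the values whose column index is not in the ignore set.
        writer.writerow([value for i, value in enumerate(row) if i not in ignore])


# Illustrative in-memory data (made up); with real files, open them with
# newline="" as the csv module documentation recommends.
sample = io.StringIO("id,name,feature,label\n1,a,0.5,yes\n2,b,0.7,no\n")
trimmed = io.StringIO()
drop_columns(sample, trimmed, ignore=[0, 1])
print(trimmed.getvalue())
```

Note that removing columns shifts the indices of the ones that remain, so the value passed to --label-col has to be recomputed against the trimmed file (or given by name if the header is kept).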
System Information (please complete the following information):
Describe the bug
When using the CLI from a terminal to train a model on data from a CSV file, specifying columns to ignore with
--ignore-cols 1, 2, 3
produces an output model and sample project that do in fact use the columns intended to be ignored as inputs for classification predictions.

To Reproduce
Steps to reproduce the behavior:
1. Create a CSV file, data.csv. Include columns you don't wish to be used in the predictions at all. (Fill it with some meaningful data to train classification models.)
2. Run the following, substituting the appropriate label column and ignore columns (both 0-indexed):
mlnet classification --dataset "data.csv" --has-header true --train-time 10 --label-col 8 --ignore-cols 1, 2, 3, 4, 5, 6, 7, 9

Expected behavior
The generated model and sample project should not use the columns listed in the --ignore-cols flag arguments.

Actual behavior
Each of the ignored columns is still used.
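One thing that may be worth ruling out when reproducing this: the command above writes the ignore list with spaces after the commas, and a shell splits on those spaces before the CLI ever sees the arguments. The sketch below uses Python's `shlex.split`, which follows POSIX shell rules (other shells may tokenize differently), to show what the argument list looks like in each spelling. It does not establish the cause of the bug; how mlnet treats the extra tokens, and whether it accepts the compact form, is not confirmed anywhere in this thread.

```python
import shlex

# The ignore list as written in the reproduction step (spaces after commas).
spaced = shlex.split('mlnet classification --dataset "data.csv" --ignore-cols 1, 2, 3')

# The same list written without spaces, for comparison.
compact = shlex.split('mlnet classification --dataset "data.csv" --ignore-cols 1,2,3')

print(spaced[-4:])   # ['--ignore-cols', '1,', '2,', '3']
print(compact[-2:])  # ['--ignore-cols', '1,2,3']
```

In the spaced form the flag is followed by three separate tokens, two of which carry a trailing comma; in the compact form it is followed by a single token. Comparing the two spellings against the generated model input would show whether tokenization plays any part here.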