dotnet / machinelearning

ML.NET is an open source and cross-platform machine learning framework for .NET.
https://dot.net/ml
MIT License

Can I get an example for AutoML with in memory data in F# #6930

Closed thomasd3 closed 9 months ago

thomasd3 commented 9 months ago

I'm trying to do the equivalent of this mlnet CLI command:

mlnet classification --dataset output.csv --label-col 0 --has-header true --name test --train-time 300

from F#

I find contradictory samples, depending on the ML.NET version, but nothing comprehensive.

I have a custom type in which one column is the label and the rest are features. Every field is a float32 except the label, which is a bool.
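For illustration, a shortened version of such a type could look like the sketch below. It is an assumption, not the actual definition: only four of the field names from the header string are shown, and the row values are made up. The list `b` stands in for the in-memory data passed to `LoadFromEnumerable` in the code that follows.

```fsharp
// Hypothetical, shortened version of the input type: the real one would have
// one field per name in the header string used later in the post.
type ModelInput =
    { outcome: bool      // label
      side: float32      // feature
      open0: float32     // feature
      close0: float32 }  // feature

// In-memory rows with made-up values; this plays the role of `b` in the post.
let b : ModelInput list =
    [ { outcome = true; side = 1.0f; open0 = 100.0f; close0 = 101.5f }
      { outcome = false; side = -1.0f; open0 = 101.5f; close0 = 100.2f } ]
```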

I have the following code:

    // Initialize MLContext
    let ctx = MLContext()

    // Load data into IDataView
    let data = ctx.Data.LoadFromEnumerable<ModelInput> (b)

    // Split data into training and validation sets
    let trainValidationData = ctx.Data.TrainTestSplit(data,0.2)

    let header =
        "outcome,side,open0,high0,low0,close0,volume0,open1,high1,low1,close1,volume1,open2,high2,low2,close2,volume2,open3,high3,low3,close3,volume3,open4,high4,low4,close4,volume4,open5,high5,low5,close5,volume5,open6,high6,low6,close6,volume6,open7,high7,low7,close7,volume7,open8,high8,low8,close8,volume8,open9,high9,low9,close9,volume9,open10,high10,low10,close10,volume10,open11,high11,low11,close11,volume11,open12,high12,low12,close12,volume12,open13,high13,low13,close13,volume13,open14,high14,low14,close14,volume14,open15,high15,low15,close15,volume15,open16,high16,low16,close16,volume16,open17,high17,low17,close17,volume17,open18,high18,low18,close18,volume18,open19,high19,low19,close19,volume19,fairPrice,lgH0,lgL0,lgH1,lgL1,obH0,obL0,obH1,obL1,hh0,lh0,hl0,ll0,hh1,lh1,hl1,ll1,bosi0,bose0,coc0,mss0,bosi1,bose1,coc1,mss1"
            .Split ','

    let preprocessingPipeline =
        EstimatorChain()
            .Append(ctx.Transforms.Concatenate("Features", header))

    let autoMLEstimator: SweepablePipeline =
        ctx.Auto().Regression("outcome")

    let toIEstimator (est: 'a) =
        est :> obj :?> IEstimator<ITransformer>

    let pipeline =
         (preprocessingPipeline |> toIEstimator)
             .Append(autoMLEstimator)

    let experiment = ctx.Auto().CreateExperiment()

    experiment
        .SetPipeline(pipeline)
        .SetRegressionMetric(RegressionMetric.RSquared) //, columnInference.ColumnInformation.LabelColumnName)
        .SetTrainingTimeInSeconds(60u)
        .SetDataset(trainValidationData)
        |> ignore

    ctx.Log.Add (fun e -> if (e.Source.Equals("AutoMLExperiment")) then info e.RawMessage)

    let experimentResults = experiment.RunAsync() |> Async.AwaitTask |> Async.RunSynchronously

But it tells me it can't find the 'outcome' column.
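One way to narrow this down is to print the schema of the loaded `IDataView` and check which column names it actually exposes. The following is a minimal sketch written as an F# script, not a fix: `SampleRow` is a hypothetical three-column stand-in for the real type, the row values are made up, and the `[<CLIMutable>]` attribute is included only as a precaution (whether it is needed for loading is an assumption).

```fsharp
#r "nuget: Microsoft.ML"

open Microsoft.ML

// Hypothetical three-column stand-in for the real input type.
[<CLIMutable>]
type SampleRow =
    { outcome: bool
      side: float32
      open0: float32 }

let ctx = MLContext()

// Made-up in-memory rows.
let rows =
    [ { outcome = true; side = 1.0f; open0 = 10.5f }
      { outcome = false; side = -1.0f; open0 = 11.0f } ]

let data = ctx.Data.LoadFromEnumerable<SampleRow>(rows)

// List every column the IDataView exposes, together with its ML.NET type.
for column in data.Schema do
    printfn "%s : %O" column.Name column.Type

// GetColumnOrNull returns a Nullable; HasValue is false when the name is absent.
let labelColumn = data.Schema.GetColumnOrNull "outcome"
printfn "has 'outcome' column: %b" labelColumn.HasValue
```

Running the same loop on `trainValidationData.TrainSet` would show whether the column names survive the split unchanged.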

The lack of F# examples is really an issue for me. I spent an entire day trying to load data from memory.