dotnet / docs

This repository contains .NET Documentation.
https://learn.microsoft.com/dotnet
Creative Commons Attribution 4.0 International

Inspecting after a transform #12879

Closed famschopman closed 5 years ago

famschopman commented 5 years ago

How do I inspect a dataset after a transform that changes the schema?

var categoricalEstimator = mlContext.Transforms.Categorical.OneHotEncoding("StringProperty");
ITransformer categoricalTransformer = categoricalEstimator.Fit(dataView);
IDataView transformedData = categoricalTransformer.Transform(dataView);

The following will fail because the schema has been updated with new vector types for the transformed dataset.

IEnumerable<MyModel> employeeDataEnumerable = mlContext.Data.CreateEnumerable<MyModel>(transformedData, reuseRowObject: true);
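
Before mapping the transformed data onto a class, it can help to print the schema and see what each column turned into. The following is a minimal sketch, not an official recipe: `SchemaInspector` and `PrintSchema` are hypothetical names introduced here for illustration, while `IDataView.Schema` and the `DataViewSchema.Column` properties are part of ML.NET.

```csharp
using System;
using Microsoft.ML;

// Hypothetical helper, introduced for illustration only.
public static class SchemaInspector
{
    // Prints the name and type of every column in the data view.
    // A transform that reuses its input column name hides the original
    // column, so the hidden flag is printed as well.
    public static void PrintSchema(IDataView data)
    {
        foreach (DataViewSchema.Column column in data.Schema)
        {
            Console.WriteLine($"{column.Name}: {column.Type} (hidden: {column.IsHidden})");
        }
    }
}
```

Calling `SchemaInspector.PrintSchema(transformedData);` after the transform should list `StringProperty` with a vector type, which is the type the target class has to match.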

AB#1530397

luisquintanilla commented 5 years ago

Hi @famschopman,

To inspect it, you can create a new class that matches the transformed schema. Using your example:

// Original
public class MyModel
{
    public string StringProperty {get;set;}
}

// Transformed: the one-hot encoded column is a vector of floats
public class MyTransformedModel
{
    public float[] StringProperty {get;set;}
}

var categoricalEstimator = mlContext.Transforms.Categorical.OneHotEncoding("StringProperty");
ITransformer categoricalTransformer = categoricalEstimator.Fit(dataView);
IDataView transformedData = categoricalTransformer.Transform(dataView);

IEnumerable<MyTransformedModel> employeeDataEnumerable = mlContext.Data.CreateEnumerable<MyTransformedModel>(transformedData, reuseRowObject: true);

Notice how the type of StringProperty has changed from string to float[]: the column is now the output of the OneHotEncoding transform, which by default produces a vector of floats with one slot per distinct category value.
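
For anyone who wants to try this end to end, here is a self-contained sketch of the same approach. The sample rows, the `OneHotDemo` class, and the `EncodeSample` method are assumptions made up for illustration. It also passes `reuseRowObject: false` because the rows are collected into a list, and reusing one row object would make every list entry point at the same instance.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.ML;

public class MyModel
{
    public string StringProperty { get; set; }
}

public class MyTransformedModel
{
    // One-hot output: a float vector with one slot per distinct category.
    public float[] StringProperty { get; set; }
}

// Hypothetical demo class, introduced for illustration only.
public static class OneHotDemo
{
    public static List<MyTransformedModel> EncodeSample()
    {
        var mlContext = new MLContext();

        // Made-up sample data with two distinct category values.
        var rows = new List<MyModel>
        {
            new MyModel { StringProperty = "A" },
            new MyModel { StringProperty = "B" },
            new MyModel { StringProperty = "A" },
        };
        IDataView dataView = mlContext.Data.LoadFromEnumerable(rows);

        var categoricalEstimator = mlContext.Transforms.Categorical.OneHotEncoding("StringProperty");
        ITransformer categoricalTransformer = categoricalEstimator.Fit(dataView);
        IDataView transformedData = categoricalTransformer.Transform(dataView);

        // reuseRowObject: false so each list entry is its own object.
        return mlContext.Data
            .CreateEnumerable<MyTransformedModel>(transformedData, reuseRowObject: false)
            .ToList();
    }

    public static void Main()
    {
        foreach (MyTransformedModel row in EncodeSample())
        {
            // Print the encoded vector for each input row.
            Console.WriteLine(string.Join(", ", row.StringProperty));
        }
    }
}
```

If only a quick look is needed while debugging, `transformedData.Preview()` is another option that avoids defining a class at all.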

JRAlexander commented 5 years ago

Hi, @famschopman! This appears resolved so we're going to close. Please reopen or comment here if that isn't the case. Thanks!