Open baruchiro opened 5 years ago
I am having the same problem.
1. With the first solution I get an error like this, as if the transformer is not working.
2. With the pre-featurizer I still get the same error, again as if the internal transformer is not working. When I inspect the object, the append does not seem to have worked, as it contains only the drop-column transform.
This is very frustrating. Here is my pre-featurizer; when I test it, I am only ever able to add the DropColumn:
private IEstimator
    IEstimator<ITransformer> preFeatureizer = _mlContext.Transforms.DropColumns(_modelInput.KeyFeatureToIgnore);
    foreach (string feature in _includedFeatureNames)
    {
        propertyInfo = _allFeaturesPropertyInfo.Find(x => x.Name == feature);
        if (typeof(Double) == propertyInfo.PropertyType)
        {
            preFeatureizer.Append(_mlContext.Transforms.Conversion.ConvertType(feature, feature, DataKind.Single));
        }
        preFeatureizer.Append(_mlContext.Transforms.NormalizeMeanVariance(feature, useCdf: false));
    }
    preFeatureizer.AppendCacheCheckpoint(_mlContext);
    return (preFeatureizer);
}
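One likely reason only the drop-column transform shows up: in ML.NET, `Append` and `AppendCacheCheckpoint` return a new estimator chain instead of modifying the estimator they are called on, so a return value that is not assigned is lost. The following is a minimal sketch of the same loop with the results assigned back. The class name, the `Build` helper, and the column names are hypothetical, and the sketch assumes every listed feature is a `Double` column; it needs the Microsoft.ML package.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.ML;
using Microsoft.ML.Data;

public static class PreFeaturizerSketch
{
    // Hypothetical helper: builds a pre-featurizer for the given column names.
    public static IEstimator<ITransformer> Build(
        MLContext mlContext, string columnToDrop, IEnumerable<string> doubleFeatures)
    {
        IEstimator<ITransformer> chain = mlContext.Transforms.DropColumns(columnToDrop);

        foreach (string feature in doubleFeatures)
        {
            // Append returns a NEW chain, so the result is assigned back each time.
            chain = chain.Append(
                mlContext.Transforms.Conversion.ConvertType(feature, feature, DataKind.Single));
            chain = chain.Append(
                mlContext.Transforms.NormalizeMeanVariance(feature, useCdf: false));
        }

        // AppendCacheCheckpoint also returns a new chain.
        return chain.AppendCacheCheckpoint(mlContext);
    }

    public static void Main()
    {
        var mlContext = new MLContext(seed: 0);
        // "Id", "Price" and "Weight" are placeholder column names.
        IEstimator<ITransformer> preFeaturizer =
            Build(mlContext, "Id", new[] { "Price", "Weight" });
        Console.WriteLine(preFeaturizer.GetType().Name);
    }
}
```

This only addresses the dropped `Append` results. It does not change the behavior reported in the issue, where `Execute` rejects the column types before the pre-featurizer is applied.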
@baruchiro: Quite right. We should check the datatype after the pre-featurizer is applied.
Another possible route is automatic conversion from `long` to `Single` within AutoML. This route would take some thought, as it can be a little dangerous to do automatically: the mapping is only 1-to-1 when within ± 2^24+1. For instance, this would negatively affect someone forecasting the sales of products using a UPC/EAN number, as the conversion would be lossy.
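The precision limit mentioned above can be checked with plain C#, no ML.NET required. This is a small sketch; the class and method names are made up for illustration, and the 11-digit value is just an example of a UPC-sized number, not data from this issue.

```csharp
using System;

public static class SinglePrecisionDemo
{
    // True when converting the value to Single and back gives the same integer.
    public static bool RoundTripsThroughSingle(long value)
    {
        float asSingle = (float)value;
        return (long)asSingle == value;
    }

    public static void Main()
    {
        long limit = 1L << 24;            // 2^24
        long pastLimit = limit + 1;       // 2^24 + 1
        long upcLike = 36000291452L;      // example UPC-sized identifier

        Console.WriteLine($"{limit} round-trips: {RoundTripsThroughSingle(limit)}");
        Console.WriteLine($"{pastLimit} round-trips: {RoundTripsThroughSingle(pastLimit)}");
        Console.WriteLine($"{upcLike} round-trips: {RoundTripsThroughSingle(upcLike)}");
    }
}
```

Single has a 24-bit significand, so integers above 2^24 are only representable when they fall on a coarser grid. That is why identifier-like columns would be silently altered by an automatic conversion.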
So you are going to do a fix?
System information
Microsoft.ML.AutoML (0.14.0)
Issue
TL;DR: `ExperimentBase.Execute` with a non-null `preFeaturizer` argument should check the schema types after applying the `preFeaturizer`.

I have a flattened object with `long` and `int` fields that I load into an `IDataView`. If I want to `Execute` an experiment for this `dataView`, I get this exception:

So I have to create an `EstimatorChain` to `ConvertType` from `long` to `Single`, and `Fit` then `Transform` the `dataView`.

Let's say I have this `EstimatorChain` to transform these types. Now I have two options:

1. Transform the `dataView` before passing it to the `Execute` method.
   With this option, the problem is that I have to create a class that fits the new schema if I want to save the model and use it later. (The first generic type in `CreatePredictionEngine` must match the `inputSchema`.)
2. Pass the `EstimatorChain` as the `preFeaturizer` argument in the `Execute` method.
   But this is not a real solution, because the `Execute` method still throws the exception above!
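For illustration, the conversion chain from the first option might look roughly like the sketch below. `FlatRow`, its properties, and the sample values are hypothetical stand-ins for the flattened object. The sketch stops before the AutoML call and only prints the resulting column types. It needs the Microsoft.ML package.

```csharp
using System;
using System.Linq;
using Microsoft.ML;
using Microsoft.ML.Data;

// Hypothetical flattened input type with long and int fields.
public class FlatRow
{
    public long Quantity { get; set; }
    public int Category { get; set; }
    public float Label { get; set; }
}

public static class ConvertBeforeExecute
{
    public static void Main()
    {
        var mlContext = new MLContext(seed: 0);

        IDataView dataView = mlContext.Data.LoadFromEnumerable(new[]
        {
            new FlatRow { Quantity = 10, Category = 1, Label = 1.5f },
            new FlatRow { Quantity = 20, Category = 2, Label = 2.5f },
        });

        // Convert each integer column to Single, keeping the column name.
        var chain = mlContext.Transforms.Conversion
            .ConvertType(nameof(FlatRow.Quantity), outputKind: DataKind.Single)
            .Append(mlContext.Transforms.Conversion
                .ConvertType(nameof(FlatRow.Category), outputKind: DataKind.Single));

        // Fit, then Transform: this is the view that would be passed to Execute.
        IDataView converted = chain.Fit(dataView).Transform(dataView);

        // The original integer columns remain in the schema as hidden columns.
        foreach (var column in converted.Schema.Where(c => !c.IsHidden))
        {
            Console.WriteLine($"{column.Name}: {column.Type}");
        }
    }
}
```

As the issue notes, a model trained on the converted view expects the converted schema, so the input class used at prediction time has to declare those columns as `float`.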