dotnet / machinelearning

ML.NET is an open source and cross-platform machine learning framework for .NET.
https://dot.net/ml
MIT License

Can no longer subclass DataFrameColumn due to internal abstract GetSortIndices method #7323

Open charliebone opened 1 day ago

charliebone commented 1 day ago

Hello,

I'm glad v0.22 is published with some fixes I have been waiting for. However, there is now an issue.

I inherit from the DataFrameColumn class to build my own DataFrameColumn implementation. The reasons aren't super important, but it lets me write extension methods on the DataFrameColumn class that do computations requiring certain data types. The class is a public abstract class, which indicated to me, as a user of the ML.NET library, that it was fair game to subclass it and to keep writing extension methods based on the DataFrameColumn abstract base class.

Now there is an internal abstract method (GetSortIndices) in the base class, which makes it impossible to subclass DataFrameColumn from outside the library, even though every other abstract method and property is either public or protected: https://github.com/dotnet/machinelearning/blob/d4bc05db154b1e3c8513e52b9c476f83b49d4048/src/Microsoft.Data.Analysis/DataFrameColumn.cs#L463
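
For anyone who wants to check which members get in the way, here is a minimal reflection sketch. It lists abstract methods (including property accessors) that are internal or private protected, since those cannot be overridden from another assembly. `SubclassingCheck`, `FindBlockingAbstractMembers`, and `SampleColumn` are hypothetical names for illustration, and the `GetSortIndices` signature on the stand-in class is invented, not the library's.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

public static class SubclassingCheck
{
    // Returns the names of abstract methods (property accessors included) that
    // code outside the declaring assembly cannot override.
    // IsAssembly = internal, IsFamilyAndAssembly = private protected.
    public static IReadOnlyList<string> FindBlockingAbstractMembers(Type type)
    {
        const BindingFlags flags =
            BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic;

        return type.GetMethods(flags)
            .Where(m => m.IsAbstract && (m.IsAssembly || m.IsFamilyAndAssembly))
            .Select(m => m.Name)
            .ToList();
    }
}

// Hypothetical stand-in for a library base class; the member signatures are
// invented for this sketch and are not taken from Microsoft.Data.Analysis.
public abstract class SampleColumn
{
    public abstract long Length { get; }                    // overridable anywhere
    protected abstract object GetValue(long rowIndex);      // overridable in subclasses
    internal abstract int[] GetSortIndices(bool ascending); // blocks outside subclasses
}

public static class Program
{
    public static void Main()
    {
        foreach (string name in SubclassingCheck.FindBlockingAbstractMembers(typeof(SampleColumn)))
        {
            Console.WriteLine(name);
        }
    }
}
```

To run the same check against the real library, reference the Microsoft.Data.Analysis package and pass `typeof(DataFrameColumn)` instead of the stand-in type.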

Is there anything that can be done to remedy this?

asmirnov82 commented 1 day ago

Hi @charliebone. I implemented a fix for this. As a workaround, you may try to inherit from the PrimitiveDataFrameColumn<T> class, if that's possible in your scenario (though T should be a value type).
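
A rough sketch of what that workaround could look like, assuming the Microsoft.Data.Analysis package is referenced and that the installed version exposes the `PrimitiveDataFrameColumn<T>(string name, IEnumerable<T> values)` constructor. `TemperatureColumn`, its `Unit` property, and the `Describe` extension method are hypothetical names for illustration only.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.Data.Analysis;

// Hypothetical custom column built on the concrete generic column type
// instead of deriving from DataFrameColumn directly.
public class TemperatureColumn : PrimitiveDataFrameColumn<double>
{
    public TemperatureColumn(string name, IEnumerable<double> values, string unit)
        : base(name, values)
    {
        Unit = unit;
    }

    // Extra metadata carried by the custom column.
    public string Unit { get; }
}

public static class DataFrameColumnExtensions
{
    // Extension method still written against the DataFrameColumn base class;
    // it special-cases the custom column type when it sees one.
    public static string Describe(this DataFrameColumn column)
    {
        return column is TemperatureColumn t
            ? $"{t.Name} [{t.Unit}], {t.Length} rows"
            : $"{column.Name}, {column.Length} rows";
    }
}

public static class Program
{
    public static void Main()
    {
        DataFrameColumn column = new TemperatureColumn("temp", new[] { 20.5, 22.0 }, "C");
        Console.WriteLine(column.Describe());
    }
}
```

This only fits when the column's element type is a value type, as noted above; columns holding reference types would still need the fix in the base class.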

charliebone commented 1 day ago

Hi @asmirnov82, thank you for the prompt reply and fix! Any idea when it'll make it out into the wild?

I have considered inheriting from PrimitiveDataFrameColumn<T> instead, and perhaps I'll go that route in the interim. Thanks again for the fast fix!