alteryx / featuretools

An open source python library for automated feature engineering
https://www.featuretools.com
BSD 3-Clause "New" or "Revised" License

Expand guide on using ColumnSchemas in creating custom primitives #1637

Open tamargrey opened 3 years ago

tamargrey commented 3 years ago

While converting Primitives to use Woodwork for their input and return types, there seem to be some common optimizations that can and should be used to define the best input and return types for a primitive.

There isn't much discussion of how users can ensure that they're following these principles. The Woodwork in Featuretools guide explains how ColumnSchemas get used as input and return types, but the Feature Primitives doc might be the best place to explain how to best use ColumnSchema objects.

The tips I can think of right now are:

gsheni commented 3 years ago

@tamargrey I agree that an in-depth guide to writing primitives with ColumnSchema will be useful.

We can prioritize this after we release Featuretools v1.0.0.

tamargrey commented 3 years ago

It'd also be useful if this section explained how to define ColumnSchema objects for the return_types parameter in dfs. We want to avoid users specifying column schemas that are too restrictive, not restrictive enough, or redundant, so we'll need to be really clear about how the column schemas in return_types get used.