Why does featuretools use first column as an index but not the pandas index field?

alteryx / featuretools

An open source python library for automated feature engineering

BSD 3-Clause "New" or "Revised" License

@alexandrnikitin to integrate your data with Featuretools, simply call .reset_index() on your dataframe. This turns the index into a regular column of the dataframe. See the pandas documentation for reset_index for an example of how this works.
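As a minimal sketch of that step, using only pandas: the column names and values below are made up for illustration and are not taken from the original question.

```python
import pandas as pd

# Hypothetical data: customer_id lives in the pandas index, not in a column.
df = pd.DataFrame(
    {"amount": [10.0, 25.5, 7.25]},
    index=pd.Index([101, 102, 103], name="customer_id"),
)

# reset_index() moves the named index into a regular column (placed first)
# and replaces it with a default 0..n-1 RangeIndex.
flat = df.reset_index()

print(list(flat.columns))  # ['customer_id', 'amount']
```

After this, customer_id is an ordinary column that can be named as the index when the dataframe is handed to Featuretools.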

We treat the index the same way as any other column because it can be used like any other column in feature engineering. For example, we may want to apply the Count primitive to it.
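To illustrate why that is convenient, here is a rough pandas analogue of counting over an index column. This is only a sketch with a hypothetical orders table; it is not how Featuretools implements the Count primitive.

```python
import pandas as pd

# Hypothetical child table; order_id is its key, kept as a plain column.
orders = pd.DataFrame(
    {
        "order_id": [1, 2, 3, 4],
        "customer_id": [101, 101, 102, 101],
    }
)

# Because order_id is an ordinary column, it can be aggregated like any other:
# here, the number of orders per customer.
orders_per_customer = orders.groupby("customer_id")["order_id"].count()

print(orders_per_customer)
```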

Another, more technical reason is that pandas indices can be a little idiosyncratic (e.g., they support things like multiple levels) compared to the concept of a primary key in other tabular data systems such as databases. To make our implementation map more generally onto those systems, we made the design decision to keep the index as a normal column.
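The multi-level case can be sketched as follows. The data and the sale_id column are hypothetical, and building a surrogate key this way is just one option, not something Featuretools requires.

```python
import pandas as pd

# Hypothetical data keyed by a two-level index (store, day).
idx = pd.MultiIndex.from_tuples(
    [("A", 1), ("A", 2), ("B", 1)], names=["store", "day"]
)
sales = pd.DataFrame({"units": [5, 3, 8]}, index=idx)

# A two-level index does not correspond to a single primary-key column.
print(sales.index.nlevels)  # 2

# reset_index() turns every level into an ordinary column.
flat = sales.reset_index()

# One explicit single-column key can then be added if needed.
flat["sale_id"] = range(len(flat))

print(list(flat.columns))  # ['store', 'day', 'units', 'sale_id']
```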

I'm going to close this for now. If you have further questions on how to get this to work, feel free to post on Stack Overflow with the featuretools tag.

Why does featuretools use first column as an index but not the pandas index field? #130