Closed alexandrnikitin closed 6 years ago
@alexandrnikitin to integrate your data with Featuretools, simply call `.reset_index()` on your dataframe. This will turn the index into a regular column in your dataframe. See the pandas documentation for `reset_index` for an example of how this works.
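As a minimal sketch of that suggestion, using pandas only: the column names and the `row_id` name below are made up for illustration. Naming the implicit index first gives the new column a meaningful name, which could then be passed as the `index` argument when loading the dataframe into Featuretools.

```python
import pandas as pd

# Hypothetical data; the dataframe only has the implicit RangeIndex (0, 1, 2).
df = pd.DataFrame({"amount": [10.0, 25.5, 7.25], "customer": ["a", "b", "a"]})

# Give the implicit index a name, then promote it to a regular column.
# "row_id" is an illustrative name, not something Featuretools requires.
df_with_id = df.rename_axis("row_id").reset_index()

print(df_with_id.columns.tolist())  # ['row_id', 'amount', 'customer']
```

Without `rename_axis`, `reset_index()` names the new column `index`, which also works but is easier to confuse with the argument of the same name.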
We treat the index the same way as other columns because it can be used just like any other column when it comes to feature engineering. For example, we may want to apply the `Count` primitive to it.
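To illustrate why the id is useful as an ordinary column, here is a plain-pandas analogue of counting ids per group. This is not Featuretools code and not how the `Count` primitive is implemented; the `orders` data and column names are hypothetical.

```python
import pandas as pd

# Hypothetical child table; the row id starts out as the implicit pandas index.
orders = pd.DataFrame({"customer": ["a", "b", "a"], "amount": [10.0, 25.5, 7.25]})
orders = orders.rename_axis("order_id").reset_index()

# With the id available as a column, counting orders per customer is a
# regular aggregation over that column.
orders_per_customer = orders.groupby("customer")["order_id"].count()
print(orders_per_customer)
```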
Another, more technical reason is that pandas indices can be a little idiosyncratic (e.g., they support things like multiple levels) compared to the concept of a primary key in other tabular data systems such as databases. To make our implementation map more generally, we made the design decision to keep the index as a normal column.
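A small pandas-only sketch of the multi-level case mentioned above, with made-up data. A two-level index has no single-column equivalent, so flattening it with `reset_index()` and building one key column is one possible way (an assumption for illustration, not a Featuretools requirement) to get something that behaves like a primary key.

```python
import pandas as pd

# Hypothetical dataframe with a two-level index (store, order).
idx = pd.MultiIndex.from_tuples(
    [("s1", 1), ("s1", 2), ("s2", 1)], names=["store", "order"]
)
df = pd.DataFrame({"amount": [10.0, 25.5, 7.25]}, index=idx)

# reset_index() turns every index level into its own regular column.
flat = df.reset_index()

# Combine the levels into a single unique key column.
flat["order_key"] = flat["store"] + "-" + flat["order"].astype(str)

print(flat.columns.tolist())  # ['store', 'order', 'amount', 'order_key']
```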
I'm going to close this for now. If you have further questions on how to get this to work, feel free to post on Stack Overflow with the `featuretools` tag.
Hi,
Pandas creates an implicit index if one isn't specified as a column. What I want to achieve is to use the pandas index in `featuretools`, but it can't be passed by name in the `index` argument. `featuretools` uses the first column by default, and that part is not clear to me. Why does `featuretools` use the first column as the index rather than the pandas index? How can I make `featuretools` use the pandas index instead?

The code: https://github.com/Featuretools/featuretools/blob/906777bbafc18892a927dfdc5ac3f3b8d40de1b5/featuretools/entityset/entityset.py#L441-L459