alteryx / featuretools

An open source python library for automated feature engineering
https://www.featuretools.com
BSD 3-Clause "New" or "Revised" License

Null values for cutoff time #132

dylancsumner closed this issue 6 years ago

dylancsumner commented 6 years ago

Sometimes we may have null values for the cutoff time. Consider the case where we are trying to generate features to predict whether or not we will sell a product to a customer. We want to exclude all data after the sale. For a customer we sold to, there is one point in time at which they became a 'sold to' customer. For customers we did not sell to, there is no single point at which they became 'not a sale', so we want to include all of their data. We could represent this with a null value in the cutoff time df.

Forgive me if I am missing something, but I am not aware of a way to do this currently in featuretools. Will null values in the cutoff time throw an error?
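
To make the scenario concrete, here is a minimal pandas-only sketch of one possible workaround: replace each null cutoff with a timestamp later than any event, so that nothing is excluded for those customers. The column names (`customer_id`, `time`, `event_time`) and the sample data are hypothetical. The sketch does not call featuretools at all, and it makes no claim about how featuretools itself handles null cutoff times.

```python
import pandas as pd

# Hypothetical cutoff-time table: one row per customer.
# NaT marks customers we never sold to, so they have no natural cutoff.
cutoff_times = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "time": pd.to_datetime(["2018-01-15", None, "2018-03-02"]),
})

# Hypothetical event data, used here only to find the latest timestamp.
events = pd.DataFrame({
    "customer_id": [1, 1, 2, 2, 3],
    "event_time": pd.to_datetime(
        ["2018-01-10", "2018-02-01", "2018-01-20", "2018-04-05", "2018-02-28"]
    ),
})

# Choose a sentinel later than every event so no data is cut off.
sentinel = events["event_time"].max() + pd.Timedelta(days=1)

# Fill the null cutoffs with the sentinel; real cutoffs are left unchanged.
filled_cutoffs = cutoff_times.copy()
filled_cutoffs["time"] = filled_cutoffs["time"].fillna(sentinel)

print(filled_cutoffs)
```

With this approach the 'not a sale' customers keep all of their history, while 'sold to' customers are still cut off at the sale date. The trade-off is that the sentinel is an arbitrary value, so it should not be read as a real event time downstream.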

Seth-Rothschild commented 6 years ago

Hi @dylancsumner,

That's a great question! We're reorganizing our Q&A to move "How do I..." type questions over to Stack Overflow with the featuretools tag. Would you mind asking this there?