alteryx / featuretools

An open source python library for automated feature engineering
https://www.featuretools.com
BSD 3-Clause "New" or "Revised" License

ValueError: mismatch in size of old and new data-descriptor #232

Closed hoyeunglee closed 6 years ago

hoyeunglee commented 6 years ago

How can I solve this error: ValueError: mismatch in size of old and new data-descriptor?

df_date, df_open, df_high, df_low, df_close, df_volume = (
    df[["Date"]],
    df[["Open", "Date"]],
    df[["High", "Date"]],
    df[["Low", "Date"]],
    df[["Close", "Date"]],
    df[["Volume", "Date"]],
)

esDate = ft.EntitySet("DateSet")
esOpen = ft.EntitySet("OpenSet")
esHigh = ft.EntitySet("HighSet")
esLow = ft.EntitySet("LowSet")
esClose = ft.EntitySet("CloseSet")
esVolume = ft.EntitySet("VolumeSet")

df_date.index.name = "DateID"
df_date = df_date.reset_index()
df_open.index.name = "DateID"
df_open = df_open.reset_index()
df_high.index.name = "DateID"
df_high = df_high.reset_index()
df_low.index.name = "DateID"
df_low = df_low.reset_index()
df_close.index.name = "DateID"
df_close = df_close.reset_index()
df_volume.index.name = "DateID"
df_volume = df_volume.reset_index()

es = ft.EntitySet("stock")

esDate.entity_from_dataframe(entity_id="DateEntity", dataframe=df_date, index="DateID", time_index="Date")
esOpen.entity_from_dataframe(entity_id="OpenEntity", dataframe=df_open, index="DateID", time_index="Date")
esHigh.entity_from_dataframe(entity_id="HighEntity", dataframe=df_high, index="DateID", time_index="Date")
esLow.entity_from_dataframe(entity_id="LowEntity", dataframe=df_low, index="DateID", time_index="Date")
esClose.entity_from_dataframe(entity_id="CloseEntity", dataframe=df_close, index="DateID", time_index="Date")
esVolume.entity_from_dataframe(entity_id="VolumeEntity", dataframe=df_volume, index="DateID", time_index="Date")

entities = {
    "datee": (df_date, "DateID", "Date"),
    "opene": (df_open, "DateID", "Open"),
    "highe": (df_high, "DateID", "High"),
    "lowe": (df_low, "DateID", "Low"),
    "closee": (df_close, "DateID", "Close"),
    "volumee": (df_volume, "DateID", "Volume"),
}

relationships = [
    ("datee", "DateID", "opene", "DateID"),
    ("datee", "DateID", "highe", "DateID"),
    ("datee", "DateID", "lowe", "DateID"),
    ("datee", "DateID", "closee", "DateID"),
    ("datee", "DateID", "volumee", "DateID"),
]

feature_matrix, features = ft.dfs(entities=entities,relationships=relationships,target_entity="closee")

kmax12 commented 6 years ago

Thanks for trying Featuretools. In order for us to help, we need more information to reproduce the error. Can you do these three things?

  1. Post the complete stack trace, as well as the line where the error occurs.
  2. Remove any code that is not necessary to reproduce the error.
  3. If possible, share the data. You can also email it to us at help@featuretools.com.
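
The three requests above could be prepared roughly as follows. This is only a sketch using the Python standard library: the helper names (make_synthetic_rows, rows_to_csv, capture_traceback, failing_step) are hypothetical, the column names are taken from the snippet in the question, and failing_step is a stand-in that raises the reported message rather than a real Featuretools call. The values it generates are random placeholders, not real data.

```python
import csv
import io
import random
import traceback
from datetime import date, timedelta

# Column names copied from the snippet in the question.
COLUMNS = ["Date", "Open", "High", "Low", "Close", "Volume"]


def make_synthetic_rows(n_rows=5, seed=0):
    """Build a tiny fake price table with the same columns as the original df."""
    rng = random.Random(seed)
    start = date(2018, 1, 1)  # arbitrary placeholder start date
    rows = []
    for i in range(n_rows):
        open_ = round(rng.uniform(90, 110), 2)
        close = round(rng.uniform(90, 110), 2)
        rows.append({
            "Date": (start + timedelta(days=i)).isoformat(),
            "Open": open_,
            "High": round(max(open_, close) + rng.uniform(0, 5), 2),
            "Low": round(min(open_, close) - rng.uniform(0, 5), 2),
            "Close": close,
            "Volume": rng.randint(1000, 5000),
        })
    return rows


def rows_to_csv(rows):
    """Serialize the rows to CSV text that can be pasted or attached."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


def capture_traceback(func, *args, **kwargs):
    """Run func; return the full traceback text if it raises, else None."""
    try:
        func(*args, **kwargs)
    except Exception:
        return traceback.format_exc()
    return None


def failing_step(rows):
    # Hypothetical stand-in for the call that fails (ft.dfs in the question).
    raise ValueError("mismatch in size of old and new data-descriptor")


rows = make_synthetic_rows()
csv_text = rows_to_csv(rows)
trace_text = capture_traceback(failing_step, rows)

print(csv_text)    # small shareable data set
print(trace_text)  # complete stack trace, including the failing line
```

In a real report, failing_step would be replaced by the shortest code that still triggers the error, and the synthetic rows would only be useful if the error still occurs with them.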

hoyeunglee commented 6 years ago

I sent it to help@featuretools.com.

kmax12 commented 6 years ago

Closing for now