alteryx / featuretools

An open source python library for automated feature engineering
https://www.featuretools.com
BSD 3-Clause "New" or "Revised" License

distributed.worker - WARNING - Could not find data #2733

Open TonyHuBD opened 4 months ago

TonyHuBD commented 4 months ago

Code Sample, a copy-pastable example to reproduce your bug.

import featuretools as ft

# `es` (the EntitySet) and `cutoff_times` are not shown in this report.
feature_matrix, feature_defs = ft.dfs(
    entityset=es,
    target_dataframe_name="acc",
    agg_primitives=["count", "sum"],
    trans_primitives=["MultiplyNumericBoolean"],
    cutoff_time=cutoff_times,
    cutoff_time_in_index=True,
    training_window="24 hour",
    max_depth=2,
    verbose=True,
    n_jobs=36,
)

Warning message

2024-05-18 18:25:09,703 - distributed.worker - WARNING - Could not find data: {'bytes-e7e617a37c90e401634a701acdeac78d': ['tcp://127.0.0.1:41237', 'tcp://127.0.0.1:43061', 'tcp://127.0.0.1:46849', 'tcp://127.0.0.1:40099', 'tcp://127.0.0.1:46353']} on workers: [] (who_has: {'EntitySet-878e4a0191d9aef3e784de699c988216': ['tcp://127.0.0.1:41237', 'tcp://127.0.0.1:43061', 'tcp://127.0.0.1:46493', 'tcp://127.0.0.1:40099', 'tcp://127.0.0.1:46353'], 'bytes-e7e617a37c90e401634a701acdeac78d': ['tcp://127.0.0.1:41237', 'tcp://127.0.0.1:43061', 'tcp://127.0.0.1:46849', 'tcp://127.0.0.1:40099', 'tcp://127.0.0.1:46353']})
2024-05-18 18:25:09,704 - distributed.scheduler - WARNING - Worker tcp://127.0.0.1:45941 failed to acquire keys: {'bytes-e7e617a37c90e401634a701acdeac78d': ('tcp://127.0.0.1:41237', 'tcp://127.0.0.1:43061', 'tcp://127.0.0.1:46849', 'tcp://127.0.0.1:40099', 'tcp://127.0.0.1:46353')}
2024-05-18 18:28:00,306 - distributed.worker - WARNING - Could not find data: {'bytes-e7e617a37c90e401634a701acdeac78d': ['tcp://127.0.0.1:45941', 'tcp://127.0.0.1:41237', 'tcp://127.0.0.1:46849', 'tcp://127.0.0.1:46353']} on workers: [] (who_has: {'bytes-e7e617a37c90e401634a701acdeac78d': ['tcp://127.0.0.1:45941', 'tcp://127.0.0.1:41237', 'tcp://127.0.0.1:46849', 'tcp://127.0.0.1:46353']})

I don't know why this happens. I can't get a result when I set n_jobs.

thehomebrewnerd commented 3 months ago

@TonyHuBD n_jobs seems to be working as expected for me. Unfortunately, the information above isn't detailed enough to know what might be going wrong. If you are not using the most recent versions of Featuretools or Dask, my first suggestion would be to upgrade to the latest released versions and try again.
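
For anyone following up on this thread, here is a minimal sketch of one way to collect the installed versions to include in a report. It uses only the standard library's importlib.metadata. The helper name installed_versions is hypothetical, and the package list is an assumption about which packages are relevant here (distributed is the package that emits the warnings above).

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages=("featuretools", "dask", "distributed")):
    """Return {package name: version string, or None if not installed}.

    Hypothetical helper for illustration; the default package list is an
    assumption about what matters for this issue.
    """
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # Package is not installed in the current environment.
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions().items():
        print(f"{name}: {ver or 'not installed'}")
```

Pasting this output alongside the code sample should make it easier to tell whether an upgrade is worth trying first.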