rapidsai / xgboost-conda

Conda recipes for xgboost

E2E.ipynb: dxgb.predict failure #10

Open sfleisch opened 6 years ago

sfleisch commented 6 years ago

I've attached the notebook (E2E.zip). Everything runs successfully through training, and a dump of the resulting trees looks reasonable. I modified the DMatrix processing code so that I could create a training set from part of the mortgage data, so instead of gpu_dfs I have gpu_train_dfs and gpu_test_dfs. Training works, but dask_xgboost.predict fails with:

```
UnboundLocalError                         Traceback (most recent call last)
in
----> 1 preds=dxgb_gpu.predict(client,bst,gpu_test_dfs)

/conda/envs/gdf/lib/python3.5/site-packages/dask_xgboost-0.1.5-py3.5.egg/dask_xgboost/core.py in predict(client, model, data)
    301                                **kwargs)
    302
--> 303     return result
    304
    305

UnboundLocalError: local variable 'result' referenced before assignment
```

Data prep code:

```python
def makeDMatrix(gpu_dfs,client):
    tmp_map = [(gpu_df, list(client.who_has(gpu_df).values())[0]) for gpu_df in gpu_dfs]
    new_map = {}
    for key, value in tmp_map:
        if value not in new_map:
            new_map[value] = [key]
        else:
            new_map[value].append(key)

    del(tmp_map)
    gpu_dfs = []
    for list_delayed in new_map.values():
        gpu_dfs.append(delayed(pygdf.concat)(list_delayed))

    del(new_map)
    gpu_dfs = [(gpu_df[['delinquency_12']], gpu_df[delayed(list)(gpu_df.columns.difference(['delinquency_12']))]) \
               for gpu_df in gpu_dfs]
    gpu_dfs = [dask.delayed(xgb.DMatrix)(gpu_df[1], gpu_df[0]) for gpu_df in gpu_dfs]
    gpu_dfs = [gpu_df.persist() for gpu_df in gpu_dfs]
    return gpu_dfs

client.run(initialize_rmm_no_pool)
print('part_count ',part_count)
gpu_train_dfs = [delayed(DataFrame.from_arrow)(gpu_df) for gpu_df in gpu_dfs[:8]]
gpu_test_dfs = [delayed(DataFrame.from_arrow)(gpu_df) for gpu_df in gpu_dfs[8:part_count]]
wait([gpu_train_dfs,gpu_test_dfs])
gpu_train_dfs=makeDMatrix(gpu_train_dfs,client)
gpu_test_dfs=makeDMatrix(gpu_test_dfs,client)
gc.collect()
wait([gpu_train_dfs,gpu_test_dfs])
labels = None
bst = dxgb_gpu.train(client, dxgb_gpu_params, gpu_train_dfs, labels,
                     num_boost_round=dxgb_gpu_params['nround'])
preds=dxgb_gpu.predict(client,bst,gpu_test_dfs)  # <-- this call fails
```

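For what it's worth, the traceback is consistent with predict only assigning `result` inside type-specific branches, so an input that matches none of them (here, a plain list of delayed DMatrix objects) falls through to `return result`. The sketch below reproduces that failure mode in isolation. It is not the dask_xgboost source: `FrameLike`, `ArrayLike`, `predict_unguarded`, and `predict_guarded` are hypothetical names used purely for illustration, and the actual branch conditions in core.py are an assumption on my part.

```python
# Minimal sketch, NOT the dask_xgboost source: FrameLike, ArrayLike and both
# predict_* functions are hypothetical stand-ins for illustration only.


class FrameLike:
    """Stand-in for a dataframe-like input type."""


class ArrayLike:
    """Stand-in for an array-like input type."""


def predict_unguarded(data):
    # 'result' is only bound inside the branches; any other input type
    # falls through and the return statement raises UnboundLocalError.
    if isinstance(data, FrameLike):
        result = "frame predictions"
    elif isinstance(data, ArrayLike):
        result = "array predictions"
    return result


def predict_guarded(data):
    # Same dispatch, but unsupported inputs fail with an explicit message.
    if isinstance(data, FrameLike):
        return "frame predictions"
    if isinstance(data, ArrayLike):
        return "array predictions"
    raise TypeError("unsupported input type: %s" % type(data).__name__)


if __name__ == "__main__":
    delayed_parts = [object(), object()]  # a plain list, like gpu_test_dfs above
    try:
        predict_unguarded(delayed_parts)
    except UnboundLocalError as exc:
        print("unguarded:", exc)
    try:
        predict_guarded(delayed_parts)
    except TypeError as exc:
        print("guarded:", exc)
```

If that reading is right, the error message is a symptom of passing an input type that predict does not handle, and an explicit `TypeError` as in the guarded variant would make that much easier to diagnose.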
swiftdiaries commented 5 years ago

+1, I'm facing the same issue with predict.