elephaint / pgbm

Probabilistic Gradient Boosting Machines
Apache License 2.0

An error with PGBM #7

Closed flippercy closed 2 years ago

flippercy commented 2 years ago

Hi @elephaint:

I got the following error when using the sklearn wrapper, PGBMRegressor:

~/.local/lib/python3.7/site-packages/pgbm/pgbm.py in _predict_tree(self, X, mu, variance, estimator)
    401 # Choose next node (based on breadth-first)
    402 condition = (nodes_predict >= node) * (predictions == 0)
--> 403 node = nodes_predict[condition].min()
    404 # Select current node information
    405 split_node = nodes_predict == node

RuntimeError: min(): Expected reduction dim to be specified for input.numel() == 0. Specify the reduction dim with the 'dim' argument.

Any insight? It is not due to the data because I used the same data to build a PGBM model successfully before. Is it due to some hyperparameters? I got this error when trying to do HPO for PGBM using FLAML (https://github.com/microsoft/FLAML) and the search space I used is:

'max_bin': {'domain': tune.loguniform(lower=32, upper=32767), 'init_value': 256, 'low_cost_init_value': 256},
'max_leaves': {'domain': tune.uniform(lower=16, upper=128), 'init_value': 64},
'n_estimators': {'domain': tune.uniform(lower=50, upper=500), 'init_value': 200, 'low_cost_init_value': 200},
'min_data_in_leaf': {'domain': tune.uniform(lower=1, upper=1000), 'init_value': 100, 'low_cost_init_value': 100},
'bagging_fraction': {'domain': tune.uniform(lower=0.6, upper=1), 'init_value': 0.7, 'low_cost_init_value': 0.7},
'feature_fraction': {'domain': tune.uniform(lower=0.5, upper=1), 'init_value': 0.9, 'low_cost_init_value': 0.9},
'learning_rate': {'domain': tune.loguniform(lower=0.001, upper=1), 'init_value': 0.1, 'low_cost_init_value': 0.1},
'min_split_gain': {'domain': tune.loguniform(lower=0.000000000001, upper=0.001), 'init_value': 0.00001, 'low_cost_init_value': 0.00001},

Thank you.

elephaint commented 2 years ago

Hi,

It looks like a genuine bug that I'd have to reproduce. Most likely a particular hyperparameter combination causes this. I'd suggest always using min_data_in_leaf > 2, although that is not necessarily related to this issue. If you can share the script and data, I'd be happy to delve further into it next week. Sorry that I can't help you further at the moment!

In the sklearn wrapper, you should use reg_lambda instead of lambda (see here: https://github.com/elephaint/pgbm/blob/65b6d0331a70520ea8b545cbd6196133c4a7c11b/src/pgbm/pgbm.py#L1179). Sorry for the inconvenience, but otherwise the name clashes with Python's lambda keyword.

Random_seed is random_state in the sklearn wrapper to adhere to sklearn naming conventions (see here: https://github.com/elephaint/pgbm/blob/65b6d0331a70520ea8b545cbd6196133c4a7c11b/src/pgbm/pgbm.py#L1182)

Best,

Olivier
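The renaming described above can be sketched as a small translation step for a parameter dictionary. This is only an illustration: the helper name to_sklearn_kwargs and the mapping table are hypothetical and not part of PGBM. The only names taken from the thread are 'lambda', 'seed', reg_lambda and random_state. The check with keyword.iskeyword shows why lambda cannot be a keyword argument in Python.

```python
import keyword

# Hypothetical mapping, built from the two renames mentioned in this thread.
SKLEARN_NAMES = {"lambda": "reg_lambda", "seed": "random_state"}


def to_sklearn_kwargs(params):
    """Rename core-style parameter names to their sklearn-wrapper names."""
    return {SKLEARN_NAMES.get(name, name): value for name, value in params.items()}


# 'lambda' is a reserved word, so f(lambda=1.0) would be a syntax error,
# while 'reg_lambda' is an ordinary identifier.
print(keyword.iskeyword("lambda"))      # True
print(keyword.iskeyword("reg_lambda"))  # False

kwargs = to_sklearn_kwargs({"lambda": 1.0, "seed": 42, "learning_rate": 0.1})
print(kwargs)  # {'reg_lambda': 1.0, 'random_state': 42, 'learning_rate': 0.1}
```

The resulting dictionary could then be unpacked into the wrapper's constructor, assuming the remaining keys match the wrapper's parameter names.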


From: flippercy
Sent: Wednesday, September 22, 2021 4:44:17 PM
To: elephaint/pgbm
Cc: Olivier Sprangers; Mention
Subject: [elephaint/pgbm] An error with PGBM (#7)

By the way, it seems that 'lambda' and 'seed' are missing in the sklearn wrapper.


flippercy commented 2 years ago

@elephaint:

Thank you for the reply; your guess is correct - after I excluded min_data_in_leaf from the search space it ran well.

What is the ideal range of this parameter? In your example it was set to 1. I got the aforementioned error with it set to 100, and my dataset has 40k+ records.

elephaint commented 2 years ago

It mostly depends on your dataset size; higher values are usually used to prevent overfitting. For problems with fewer than 10k samples, I'd probably use a value in the range of 2-10, whereas >100 is more applicable for problems with more than 100k samples. Min_data_in_leaf should be set in conjunction with max_leaves: a high max_leaves can lead to overfitting if min_data_in_leaf is small. If you build shallow trees (low max_leaves), then a relatively low min_data_in_leaf probably works fine.

Hope this helps.

Olivier
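The rule of thumb above can be written down as a small helper that proposes search bounds for min_data_in_leaf from the dataset size. This is a sketch under stated assumptions, not a PGBM or FLAML function: the name suggest_min_data_in_leaf is hypothetical, the range for datasets between 10k and 100k samples is not given in the thread and is filled in here as an assumption, and the upper bound of 1000 is borrowed from the reporter's search space.

```python
def suggest_min_data_in_leaf(n_samples):
    """Return a (low, high) search range for min_data_in_leaf.

    Only the first branch and the lower bound of the second branch come
    from the advice in this thread; everything else is an assumption.
    """
    if n_samples < 10_000:
        # From the thread: 2-10 for fewer than 10k samples.
        return (2, 10)
    if n_samples > 100_000:
        # From the thread: >100 for more than 100k samples.
        # ASSUMPTION: 1000 as upper bound, taken from the reporter's search space.
        return (100, 1000)
    # ASSUMPTION: the thread gives no range for 10k-100k samples;
    # this simply bridges the two stated ranges.
    return (10, 100)


for n in (5_000, 40_000, 200_000):
    print(n, suggest_min_data_in_leaf(n))
```

Because the thread recommends choosing min_data_in_leaf together with max_leaves, bounds like these are a starting point for a search rather than fixed values.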


From: flippercy
Sent: Thursday, September 23, 2021 10:51:18 PM
To: elephaint/pgbm
Cc: Olivier Sprangers; Mention
Subject: Re: [elephaint/pgbm] An error with PGBM (#7)


flippercy commented 2 years ago

Thank you for the insights! However, with my dataset (40k+ records), do you know why PGBM raised the error above when min_data_in_leaf = 100? It seems that nodes_predict[condition] returned nothing, so min() was called on an empty tensor.

Best,
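The failure mode described in this comment, a reduction over an empty selection, can be illustrated with a plain-Python analogue. This is not PGBM's code and does not use PyTorch: lists stand in for tensors, the function name next_node is hypothetical, and Python's built-in min() raises ValueError where the traceback above shows a RuntimeError. The sketch only shows that a mask matching no elements makes the reduction fail, and one way to guard against it.

```python
def next_node(nodes_predict, predictions, node):
    """Return the smallest pending node id >= node, or None if none is left."""
    # Same condition as in the traceback: node id at least `node`
    # and no prediction assigned yet (prediction == 0).
    candidates = [n for n, p in zip(nodes_predict, predictions)
                  if n >= node and p == 0]
    # Reducing an empty selection raises, so handle that case explicitly.
    return min(candidates) if candidates else None


nodes_predict = [0, 1, 2, 3]

# Some nodes still have no prediction: the reduction is well defined.
print(next_node(nodes_predict, [0, 0, 0, 0], node=1))  # 1

# Every node already has a prediction: the mask selects nothing.
print(next_node(nodes_predict, [5, 7, 9, 4], node=1))  # None

# Without the guard, the empty reduction fails.
try:
    min([])
except ValueError:
    print("min() of an empty selection raises")
```

Why the selection ended up empty in the reported run is not established in this thread, so the guard only shows how such a case could be detected, not its root cause.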

elephaint commented 2 years ago

Hi,

I don't know yet. I'll try to reproduce today.

elephaint commented 2 years ago

Hi,

Sorry for the late reply. I have been trying to reproduce the error but, unfortunately, I can't. I did completely rewrite the code for the PyTorch version (for speed, but also to make it more robust where possible), but I feel it needs a few more checks before I am ready to push it. I'd be eager to find out whether your problem persists with the next version, as I rewrote some of the tree-building code that might have produced the error you stumbled upon. I should be able to push it within the next 7-14 days.

elephaint commented 2 years ago

Hi,

I've released a new version (1.4) that should solve the issues you faced; the code has been completely reworked and the bugs you experienced should not be possible anymore.

flippercy commented 2 years ago

Hi @elephaint:

Thanks a lot! I will update the library and check.

Best,