JoaquinAmatRodrigo / skforecast

Time series forecasting with machine learning models
https://skforecast.org
BSD 3-Clause "New" or "Revised" License
992 stars 113 forks

I have a question about LightGBM on GPU #729

Open hwoarang09 opened 1 week ago

hwoarang09 commented 1 week ago

I'm studying with your website, https://cienciadedatos.net/documentos/py53-global-forecasting-models.html

there are more than 1000 buildings.

When I sample the buildings with

# ==============================================================================
end_train = '2016-07-31 23:59:00'
end_validation = '2016-09-30 23:59:00'
data_train = data.loc[: end_train, :].copy()
data_val = data.loc[end_train:end_validation, :].copy()
data_test = data.loc[end_validation:, :].copy()

# Sample 600 buildings for GPU
# ==============================================================================
rng = np.random.default_rng(12345)
buildings = data['building_id'].unique()
buildings_selected = rng.choice(
    buildings,
    size = 254,
    replace = False
)

With a size below 255, it works.

But with a size over 255, an error occurs:

LightGBMError: bin size 257 cannot run on GPU

So I have to sample fewer than 255 buildings...

But I want my model to learn from all the buildings.

Can you help me? Thank you!

JoaquinAmatRodrigo commented 1 week ago

Hi @hwoarang09, it seems that LightGBM only allows a maximum of 255 bins when working with categorical features on GPU. In skforecast >= 0.13.0 we introduced an encoding argument. Could you try using encoding='ordinal'?

You can see an example here: https://skforecast.org/0.12.1/user_guides/independent-multi-time-series-forecasting#series-encoding-in-multi-series

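# Import path as used in the skforecast 0.12.x documentation linked above
from sklearn.ensemble import RandomForestRegressor
from skforecast.ForecasterAutoregMultiSeries import ForecasterAutoregMultiSeries

# encoding='ordinal' encodes the series identifier as integer codes
forecaster = ForecasterAutoregMultiSeries(
                 regressor = RandomForestRegressor(random_state=123),
                 lags      = 3,
                 encoding  = 'ordinal'
             )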
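To illustrate why an ordinal encoding might sidestep the error, here is a minimal sketch in plain NumPy. It is not skforecast's actual implementation, only the general idea: each series identifier is mapped to an integer code, so the model receives one numeric column instead of a categorical feature with one category per building. The identifiers in `building_ids` are hypothetical stand-ins for `data['building_id']`, and `GPU_BIN_LIMIT` is an assumed value taken from the discussion above, not a constant read from LightGBM.

```python
import numpy as np

# Hypothetical building identifiers; the real ones come from data['building_id'].
building_ids = np.array([f"id_{i:04d}" for i in range(1000)])

# Same sampling pattern as in the question, but with 600 buildings.
rng = np.random.default_rng(12345)
selected = rng.choice(building_ids, size=600, replace=False)

# Ordinal encoding: each distinct identifier is mapped to an integer code,
# namely its position among the sorted unique identifiers.
levels, codes = np.unique(selected, return_inverse=True)

# Number of categories a categorical feature would expose to the model.
n_categories = len(levels)

# Assumed limit, taken from the discussion above (not queried from LightGBM).
GPU_BIN_LIMIT = 255

print(f"distinct series: {n_categories}")
print(f"exceeds assumed GPU bin limit: {n_categories > GPU_BIN_LIMIT}")
print(f"ordinal codes: dtype={codes.dtype}, min={codes.min()}, max={codes.max()}")
```

As a categorical feature, the 600 identifiers would need one bin each. As a plain numeric column, the integer codes can be grouped into histogram bins like any other numeric feature, which is presumably why the suggestion above avoids the limit. The trade-off is that integer codes impose an arbitrary ordering on the buildings.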