Closed: cgy-dayup closed this issue 2 years ago
When I try to test the demo with my own dataset, I run into this error:

_InactiveRpcError Traceback (most recent call last)
Are you loading the dataset in a distributed fashion? Can you show the entire script?
No, I am not. I load the CSV from a local file. Below is the full script:
import lightgbm as lgb
import sklearn.metrics
from ray import tune
from ray.tune.schedulers import ASHAScheduler
from ray.tune.integration.lightgbm import TuneReportCheckpointCallback

# df_train / df_dev and x_train / x_test come from local CSV files
# (loading code omitted).
y_train = df_train['gena_sale_qtty']
y_test = df_dev['gena_sale_qtty']
train_data = lgb.Dataset(x_train, label=y_train)
test_data = lgb.Dataset(x_test, label=y_test)
cat_feature = ['big_sale_flag', 'is_weekday', 'day_week', 'week_num',
               'mon_num', 'day_mon', 'region']

def traindata(config):
    gbm = lgb.train(
        config,
        train_set=train_data,
        valid_sets=[test_data],
        valid_names=["eval"],
        verbose_eval=False,
        categorical_feature=cat_feature,
        callbacks=[
            TuneReportCheckpointCallback({
                "l1": "eval-l1"
            })
        ])
    y_pre = gbm.predict(x_test)
    res = sklearn.metrics.mean_absolute_error(y_test, y_pre)
    tune.report(mean_accuracy=res, done=True)
if __name__ == "__main__":
    import argparse
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--server-address",
        type=str,
        default=None,
        required=False,
        help="The address of server to connect to if using "
        "Ray Client.")
    args, _ = parser.parse_known_args()

    if args.server_address:
        import ray
        ray.init(f"ray://{args.server_address}")

    config = {
        "boosting_type": "gbdt",
        "objective": "regression",
        "metric": "l1",
        "verbose": -1,
        "learning_rate": tune.grid_search([1e-4, 1e-3, 1e-2]),
        "num_boost_round": tune.grid_search([300, 500, 1000]),
        "colsample_bytree": tune.grid_search([0.7, 0.8, 1]),
        "max_depth": tune.grid_search([5, 6, 7]),
    }

    analysis = tune.run(
        traindata,
        metric="l1",
        mode="min",
        config=config,
        num_samples=10,
        scheduler=ASHAScheduler(max_t=200))

    print("Best hyperparameters found were: ", analysis.best_config)
In general, you should not be using non-local variables with Ray, and for Tune, data should be passed through tune.with_parameters. Try this:
import lightgbm as lgb
import sklearn.metrics
from ray import tune
from ray.tune.schedulers import ASHAScheduler
from ray.tune.integration.lightgbm import TuneReportCheckpointCallback

y_train = df_train['gena_sale_qtty']
y_test = df_dev['gena_sale_qtty']
cat_feature = ['big_sale_flag', 'is_weekday', 'day_week', 'week_num',
               'mon_num', 'day_mon', 'region']

def traindata(config, x_train, x_test, y_train, y_test):
    # The datasets are now built inside the trainable from the passed-in data.
    train_data = lgb.Dataset(x_train, label=y_train)
    test_data = lgb.Dataset(x_test, label=y_test)
    gbm = lgb.train(
        config,
        train_set=train_data,
        valid_sets=[test_data],
        valid_names=["eval"],
        verbose_eval=False,
        categorical_feature=cat_feature,
        callbacks=[
            TuneReportCheckpointCallback({
                "l1": "eval-l1"
            })
        ])
    y_pre = gbm.predict(x_test)
    res = sklearn.metrics.mean_absolute_error(y_test, y_pre)
    tune.report(mean_accuracy=res, done=True)
if __name__ == "__main__":
    import argparse
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--server-address",
        type=str,
        default=None,
        required=False,
        help="The address of server to connect to if using "
        "Ray Client.")
    args, _ = parser.parse_known_args()

    if args.server_address:
        import ray
        ray.init(f"ray://{args.server_address}")

    config = {
        "boosting_type": "gbdt",
        "objective": "regression",
        "metric": "l1",
        "verbose": -1,
        "learning_rate": tune.grid_search([1e-4, 1e-3, 1e-2]),
        "num_boost_round": tune.grid_search([300, 500, 1000]),
        "colsample_bytree": tune.grid_search([0.7, 0.8, 1]),
        "max_depth": tune.grid_search([5, 6, 7]),
    }

    analysis = tune.run(
        tune.with_parameters(
            traindata,
            x_train=x_train,
            x_test=x_test,
            y_train=y_train,
            y_test=y_test),
        metric="l1",
        mode="min",
        config=config,
        num_samples=10,
        scheduler=ASHAScheduler(max_t=200))

    print("Best hyperparameters found were: ", analysis.best_config)
If this doesn't work, consider using LightGBM-Ray (you are currently using plain LightGBM) and passing the data through a Ray Dataset.
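A minimal sketch of what that could look like, assuming the lightgbm_ray package is installed. The RayDMatrix, RayParams, and train names follow the LightGBM-Ray README; traindata_distributed and the num_actors value are illustrative choices, not part of the original scripts, so verify against the version you install:

from ray import tune
# Hedged sketch: distributed training with LightGBM-Ray instead of plain
# LightGBM. RayDMatrix can wrap a pandas DataFrame or a Ray Dataset, taking
# the label from the named column.
from lightgbm_ray import RayDMatrix, RayParams, train as lgb_ray_train

def traindata_distributed(config, train_df, dev_df):
    # "gena_sale_qtty" is the label column from the scripts above.
    dtrain = RayDMatrix(train_df, label="gena_sale_qtty")
    dtest = RayDMatrix(dev_df, label="gena_sale_qtty")
    evals_result = {}
    lgb_ray_train(
        config,
        dtrain,
        valid_sets=[dtest],
        valid_names=["eval"],
        evals_result=evals_result,
        # num_actors=2 is an arbitrary example; size it to your cluster.
        ray_params=RayParams(num_actors=2, cpus_per_actor=1))
    # Report the final eval l1 back to Tune.
    tune.report(l1=evals_result["eval"]["l1"][-1])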
Thank you! I solved the problem using your method. I also have a question: what is the difference between tune.report and TuneReportCheckpointCallback? I find that the mean_accuracy reported by tune.report is not reflected anywhere.