QinbinLi / DPBoost

Privacy-Preserving Gradient Boosting Decision Trees (AAAI 2020)
MIT License

Testing data evaluation code #7

Open MarianneAK opened 8 months ago

MarianneAK commented 8 months ago

Hello! First of all, thank you for sharing your code; it has been immensely helpful for experimenting with differential privacy in gradient boosted decision trees. Would it be possible for the authors to share the code that was used to evaluate the model on the test sets? I'm trying to use this code as a baseline for some experiments I'm conducting, and to reproduce the test scores reported in the paper. I've tried calling lgb.train followed by model.predict on the test set, but no matter what changes I apply (setting the number of trees to 10 as suggested in a previous issue, changing the privacy budget, ...), I still get the same scores. This is the code I used in run_exp.py (inspired by what I've seen in a previous issue):

    import lightgbm as lgb                      # DPBoost's modified LightGBM build
    import numpy as np
    from sklearn.metrics import accuracy_score
    from libsvmdata import fetch_libsvm         # assuming the libsvmdata package

    model = lgb.train(params, data, num_boost_round=n_trees)
    X_test, y_test = fetch_libsvm("a9a_test")   # a9a test split, labels in {-1, +1}

    y_pred_scaled = model.predict(X_test)

    # Thresholding at 0 assumes predict() returns raw scores for {-1, +1} labels;
    # with the standard 'binary' objective it returns probabilities, so 0.5
    # would be the right cutoff instead.
    y_pred = np.where(y_pred_scaled > 0, 1, -1)

    # accuracy_score returns accuracy, not error, so report 1 - accuracy.
    print(f"Test Error = {1 - accuracy_score(y_test, y_pred)}")

Thanks in advance for the help! @PintOfBitter