microsoft / LightGBM

A fast, distributed, high performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning tasks.
https://lightgbm.readthedocs.io/en/latest/
MIT License

[R-package] Add support for non-ASCII feature names #2983

Closed: jameslamb closed this issue 3 years ago

jameslamb commented 4 years ago

Summary

See #2976 for background. That PR restores support for non-ASCII feature names in the C++ and Python libraries. A test for the R package was added in that PR, but it was failing for reasons that weren't obvious. See, for example, https://github.com/microsoft/LightGBM/pull/2976#issuecomment-609905640.

I think that closing this issue will mean changing lgb.encode.char(). You'll know that a fix is working if the test passes after you remove the testthat::skip() call added in #2976.

jameslamb commented 4 years ago

Closed in favor of tracking this in #2302. We decided to keep all feature requests in one place.

Contributions of this feature are welcome! Please re-open this issue (or post a comment if you are not the topic starter) if you are actively working on implementing it.

jameslamb commented 3 years ago

@StrikerRUS Even though Hacktoberfest is over, I still think that issues labeled "good first issue" should be kept open, so that they're discoverable for new contributors. I think hiding them in #2302 makes them hard to find.

It's very common practice on GitHub for would-be new contributors to go to a repository and expect to be able to filter the open issues by the "good first issue" label.

StrikerRUS commented 3 years ago

@jameslamb I'm OK with that. But please add a note about that to the opening message of #2302 then.