Closed WeichenXu123 closed 2 weeks ago
LGTM for the functionality except the CI issue.
Could you run python tests/ci_build/lint_python.py --format=1 --type-check=1 --pylint=1 to check the Python format?
I can't fully understand the linter error:
xgboost/spark/core.py:1162: error: Incompatible types in assignment (expression has type "str", variable has type "Booster") [assignment]
xgboost/spark/core.py:1164: error: Argument 1 to "len" has incompatible type "Booster"; expected "Sized" [arg-type]
Found 2 errors in 1 file (checked 41 source files)
@wbo4958 any ideas?
@WeichenXu123 XGBoost's Python package uses Python type hints. In the following line:
booster = booster.save_raw("json").decode("utf-8")
booster was an xgboost.Booster object; decode("utf-8"), however, returns a string. Assigning a string to a variable typed as Booster violates static typing.
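A minimal sketch of one way to satisfy the type checker, assuming the fix is simply to bind the decoded string to a new name. FakeBooster and booster_json are hypothetical names used so the example runs without xgboost installed; this is not necessarily the change made in the PR.

```python
class FakeBooster:
    """Hypothetical stand-in for xgboost.Booster (assumption, not the real class)."""

    def save_raw(self, raw_format: str = "json") -> bytearray:
        # Return a small serialized payload as raw bytes.
        return bytearray(b'{"learner": {}}')


def serialize(booster: FakeBooster) -> str:
    # Rebinding `booster` to the decoded string is what mypy rejects:
    #     booster = booster.save_raw("json").decode("utf-8")
    # Using a separate, str-typed name keeps each variable at one type.
    booster_json: str = booster.save_raw("json").decode("utf-8")
    return booster_json
```

With a distinct name, a later len(booster_json) call also type-checks, because str is Sized.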
LGTM if the CI can pass
@trivialfis Can we make a patch release to include this fix? We have several customers facing the issue. Thanks!
@WeichenXu123 https://github.com/dmlc/xgboost/issues/10992 .
A Spark RDD can't support a single line with very long content.
To make training, saving, and loading of large models work, I split the model JSON string into chunks when collecting the model during training, and modified the saving and loading code accordingly.
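The chunking idea can be sketched roughly as follows. The function names and the chunk size are assumptions for illustration only, not the code from this PR, and the Spark side (collecting and writing the chunks) is left out.

```python
from typing import Iterable, Iterator


def split_into_chunks(text: str, chunk_size: int) -> Iterator[str]:
    """Yield consecutive slices of `text`, each at most `chunk_size` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, len(text), chunk_size):
        yield text[start:start + chunk_size]


def join_chunks(chunks: Iterable[str]) -> str:
    """Rebuild the original string; chunks must arrive in their original order."""
    return "".join(chunks)


# Hypothetical usage: a long JSON string is split, then restored on load.
model_json = '{"a": 1}' * 1000
chunks = list(split_into_chunks(model_json, 4096))
restored = join_chunks(chunks)
```

Order matters when rejoining, so any storage that does not preserve order would need an index stored alongside each chunk.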