ShifuML / shifu

An end-to-end machine learning and data mining framework on Hadoop
https://github.com/ShifuML/shifu/wiki
Apache License 2.0

OOM when ColumnConfig is very large #738

Closed Liu-Delin closed 3 years ago

Liu-Delin commented 3 years ago

We tested a very large ColumnConfig file (270 MB) containing 17K columns. Loading it takes about 1.3 GB of memory: [image]

Some of the individual column configs are very large because their binCateMap is large: [image]

This is the resulting OOM error message: [image]

Unfortunately, each UDF instance loads its own copy of the column config, so the memory cost is paid once per instance. We need to fix this by sharing a single loaded column config among the UDFs.
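To illustrate the idea of sharing one loaded column config among UDF instances, here is a minimal sketch of a per-JVM cache keyed by config path. It is not the code from the actual fix: `SharedColumnConfigCache`, `ExampleUdf`, and the `loader` parameter are hypothetical names, and `List<String>` stands in for the real list of column configs.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical per-JVM cache: at most one loaded column config list per path. */
final class SharedColumnConfigCache {
    // Keyed by config path; List<String> is a stand-in for the real column config list.
    private static final Map<String, List<String>> CACHE = new ConcurrentHashMap<>();

    private SharedColumnConfigCache() {
    }

    /**
     * Returns the cached list for the path, invoking the loader only on the first request.
     * The result is wrapped as unmodifiable because every caller receives the same instance.
     */
    static List<String> get(String path, Function<String, List<String>> loader) {
        return CACHE.computeIfAbsent(path, p -> Collections.unmodifiableList(loader.apply(p)));
    }
}

/** Hypothetical UDF that takes its column config from the shared cache instead of loading its own copy. */
final class ExampleUdf {
    private final List<String> columnConfigs;

    ExampleUdf(String path, Function<String, List<String>> loader) {
        this.columnConfigs = SharedColumnConfigCache.get(path, loader);
    }

    List<String> columnConfigs() {
        return columnConfigs;
    }
}
```

With this arrangement, two `ExampleUdf` instances built with the same path hold a reference to the same list, and the loader runs once. The sharing only applies to instances living in the same JVM, and it assumes the config is treated as read-only after loading.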

Liu-Delin commented 3 years ago

https://github.com/ShifuML/shifu/pull/740