Closed dje-dev closed 3 years ago
As suggested by borg, VRAM requirements can be reduced by lowering the backend's maximum batch size, probably with something like: --backend-opts=(backend=cuda,gpu=0,max_batch=512). The engine then also needs to know not to build batches larger than this size. A user-facing boolean setting such as "small-memory" could be introduced, which would cut various parameters in half, including this batch size and other internal data structures. It could possibly be set automatically if the GPU has little memory.
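To illustrate the idea, here is a minimal Python sketch (Ceres itself is not written in Python) of how an engine could respect a backend batch cap and how a "small-memory" switch might halve it. The names `effective_max_batch`, `split_into_batches`, and the default of 1024 are hypothetical and chosen only for illustration; they are not Ceres or LC0 APIs or actual default values.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")

# Hypothetical default cap; the real engine default may differ.
DEFAULT_MAX_BATCH = 1024


def effective_max_batch(max_batch: int = DEFAULT_MAX_BATCH,
                        small_memory: bool = False) -> int:
    """Return the batch cap to use, halved when small_memory is set."""
    if max_batch < 1:
        raise ValueError("max_batch must be at least 1")
    # Halve the cap in small-memory mode, but never drop below 1.
    return max(1, max_batch // 2) if small_memory else max_batch


def split_into_batches(positions: Sequence[T],
                       max_batch: int) -> Iterator[List[T]]:
    """Yield consecutive batches, none larger than max_batch."""
    if max_batch < 1:
        raise ValueError("max_batch must be at least 1")
    for start in range(0, len(positions), max_batch):
        yield list(positions[start:start + max_batch])
```

With `small_memory=True` and a cap of 1024, the effective cap becomes 512, and a queue of 1300 pending positions would be sent to the backend as three batches rather than one oversized batch.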
Ceres v0.90-rc1 (just released) significantly reduces GPU memory consumption, which is now nearly identical to that of LC0.
Ceres switches into a different mode at the 5000-node threshold and creates a second backend session, which doubles memory requirements (beyond those of LC0). On small GPUs with big networks this may result in a crash. At a minimum, error reporting should be improved.
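One possible shape for the improved error reporting is a pre-flight check that runs before the second session is created and fails with a descriptive message instead of crashing. The following Python sketch assumes the engine can obtain the free GPU memory and an estimate of the per-session footprint; both values, and the names `InsufficientGpuMemoryError` and `check_second_session_fits`, are hypothetical and not part of Ceres.

```python
class InsufficientGpuMemoryError(RuntimeError):
    """Raised when a new backend session is not expected to fit in VRAM."""


def check_second_session_fits(free_bytes: int, session_bytes: int) -> None:
    """Raise a descriptive error if a second backend session will not fit.

    free_bytes:    free GPU memory reported by the driver (assumed input).
    session_bytes: estimated footprint of one backend session (assumed input).
    """
    if session_bytes > free_bytes:
        mib = 1024 * 1024
        raise InsufficientGpuMemoryError(
            "Cannot create second backend session: needs about "
            f"{session_bytes // mib} MiB but only {free_bytes // mib} MiB "
            "of GPU memory is free. Try a smaller network or a lower "
            "max_batch setting."
        )
```

The point of the design is only that the failure is detected before allocation and names the likely remedies; how the estimate is actually obtained would depend on the backend.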