crate / docker-crate

Source repository for the official CrateDB Docker image
https://registry.hub.docker.com/_/crate/
Apache License 2.0
50 stars 22 forks source link

Increase default value for `CRATE_HEAP_SIZE` from 512M to 2G #211

Closed amotl closed 10 months ago

amotl commented 2 years ago

Hi there,

the default setting of 512M for CRATE_HEAP_SIZE was defined four years ago. I think it is fine to increase that value to 2G today, in order to provide users a better default experience.

On the other hand, I will also be happy to learn if you think it is a bad idea.

With kind regards, Andreas.

Reference: #208

matriv commented 2 years ago

imho it makes sense.

robd003 commented 2 years ago

What about using 25% of the memory allowed as suggested in the docs? Could we read from /sys/fs/cgroup/memory.max and use 25% of that?

It looks like this is only set if Kubernetes or Docker set a max memory value. Otherwise you'll get the string max

Setting limit via docker-compose.yml

services:
  crate:
    restart: unless-stopped
    image: crate:latest
    ports:
      - "4200:4200"
      - "5442:5432"
    volumes:
      - crate_data:/data
    deploy:
      resources:
        limits:
          memory: 2048M
mfussenegger commented 2 years ago

What about using 25% of the memory allowed as suggested in the docs? Could we read from /sys/fs/cgroup/memory.max and use 25% of that?

I think it would be technically possible, but it would have the same problem: The recommendation for a production system is different from what you'd want in a local development environment. We have to choose one over the other. Currently the defaults target the development environment. This is also the case for the tarball.

The reason we went with the development case is that most people who try out CrateDB for the first time will likely do so on a developer machine (or on cloud)

For production it is quite common that some configuration tuning is required either way. You likely already have to configure the networking. Given that you have to touch some options it is not that much of a barrier to touch some more. E.g. even if we went with the 25% - that could be the wrong value if CrateDB is not the only process running on a server. Or if you have a query-pattern that is heap intensive it could also be too low.

So the first question would be if we should change our default target to production systems instead of development systems. I don't see a big benefit in doing that.

amotl commented 2 years ago

Thank you for providing an elaborate rationale for choosing a lower value here. Do you think it would be beneficial to increase the value to 1G instead, even for development purposes where users may exercise some "more advanced" query patterns which would otherwise trigger OOM errors or circuit breaker mechanisms?

mfussenegger commented 2 years ago

What I think we should do is highlight in the documentation that users may want to change the value.

amotl commented 2 years ago

What I think we should do is highlight in the documentation that users may want to change the value.

All right, I will submit a patch to improve the documentation section at [1] correspondingly, dropping a word about the default value (512 MB) how CrateDB Docker images are shipped, and why the user would want to increase it.

[1] currently recommends to set CRATE_HEAP_SIZE typically half the container’s memory limit. On the other hand, [2] recommends A good starting point for the heap space is 25% of the systems memory. Which statement is correct?

[1] https://crate.io/docs/crate/tutorials/en/latest/containers/docker.html#memory [2] https://crate.io/docs/crate/howtos/en/latest/performance/memory.html#memory-configuration

surister commented 10 months ago

Due to https://github.com/crate/docker-crate/issues/208#issuecomment-1883314254 can we close this?

seut commented 10 months ago

Due to #208 (comment) can we close this?

Yes, thx.