Open animedbz16 opened 2 weeks ago
Hello @animedbz16
I can see two topics in your issue: one about telemetry, and one about the use of volumes for logs and most other outputs generated by the platform.
Regarding telemetry, it seems there is indeed a mistake in the documentation. It has been confirmed to me that telemetry is actually enabled by default and that there is currently no opt-out option; that said, if you're in an air-gapped infrastructure it shouldn't be an issue.
Concerning the volume usage: from what I understand of the Docker documentation about volumes, if a volume is declared in the Dockerfile but you don't pass the arguments to mount a named volume, Docker generates a unique anonymous name under which the relevant data is stored on disk.
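For illustration, this is roughly the mechanism at play (the paths here are hypothetical, not the actual ones from the OpenCTI Dockerfile):

```dockerfile
FROM node:19-alpine
WORKDIR /opt/app
# Declaring a volume without a name: any data written under /opt/app/logs
# at runtime lands in an anonymous volume created for each container
VOLUME ["/opt/app/logs"]
```

When a container is started from such an image without an explicit `-v name:/opt/app/logs` mount, `docker volume ls` shows the resulting anonymous volumes as sha256-like hashes.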
I understand that in your use case you're capturing logs directly from stdout, but there are also several ways to make use of log output written to a file, so declaring the volume covers every case here.
If you don't want to deal with random hashes, you can create a dedicated volume on your host and explicitly bind it to the path declared in the Dockerfile. The harder way is to remove the volume declaration and rebuild the image on your side, if you can sustain that.
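As a sketch, binding a named volume could look like this in a compose file (the volume name and container path are placeholders; adapt them to the paths actually declared in the OpenCTI Dockerfile):

```yaml
services:
  opencti:
    image: opencti/platform:latest
    volumes:
      # Named volume bound to the path declared with VOLUME in the image,
      # so Docker no longer creates an anonymous hash-named volume for it
      - opencti-logs:/opt/opencti/logs   # hypothetical path
volumes:
  opencti-logs:
```

With this in place, the same `opencti-logs` volume is reused across recreations of the container.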
Also, if you're using docker-compose, you might want to use `docker-compose start` instead of `up` every time if you don't want to recreate the containers from scratch (if this fits your usage, of course).
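For example, after an initial `up`, subsequent stop/start cycles reuse the same containers, and therefore the same anonymous volumes:

```shell
docker-compose up -d    # first run: creates containers and their volumes
docker-compose stop     # stops the containers without removing them
docker-compose start    # restarts the same containers; no new volumes created
```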
Hope this helps
Description
I had not realized this until looking at restructuring our deployment, but it appears that the OpenCTI platform defines Docker volumes within its base Dockerfile, which when deployed creates Docker volumes with random sha256-hash names.
The issue is that when deploying with docker compose and doing a down/up, these volumes remain on the system, detached from the old containers. When the deployment is brought up again, new volumes with different random sha256 hashes are created and attached to the new containers.
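A sketch of the behavior described, assuming a compose file using the stock image:

```shell
docker compose up -d    # creates containers plus anonymous volumes
docker compose down     # removes the containers, leaves the volumes behind
docker compose up -d    # new containers, new hash-named volumes
docker volume ls        # both generations of anonymous volumes remain

# Passing -v/--volumes removes anonymous volumes along with the containers:
docker compose down -v
```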
https://github.com/OpenCTI-Platform/opencti/blob/c1def1b7d63b1e8a8ef405753ce088a38182a933/opencti-platform/Dockerfile#L92
It's not clear to me why Docker volumes are being leveraged here to begin with.
The logs are piped to stdout so they can be viewed via docker logs. We leverage Filebeat to discover Docker containers on our systems and stream everything the containers write to stdout into Elasticsearch for retention.
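For reference, a minimal Filebeat autodiscover configuration along these lines (the hosts and paths are illustrative, not taken from our actual deployment):

```yaml
filebeat.autodiscover:
  providers:
    - type: docker
      templates:
        - config:
            # Tail the JSON log files Docker writes for each container's stdout
            - type: container
              paths:
                - /var/lib/docker/containers/${data.docker.container.id}/*.log
output.elasticsearch:
  hosts: ["elasticsearch:9200"]   # illustrative host
```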
It doesn't seem to make any sense for the application to capture and retain any of this information inside the container itself for logging.
Telemetry metrics are supposedly disabled by default according to the docs:
Similarly, it appears that the .support directory is capturing errors that are already captured in the container's stdout. It is not clear why this is stored on disk inside the container on a volume mount.
This problem is compounded when spinning up OpenCTI platform replica servers for scaling up to multiple platform nodes
Environment
Reproducible Steps
Steps to create the smallest reproducible scenario:
Expected Output
The OpenCTI platform should not create Docker volumes to store logs on a disk/volume mount inside the container.
Actual Output
The OpenCTI platform is unnecessarily storing logs inside the container on a disk/volume mount. These volumes persist across docker compose up/down cycles, and every up creates new volumes, ever increasing the number of volumes retained on the server until they are manually cleaned up.
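Until this changes, the orphaned volumes have to be cleaned up manually, e.g.:

```shell
# List volumes no longer attached to any container
docker volume ls -f dangling=true

# Remove all dangling volumes (careful: this also removes any other
# unused volumes on the host, not just the OpenCTI ones)
docker volume prune
```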
Additional information
Screenshots (optional)