OpenCTI-Platform / opencti

Open Cyber Threat Intelligence Platform
https://opencti.io

OpenCTI Platform Docker image is creating unnamed volumes for logs, telemetry, and .support directories #8583

Open animedbz16 opened 2 weeks ago

animedbz16 commented 2 weeks ago

Description

I had not realized this until I started restructuring our deployment, but the OpenCTI platform declares Docker volumes in its base Dockerfile. When the image is deployed without explicit mounts for those paths, Docker creates volumes with random sha256 hash names.

The issue is that when deploying with docker compose and doing a down / up, these volumes remain on the system, detached from the containers that were removed. When the deployment is brought up again, new volumes with different random hashes are created and attached to the new containers.

https://github.com/OpenCTI-Platform/opencti/blob/c1def1b7d63b1e8a8ef405753ce088a38182a933/opencti-platform/Dockerfile#L92

It's not clear to me why Docker volumes are being used here to begin with.

The logs are already written to stdout, so they can be viewed with docker logs. We use Filebeat to discover the Docker containers on our systems and ship everything they write to stdout into Elasticsearch for retention.

It doesn't seem to make sense for the application to also capture and retain this logging information inside the container itself.

Telemetry metrics are supposedly disabled by default according to the docs:

Similarly, it appears that the .support directory is capturing errors, which are already being captured in the docker container stdout. It is not clear why this is being stored on disk inside the container on a volume mount.

This problem is compounded when spinning up OpenCTI platform replica servers for scaling up to multiple platform nodes

Environment

  1. OS (where OpenCTI server runs): Docker
  2. OpenCTI version: 6.3.1
  3. OpenCTI client: N/A
  4. Other environment details:

Reproducible Steps

Steps to create the smallest reproducible scenario:

  1. Start OpenCTI with docker compose up
  2. OpenCTI platform creates sha256 named volumes
  3. Stop OpenCTI with docker compose down
  4. Start OpenCTI again with docker compose up
  5. OpenCTI platform creates new sha256 named volumes, and the old ones are left behind
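
The steps above can be observed from the shell. The following is a minimal sketch, not part of the original report: the function names are hypothetical, and it assumes that anonymous volumes are the ones whose names are 64-character lowercase hex strings. The filter reads names on stdin so it can be checked without a running Docker daemon.

```shell
#!/bin/sh
# Keep only names that look like anonymous volumes, i.e. a
# 64-character lowercase hex string. Named volumes are dropped.
filter_anonymous_volumes() {
  grep -E '^[0-9a-f]{64}$' || true
}

# List anonymous volumes that are no longer referenced by any container.
# Requires the docker CLI; nothing is removed.
list_orphaned_anonymous_volumes() {
  docker volume ls --quiet --filter dangling=true | filter_anonymous_volumes
}
```

Running `list_orphaned_anonymous_volumes | wc -l` before and after a down / up cycle should show the count growing. Note that `docker compose down --volumes` removes the anonymous volumes attached to the containers, but it also removes the named volumes declared in the Compose file, so it is not a safe default for a deployment that keeps data in named volumes.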

Expected Output

OpenCTI platform should not be creating docker volumes to store logs within the container on a disk / volume mount

Actual Output

OpenCTI platform is unnecessarily storing logs inside the container on a disk / volume mount. These volumes persist after docker compose down, and every subsequent up creates new ones, so the number of volumes retained on the server keeps growing until they are cleaned up manually.

Additional information

This problem is compounded when spinning up OpenCTI platform replica servers for scaling up to multiple platform nodes

troll-os commented 1 week ago

Hello @animedbz16

I can see two topics in your issue: one about telemetry, and one about the use of volumes for logs and most other outputs generated by the platform.

Regarding telemetry, it seems there's a mistake in the documentation. It has been confirmed to me that telemetry is actually enabled by default and that for now there is no opt-out option. That said, if you're in an air-gapped infrastructure it shouldn't be an issue.

Concerning the volume usage: from what I understand of the Docker documentation on volumes, if a volume is declared in the Dockerfile but you don't pass the arguments to mount a named volume, Docker gives the volume a unique name and uses it to store the relevant data on disk.

I understand that in your use case you're capturing logs directly from stdout, but there are several other ways to consume log output, including from a file, so we're trying to cover every case here.

If you don't want to deal with random hashes, you can create a dedicated volume on your hosts and explicitly bind it to the paths declared as volumes in the Dockerfile, like in this example. The hard way is to remove the volume declaration and rebuild the image on your side, if you can sustain that.

Also, if you're using docker-compose, you might want to use docker-compose start instead of up every time if you don't want to recreate the containers from scratch (if this fits your usage, of course).
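
As a rough sketch of the named-volume approach, the script below prints a Compose override that maps each declared path to a named volume. Everything specific in it is an assumption to verify: the service name `opencti`, the volume names, and the three container paths under `/opt/opencti` are illustrative, and the helper function names are hypothetical. The first helper shows how to read the paths an image actually declares.

```shell
#!/bin/sh
# Print the volume paths declared by an image (its VOLUME instruction).
# Requires the docker CLI; pass the image tag as the first argument.
declared_volumes() {
  docker image inspect --format '{{json .Config.Volumes}}' "$1"
}

# Emit a Compose override that binds named volumes to the declared paths.
# ASSUMPTION: service name and container paths are illustrative only;
# confirm them with declared_volumes before using this.
print_override() {
  cat <<'EOF'
services:
  opencti:
    volumes:
      - opencti_logs:/opt/opencti/logs
      - opencti_telemetry:/opt/opencti/telemetry
      - opencti_support:/opt/opencti/.support

volumes:
  opencti_logs:
  opencti_telemetry:
  opencti_support:
EOF
}
```

For example, `print_override > docker-compose.override.yml` followed by `docker compose up -d` should reuse the same three volumes across down / up cycles, since Compose merges an override file with that name automatically. With several platform replicas these named volumes would be shared between the replicas, which may or may not be acceptable for log files.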

Hope this helps