immich-app / immich

High performance self-hosted photo and video management solution.
https://immich.app
GNU Affero General Public License v3.0

Metric Endpoint can't be scraped #10209

Closed by simonhoellein 3 months ago

simonhoellein commented 3 months ago

The bug

I am trying to scrape the Prometheus endpoints of the immich-server with the newrelic-prometheus integration. The integration also runs as a Docker container on the same host and has access to the immich-server's network.

Scraping the metrics with curl from the CLI of the Docker host works fine, but when the Prometheus agent tries to access the metrics page (http://immich-server:8081/metrics), it gets an HTTP 403 Forbidden.

Is it possible that the metrics endpoint only allows certain clients?

The OS that Immich Server is running on

Ubuntu Server 22.04 LTS

Version of Immich Server

v1.106.2

Version of Immich Mobile App

v1.106.1

Platform with the issue

Your docker-compose.yml content

name: immich-app

services:
  immich-server:
    container_name: immich_server
    image: ghcr.io/immich-app/immich-server:release
    volumes:
      - ${UPLOAD_LOCATION}:/usr/src/app/upload
      - /etc/localtime:/etc/localtime:ro
    env_file:
      - .env
    ports:
      - 2283:3001
      - 8081:8081
      - 8082:8082
    environment:
      - NODE_ENV=production
    depends_on:
      - redis
      - database
    restart: always
    networks:
      - immich-frontend
      - immich-backend

  immich-machine-learning:
    container_name: immich_machine_learning
    image: ghcr.io/immich-app/immich-machine-learning:release
    volumes:
      - ${UPLOAD_LOCATION}:/usr/src/app/upload
      - model-cache:/cache
    env_file:
      - .env
    environment:
      - NODE_ENV=production
    restart: always
    networks:
      - immich-backend

  redis:
    container_name: immich_redis
    image: redis:6.2-alpine
    healthcheck:
      test: redis-cli ping || exit 1
    restart: always
    networks:
      - immich-backend

  database:
    container_name: immich_postgres
    image: tensorchord/pgvecto-rs:pg14-v0.2.0@sha256:90724186f0a3517cf6914295b5ab410db9ce23190a2d9d0b9dd6463e3fa298f0
    env_file:
      - .env
    environment:
      POSTGRES_PASSWORD: ${DB_PASSWORD}
      POSTGRES_USER: ${DB_USERNAME}
      POSTGRES_DB: ${DB_DATABASE_NAME}
      PG_DATA: /var/lib/postgresql/data
    volumes:
      - pgdata:/var/lib/postgresql/data
    command: ["postgres", "-c" ,"shared_preload_libraries=vectors.so", "-c", 'search_path="$$user", public, vectors', "-c", "logging_collector=on", "-c", "max_wal_size=2GB", "-c", "shared_buffers=512MB", "-c", "wal_compression=on"]
    restart: always
    networks:
      - immich-backend

volumes:
  pgdata:
  model-cache:

networks:
  immich-frontend:
    name: immich-frontend
    driver: bridge
  immich-backend:
    name: immich-backend
    driver: bridge

Your .env content

###################################################################################
# Database
###################################################################################

DB_HOSTNAME=immich_postgres
DB_USERNAME=postgres
DB_PASSWORD=postgres
DB_DATABASE_NAME=immich

# Optional Database settings:
# DB_PORT=5432

###################################################################################
# Redis
###################################################################################

REDIS_HOSTNAME=immich_redis

# Optional Redis settings:

# Note: these parameters are not automatically passed to the Redis Container
# to do so, please edit the docker-compose.yml file as well. Redis is not configured
# via environment variables, only redis.conf or the command line

# REDIS_PORT=6379
# REDIS_DBINDEX=0
# REDIS_PASSWORD=
# REDIS_SOCKET=

###################################################################################
# Upload File Location
#
# This is the location where uploaded files are stored.
###################################################################################

UPLOAD_LOCATION=/opt/immich-app/assets/upload

###################################################################################
# Typesense
###################################################################################
TYPESENSE_API_KEY=Z5RNcoUZ8dnuDH4*************************************************
TYPESENSE_ENABLED=true

###################################################################################
# Reverse Geocoding
#
# Reverse geocoding is done locally which has a small impact on memory usage
# This memory usage can be altered by changing the REVERSE_GEOCODING_PRECISION variable
# This ranges from 0-3 with 3 being the most precise
# 3 - Cities > 500 population: ~200MB RAM
# 2 - Cities > 1000 population: ~150MB RAM
# 1 - Cities > 5000 population: ~80MB RAM
# 0 - Cities > 15000 population: ~40MB RAM
####################################################################################

DISABLE_REVERSE_GEOCODING=false
REVERSE_GEOCODING_PRECISION=3

####################################################################################
# WEB - Optional
#
# Custom message on the login page, should be written in HTML form.
# For example:
# PUBLIC_LOGIN_PAGE_MESSAGE="This is a demo instance of Immich.<br><br>Email: <i>demo@demo.de</i><br>Password: <i>demo</i>"
####################################################################################

PUBLIC_LOGIN_PAGE_MESSAGE=

####################################################################################
# Alternative Service Addresses - Optional
#
# This is an advanced feature for users who may be running their immich services on different hosts.
# It will not change which address or port that services bind to within their containers, but it will change where other services look for their peers.
# Note: immich-microservices is bound to 3002, but no references are made
####################################################################################

IMMICH_WEB_URL=http://immich-web:3000
IMMICH_SERVER_URL=http://immich-server:3001
IMMICH_MACHINE_LEARNING_URL=http://immich-machine-learning:3003

####################################################################################
# Alternative API's External Address - Optional
#
# This is an advanced feature used to control the public server endpoint returned to clients during Well-known discovery.
# You should only use this if you want mobile apps to access the immich API over a custom URL. Do not include trailing slash.
# NOTE: At this time, the web app will not be affected by this setting and will continue to use the relative path: /api
# Examples: http://localhost:3001, http://immich-api.example.com, etc
####################################################################################
#IMMICH_API_URL_EXTERNAL=http://localhost:3001

IMMICH_METRICS=true

Reproduction steps

1. Deploy newrelic/nri-prometheus:latest alongside immich with access to the network of immich-server
2. Configure the targets for NewRelic:

targets:
  - description: immich-server
    urls: ["http://immich-server:8081"]
  - description: immich-microservice
    urls: ["http://immich-server:8082"]
3. Exec into the NewRelic Docker container and run wget http://immich-server:8081/metrics

Relevant log output

ping from NewRelic to immich-server:

~ $ ping immich-server
PING immich-server (172.21.0.2): 56 data bytes
64 bytes from 172.21.0.2: seq=0 ttl=42 time=0.080 ms
64 bytes from 172.21.0.2: seq=1 ttl=42 time=0.087 ms
^C
--- immich-server ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max = 0.080/0.083/0.087 ms

---

Log output from NewRelic:
root@docker-ext-2:~# docker logs 2257b8c17b91
time="2024-06-12T13:03:24Z" level=info msg="Starting New Relic's Prometheus OpenMetrics Integration version 2.21.3"
2024/06/12 13:03:24.735411 {"err":"unexpected post response code: 403: Forbidden"}
2024/06/12 13:03:32.611559 {"err":"unexpected post response code: 403: Forbidden"}
2024/06/12 13:03:55.212378 {"err":"unexpected post response code: 403: Forbidden"}
2024/06/12 13:04:02.312164 {"err":"unexpected post response code: 403: Forbidden"}
2024/06/12 13:04:24.609963 {"err":"unexpected post response code: 403: Forbidden"}
2024/06/12 13:04:32.725422 {"err":"unexpected post response code: 403: Forbidden"}
2024/06/12 13:04:55.215371 {"err":"unexpected post response code: 403: Forbidden"}
2024/06/12 13:05:02.014428 {"err":"unexpected post response code: 403: Forbidden"}
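
A minimal sketch, not part of the original report, for tallying the distinct error messages in log output like the above. It assumes each error line ends in a JSON object with an "err" key, as the lines shown here do; the function name `count_errors` and the `SAMPLE` excerpt are illustrative only.

```python
import json
from collections import Counter


def count_errors(log_text: str) -> Counter:
    """Count the distinct "err" messages found in the given log text."""
    counts = Counter()
    for line in log_text.splitlines():
        start = line.find("{")
        if start == -1:
            continue  # no JSON payload on this line, e.g. the startup banner
        try:
            payload = json.loads(line[start:])
        except json.JSONDecodeError:
            continue  # the brace did not start a valid JSON object
        if isinstance(payload, dict) and "err" in payload:
            counts[payload["err"]] += 1
    return counts


# Excerpt of the log lines quoted in this issue.
SAMPLE = '''time="2024-06-12T13:03:24Z" level=info msg="Starting New Relic's Prometheus OpenMetrics Integration version 2.21.3"
2024/06/12 13:03:24.735411 {"err":"unexpected post response code: 403: Forbidden"}
2024/06/12 13:03:32.611559 {"err":"unexpected post response code: 403: Forbidden"}'''

if __name__ == "__main__":
    for message, n in count_errors(SAMPLE).items():
        print(f"{n}x {message}")
```

Grouping by message makes it easier to see that every failure carries the same text, including the word "post".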

Additional information

No response

bo0tzz commented 3 months ago

I can't reproduce this. Any chance the connection is not direct and there's something else returning the 403? Can you get the metrics endpoint from a browser?

simonhoellein commented 3 months ago

Hi @bo0tzz,

thanks for your reply! The connection between the two containers should be direct as they are in the same docker network:

root@docker-ext-2:~# docker network inspect immich-frontend
[
    {
        "Name": "immich-frontend",
        "Id": "b512518366f4d53eb3a54294a0f4f6456d4bb53a7f652a095ca32f3b2dfb0dc2",
        "Created": "2024-06-12T15:43:47.62235189Z",
        "Scope": "local",
        "Driver": "bridge",
        "EnableIPv6": false,
        "IPAM": {
            "Driver": "default",
            "Options": null,
            "Config": [
                {
                    "Subnet": "172.30.0.0/16",
                    "Gateway": "172.30.0.1"
                }
            ]
        },
        "Internal": false,
        "Attachable": false,
        "Ingress": false,
        "ConfigFrom": {
            "Network": ""
        },
        "ConfigOnly": false,
        "Containers": {
            "969853fd79336a5a630c630fef5a92022d4c2a8c4ad901b3d68434b350107e43": {
                "Name": "nri-prometheus",
                "EndpointID": "7ae6eef03a20909e5dd37f72cd5fb60e2ec113a43cf753f6297234ac8569f520",
                "MacAddress": "02:42:ac:1e:00:03",
                "IPv4Address": "172.30.0.3/16",
                "IPv6Address": ""
            },
            "caf714457b80113fa74b5499273503de84f139303e27e57575105a06718f0965": {
                "Name": "immich_server",
                "EndpointID": "0174d0405fe1bdeb50cbcf7e77ec1b56b7eb626444d70b8a0f87511e21c324fd",
                "MacAddress": "02:42:ac:1e:00:02",
                "IPv4Address": "172.30.0.2/16",
                "IPv6Address": ""
            }
        },
        "Options": {},
        "Labels": {
            "com.docker.compose.network": "immich-frontend",
            "com.docker.compose.project": "immich-app",
            "com.docker.compose.version": "2.27.1"
        }
    }
]
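
As a rough sketch (not from the original comment), the inspect output above can be reduced to a name-to-address map to confirm that both containers sit on the same network. The helper name `containers_on_network` is hypothetical, and the sample below is trimmed to the fields the helper reads, with placeholder container IDs.

```python
import json


def containers_on_network(inspect_json: str) -> dict:
    """Map container name to IPv4 address from `docker network inspect` JSON."""
    result = {}
    for network in json.loads(inspect_json):
        # "Containers" may be missing or null when nothing is attached.
        for info in (network.get("Containers") or {}).values():
            result[info["Name"]] = info["IPv4Address"]
    return result


# Trimmed sample; "id-a" and "id-b" stand in for the real container IDs.
SAMPLE = """
[
  {
    "Name": "immich-frontend",
    "Containers": {
      "id-a": {"Name": "nri-prometheus", "IPv4Address": "172.30.0.3/16"},
      "id-b": {"Name": "immich_server", "IPv4Address": "172.30.0.2/16"}
    }
  }
]
"""

if __name__ == "__main__":
    for name, address in containers_on_network(SAMPLE).items():
        print(f"{name}: {address}")
```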

I've deployed another container with the same network config as the NewRelic container and did a traceroute. They should have a direct connection...

root@023d97f0997a:/# traceroute immich-server
traceroute to immich-server (172.30.0.2), 30 hops max, 60 byte packets
 1  immich_server.immich-frontend (172.30.0.2)  0.149 ms  0.028 ms  0.023 ms

When I curl from the other container in the same network, I get:

root@023d97f0997a:/# curl http://immich-server:8081/metrics
# HELP target_info Target metadata
# TYPE target_info gauge
target_info{service_name="immich",telemetry_sdk_language="nodejs",telemetry_sdk_name="opentelemetry",telemetry_sdk_version="1.24.1",service_version="1.106.2",process_pid="19",process_executable_name="immich-api",process_executable_path="/usr/local/bin/node",process_command_args="[\"/usr/local/bin/node\",\"/usr/src/app/dist/workers/api.js\"]",process_runtime_version="20.14.0",process_runtime_name="nodejs",process_runtime_description="Node.js",process_command="/usr/src/app/dist/workers/api.js",process_owner="root",host_name="caf714457b80",host_arch="arm64"} 1
# HELP http_server_duration Measures the duration of inbound HTTP requests.
# UNIT http_server_duration ms
# TYPE http_server_duration histogram

[...]

When I curl from the docker host to the forwarded endpoint I get:

root@docker-ext-2:~# curl http://localhost:8081/metrics
# HELP target_info Target metadata
# TYPE target_info gauge
target_info{service_name="immich",telemetry_sdk_language="nodejs",telemetry_sdk_name="opentelemetry",telemetry_sdk_version="1.24.1",service_version="1.106.2",process_pid="19",process_executable_name="immich-api",process_executable_path="/usr/local/bin/node",process_command_args="[\"/usr/local/bin/node\",\"/usr/src/app/dist/workers/api.js\"]",process_runtime_version="20.14.0",process_runtime_name="nodejs",process_runtime_description="Node.js",process_command="/usr/src/app/dist/workers/api.js",process_owner="root",host_name="caf714457b80",host_arch="arm64"} 1
# HELP http_server_duration Measures the duration of inbound HTTP requests.
# UNIT http_server_duration ms
# TYPE http_server_duration histogram

[...]

With curl, I don't have any problems accessing the metrics, but for some reason the NewRelic Browser Agent does. Could it be that only certain browser agents are allowed to access the /metrics path from the immich-server?
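
One way to test that hypothesis is to request the same URL with several User-Agent headers and compare the status codes. The following is a minimal sketch using only the Python standard library; the function names are hypothetical, and which User-Agent string nri-prometheus actually sends is not known from this thread.

```python
import urllib.error
import urllib.request


def status_for_user_agent(url: str, user_agent: str, timeout: float = 5.0) -> int:
    """Send a GET request with the given User-Agent and return the HTTP status."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx responses; the status code is still available.
        return err.code


def compare_user_agents(url: str, user_agents: list) -> dict:
    """Return a mapping of User-Agent string to the status code it received."""
    return {ua: status_for_user_agent(url, ua) for ua in user_agents}
```

Run from a container on the immich-frontend network, a call such as `compare_user_agents("http://immich-server:8081/metrics", ["curl/8.5.0", "Mozilla/5.0"])` (example header values, chosen arbitrarily) would show whether the response depends on the client header at all.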

simonhoellein commented 3 months ago

For your convenience, this is the compose file I've used to run the NewRelic container:

name: nri-prometheus

x-default-logging: &logging
  driver: "json-file"
  options:
    max-size: "5m"
    max-file: "2"
    tag: "{{.Name}}"

services:
  nri-prometheus:
    container_name: nri-prometheus
    image: newrelic/nri-prometheus:latest
    volumes:
      - ./nri-config.yaml:/config.yaml
    environment:
      - LICENSE_KEY="eu01xxabe***************************"
    networks:
      - monitoring
      - immich-frontend
    restart: always

networks:
  monitoring:
    name: newrelic-monitoring
    driver: bridge

  immich-frontend:
    external: true

bo0tzz commented 3 months ago

Could it be that only certain browser agents are allowed to access the /metrics path from the immich-server?

Almost certainly not.

I have no idea why this might be happening, but since curl is working fine, this seems more like a newrelic issue 🤔

bo0tzz commented 3 months ago

Looking at the error again:

unexpected post response code

This sounds like newrelic is trying to send a POST request? That definitely won't work.
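
To check which request the 403 belongs to, one could send both a GET and a POST to the same URL and compare the results. This is a hedged sketch with hypothetical function names; how the Immich metrics endpoint responds to a POST is not established in this thread, so the output is only a data point for further debugging.

```python
import urllib.error
import urllib.request


def status_for_method(url: str, method: str, timeout: float = 5.0) -> int:
    """Send a request with the given HTTP method and return the status code."""
    # An empty body is attached to POST so a Content-Length header is sent.
    data = b"" if method == "POST" else None
    request = urllib.request.Request(url, data=data, method=method)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # Error responses (4xx/5xx) are raised; report their code instead.
        return err.code


def compare_methods(url: str) -> dict:
    """Return the status codes for a GET and a POST against the same URL."""
    return {method: status_for_method(url, method) for method in ("GET", "POST")}
```

If `compare_methods("http://immich-server:8081/metrics")` reports a successful GET and no 403 for either method, the 403 in the log likely comes from a different request than the scrape itself.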

simonhoellein commented 3 months ago

Maybe I need to investigate further. Thanks for your help and patience!