searx / searx-docker

Create a searx instance using Docker
GNU Affero General Public License v3.0

my slimmed down docker-compose.yml #20

Open travnewmatic opened 4 years ago

travnewmatic commented 4 years ago

i'm no Docker expert, but i think your docker-compose.yml and scripts setup is a bit too... messy. here is my docker-compose.yml:

version: '3.7'

services:

  filtron:
    image: dalf/filtron
    restart: always
    networks:
      - default
      - traefik_default
    command: -listen 0.0.0.0:4040 -api 0.0.0.0:4041 -target searx:8080
    volumes:
      - ./rules.json:/etc/filtron/rules.json:rw
    read_only: true
    cap_drop:
      - ALL
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.filtron.rule=Host(`searx.travnewmatic.com`)"
      - "traefik.http.routers.filtron.entrypoints=websecure"
      - "traefik.http.routers.filtron.tls.certresolver=mytlschallenge"
      - "traefik.docker.network=traefik_default"
      - "traefik.http.services.filtron.loadbalancer.server.port=4040"

  searx:
    image: searx/searx:latest
    restart: always
    depends_on:
      - filtron
      - morty
    networks:
      - tor-hidden-service_default
      - default
    command: -f
    volumes:
      - ./searx:/etc/searx:rw
    cap_drop:
      - ALL
    cap_add:
      - CHOWN
      - SETGID
      - SETUID
      - DAC_OVERRIDE

  morty:
    image: dalf/morty
    restart: always
    networks:
      - default
      - traefik_default
    command: -listen 0.0.0.0:3000 -timeout 6 -ipv6
    environment:
      - MORTY_KEY=e63wDcpbTfRQj51Utf2BK5Isd6wDh/dD4Z46bmMUno6N
    read_only: true
    cap_drop:
      - ALL
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.morty.rule=Host(`searx.travnewmatic.com`) && PathPrefix(`/morty`)"
      - "traefik.http.routers.morty.entrypoints=websecure"
      - "traefik.http.routers.morty.tls.certresolver=mytlschallenge"
      - "traefik.docker.network=traefik_default"
      - "traefik.http.services.morty.loadbalancer.server.port=3000"

networks:
  traefik_default:
    external: true
  tor-hidden-service_default:
    external: true

it's mostly based on your setup, but i've made a few changes.

are there any glaring issues with what i've done?

thanks for all the hard work on this awesome project, i use it every day!

dalf commented 4 years ago

Sorry for the delay.

Can you explain what is messy?

Some notes about the current docker-compose.yaml :

using traefik instead of caddy

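Thank you. Some notes:

* do you have file similar to https://ssl-config.mozilla.org/#server=traefik&server-version=2.1.2&config=modern ? How do you start traefik ?

* I don't see the configuration about the headers: X-XSS-Protection, Content-Security-Policy, Access-Control, etc... is it in the labels of your docker-compose or traefik.toml ?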

searx is sending queries through a separate TOR container

Which Docker image are you using? What are the parameters? What is the content of your settings.yml? The difficulty is letting people decide whether requests are routed through Tor or not: I guess most/some/few people don't want to use Tor.

watchtower upgrades things when a new image is available (sometimes blowing away my config, but thats why i use a repo)

I have seen this kind of blow-away on another project, but I haven't tried it on this one. Thank you for the feedback. Is the blow-away random, or does it hit a specific container or case? Perhaps watchtower would be more stable if it scanned only the searx container (I haven't tried).
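The "scan only the searx container" idea could look roughly like the sketch below. It is untested against this project and assumes the containrrr/watchtower image with its label-based opt-in (the `WATCHTOWER_LABEL_ENABLE` variable plus the `com.centurylinklabs.watchtower.enable` label); the `watchtower` service name is made up for illustration.

```yaml
services:

  watchtower:
    image: containrrr/watchtower
    restart: always
    volumes:
      # watchtower needs the Docker socket to inspect and restart containers
      - /var/run/docker.sock:/var/run/docker.sock
    environment:
      # only update containers that opt in through the label below
      - WATCHTOWER_LABEL_ENABLE=true

  searx:
    image: searx/searx:latest
    labels:
      # opt this container in to automatic updates
      - "com.centurylinklabs.watchtower.enable=true"

  # filtron and morty carry no such label, so watchtower would leave them alone
```

This only narrows which containers get replaced; it does not by itself protect the files under ./searx.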

unixfox commented 4 years ago

Traefik is really a pain to use and configure; it would be better to stay on Caddy.

travnewmatic commented 4 years ago

actually, i've switched to using kubernetes!

https://github.com/travnewmatic/k8s-searx/blob/master/searx.yaml

do you have file similar to https://ssl-config.mozilla.org/#server=traefik&server-version=2.1.2&config=modern ? How do you start traefik ?

i'd have to look for my traefik config. it's not anything crazy. it's almost identical to https://docs.traefik.io/v2.0/user-guides/docker-compose/acme-tls/

I don't see the configuration about the headers: X-XSS-Protection, Content-Security-Policy, Access-Control, etc... is it in the labels of your docker-compose or traefik.toml ?

this is likely something i've borked. i'll need to take another look at it.
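For readers wondering where such headers would go in a label-based Traefik v2 setup: one option is a `headers` middleware declared on the filtron service and attached to its router. A hedged sketch follows; `searx-headers` is a made-up middleware name, the values are illustrative rather than a copy of the Caddyfile, and a Content-Security-Policy is left out because its intended value is not given here.

```yaml
services:

  filtron:
    image: dalf/filtron
    labels:
      - "traefik.enable=true"
      # attach the middleware defined below to the existing filtron router
      - "traefik.http.routers.filtron.middlewares=searx-headers"
      # Strict-Transport-Security
      - "traefik.http.middlewares.searx-headers.headers.stsSeconds=31536000"
      - "traefik.http.middlewares.searx-headers.headers.stsIncludeSubdomains=true"
      - "traefik.http.middlewares.searx-headers.headers.stsPreload=true"
      # X-XSS-Protection: 1; mode=block
      - "traefik.http.middlewares.searx-headers.headers.browserXssFilter=true"
      # X-Content-Type-Options: nosniff
      - "traefik.http.middlewares.searx-headers.headers.contentTypeNosniff=true"
      # X-Frame-Options
      - "traefik.http.middlewares.searx-headers.headers.customFrameOptionsValue=SAMEORIGIN"
      - "traefik.http.middlewares.searx-headers.headers.referrerPolicy=no-referrer"
      # arbitrary response headers go under customResponseHeaders
      - "traefik.http.middlewares.searx-headers.headers.customResponseHeaders.X-Robots-Tag=noindex, noarchive, nofollow"
```

The same middleware name could be listed on the morty router as well, or morty could get its own middleware if it needs different headers.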

Which docker image are you using ? What are the parameters ? What is the content of your settings.yml ?

always the latest stable image from docker hub. the content of settings.yml is what's in the configmap section of that long k8s yaml thing (i know it's hard to read with all the escapes and newlines that kubectl sticks in there).

i've seen all the warnings about using watchtower + latest, so i can't really complain too much. one idea would be to make the searx settings file volume/mount read-only, though i'm not sure that would fix it. since i've switched to kubernetes, the searx settings configmap is ALWAYS read-only. updates to the image have no effect on the contents of that configmap.
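In docker-compose terms, the read-only idea would be a one-word change on the volume line. This is only a sketch: whether the searx image's startup script tolerates a read-only /etc/searx (for example when settings.yml does not exist yet) is not verified here.

```yaml
services:

  searx:
    image: searx/searx:latest
    volumes:
      # ":ro" instead of ":rw": the container can read the settings but not rewrite them
      - ./searx:/etc/searx:ro
```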

before i began migrating all my stuff from docker into kubernetes i didn't have any issues with traefik. i didn't really take full advantage of it. i didn't mess with dynamic config. i just had the main traefik docker-compose thing, and then stuck labels on all the things i needed it to proxy for. i started using it right around the time v2 came out so the documentation wasn't 100%, but it's been around long enough that i think most basic things should be easy to deal with.

again, thanks for all the hard work on this. i really love it and i use it daily. it's also been a great learning tool for my sysadmin chops. setting it up helped me to learn more about docker/docker-compose and more recently kubernetes. truly one of my favorite projects out there :)

dalf commented 4 years ago

For reference, there is another traefik configuration:

travnewmatic commented 4 years ago

  • I don't see the configuration about the headers: X-XSS-Protection, Content-Security-Policy, Access-Control, etc... is it in the labels of your docker-compose or traefik.toml ?

this was always a thorn in my side. so i banged on it a bit and came up with this:

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
    nginx.ingress.kubernetes.io/configuration-snippet: |
      more_set_headers "Strict-Transport-Security: max-age=31536000; includeSubDomains; preload";
      more_set_headers "X-XSS-Protection: 1; mode=block";
      more_set_headers "X-Content-Type-Options: nosniff";
      more_set_headers "X-Frame-Options: SAMEORIGIN";
      more_set_headers "Feature-Policy: accelerometer 'none';ambient-light-sensor 'none'; autoplay 'none';camera 'none';encrypted-media 'none';focus-without-user-activation 'none'; geolocation 'none';gyroscope 'none';magnetometer 'none';microphone 'none';midi 'none';payment 'none';picture-in-picture 'none'; speaker 'none';sync-xhr 'none';usb 'none';vr 'none'";
      more_set_headers "Referrer-Policy: no-referrer";
      more_set_headers "X-Robots-Tag: noindex, noarchive, nofollow";
  name: searx
  namespace: searx
spec:
  rules:
  - host: searx.travnewmatic.com
    http:
      paths:
      - backend:
          serviceName: morty
          servicePort: 3000
        path: /morty/
        pathType: ImplementationSpecific
      - backend:
          serviceName: searx
          servicePort: 8080
        path: /
        pathType: ImplementationSpecific
  tls:
  - hosts:
    - searx.travnewmatic.com
    secretName: searx

yields

tnewman@keelung1:~/searx/k8s-searx$ curl -I https://searx.travnewmatic.com/
HTTP/2 200 
server: nginx/1.19.2
date: Thu, 29 Oct 2020 09:45:51 GMT
content-type: text/html; charset=utf-8
content-length: 10542
vary: Accept-Encoding
server-timing: total;dur=3.867
strict-transport-security: max-age=31536000; includeSubDomains; preload
x-xss-protection: 1; mode=block
x-content-type-options: nosniff
x-frame-options: SAMEORIGIN
feature-policy: accelerometer 'none';ambient-light-sensor 'none'; autoplay 'none';camera 'none';encrypted-media 'none';focus-without-user-activation 'none'; geolocation 'none';gyroscope 'none';magnetometer 'none';microphone 'none';midi 'none';payment 'none';picture-in-picture 'none'; speaker 'none';sync-xhr 'none';usb 'none';vr 'none'
referrer-policy: no-referrer
x-robots-tag: noindex, noarchive, nofollow

This isn't a 1:1 of your https://github.com/searx/searx-docker/blob/master/Caddyfile headers, but it's closer than before. I think to get it to be a perfect copy of your Caddyfile, i'd need to create individual ingresses for searx and morty.

What are your thoughts on the headers that i have included? Any glaring omissions?
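The "individual ingresses" idea could be sketched as below: a separate Ingress for morty carrying its own header snippet, leaving a second Ingress for searx with path `/`. This assumes ingress-nginx and a cluster that serves `networking.k8s.io/v1` (Kubernetes 1.19 or later), where the backend is written as `service.name` / `service.port.number` instead of `serviceName` / `servicePort`. The single header shown is a placeholder; which headers morty actually needs is not stated here.

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: morty
  namespace: searx
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
    # placeholder: list only the headers that should apply to /morty/
    nginx.ingress.kubernetes.io/configuration-snippet: |
      more_set_headers "X-Robots-Tag: noindex, noarchive, nofollow";
spec:
  rules:
  - host: searx.travnewmatic.com
    http:
      paths:
      - path: /morty/
        pathType: Prefix
        backend:
          service:
            name: morty
            port:
              number: 3000
  tls:
  - hosts:
    - searx.travnewmatic.com
    secretName: searx
```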

tomlawesome commented 3 years ago

using traefik instead of caddy

* do you have file similar to https://ssl-config.mozilla.org/#server=traefik&server-version=2.1.2&config=modern ? How do you start traefik ?

* I don't see the configuration about the headers: X-XSS-Protection, Content-Security-Policy, Access-Control, etc... is it in the labels of your docker-compose or traefik.toml ?

Hi,

Some replies:

middlewares.yml

http:
  middlewares:
    middlewares-rate-limit:
      rateLimit:
        average: 100
        burst: 50
    searx-ratelimit:
      rateLimit:
        average: 0
        burst: 0
        period: 1s
    middlewares-secure-headers:
      headers:
        accessControlAllowMethods:
          - GET
          - OPTIONS
          - PUT
        accessControlAllowOriginList:
          - 'https://bar.example.com'
          - 'https://foo.example.com'
          - 'https://misc.example.com'
          - 'https://sub1.me'
          - 'https://sub2.example.com'
          - 'https://sub3.example.com'
        accessControlAllowHeaders:
          - 'Content-Type'
          - 'X-Api-Key'
        addVaryHeader: true
        accessControlMaxAge: 100
        hostsProxyHeaders:
          - "X-Forwarded-Host"
          - "Cf-Connecting-Ip"
        sslRedirect: true
        stsSeconds: 63072000
        stsIncludeSubdomains: true
        stsPreload: true
        forceSTSHeader: true
        # frameDeny gets overwritten by customFrameOptionsValue
        frameDeny: true
        # customFrameOptionsValue overrides frameDeny to allow specific URLs to load frames
        customFrameOptionsValue: "allow-from https://prox.example.com"
        contentTypeNosniff: true
        browserXssFilter: true
        # sslForceHost: true
        # sslHost: ""
        referrerPolicy: "strict-origin-when-cross-origin"
        contentSecurityPolicy: "frame-ancestors *.example.com;block-all-mixed-content;default-src *.example.com;script-src *.example.com 'unsafe-eval' 'unsafe-inline';style-src *  'unsafe-inline' *.example.com;frame-src *.example.com;img-src 'self' *.example.com data: 'unsafe-eval';font-src fonts.gstatic.com  *.example.com;connect-src 'self' *.example.com;manifest-src 'self';base-uri 'self';form-action 'self' *.example.com;media-src *.example.com;"
        featurePolicy: "camera 'none'; geolocation 'none'; microphone 'none'; payment 'none'; usb 'none'; vr 'none';"
        customResponseHeaders:
          X-Robots-Tag: "none,noarchive,nosnippet,notranslate,noimageindex"
          server: ""
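A middleware defined in a file like this is referenced from container labels with the `@file` suffix. A hedged sketch of how the filtron service from earlier in the thread might use it, assuming middlewares.yml sits in a directory watched by Traefik's file provider and that the TLS entry point is named `https`; the router name `searx` and the host are placeholders:

```yaml
services:

  filtron:
    image: dalf/filtron
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.searx.rule=Host(`searx.example.com`)"
      - "traefik.http.routers.searx.entrypoints=https"
      # "@file" tells Traefik the middleware comes from the file provider, not from labels
      - "traefik.http.routers.searx.middlewares=middlewares-secure-headers@file"
      - "traefik.http.services.searx.loadbalancer.server.port=4040"
```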

traefik excerpt from docker-compose.yml

  traefik:
    image: traefik:2.4
    container_name: traefik
    restart: always
    read_only: true
    networks:
      - YourNetwork
    ports:
      - 80:80
      - 443:443
    security_opt:
      - no-new-privileges:true
    volumes:
      - /etc/localtime:/etc/localtime:ro
      - $DOCKERDIR/traefik/traefik.yml:/traefik.yml:ro
      - $DOCKERDIR/traefik/rules:/rules:ro
      - $DOCKERDIR/traefik/acme.json:/acme.json:rw
      - $DOCKERDIR/traefik/logs:/logs:rw
      - $DOCKERDIR/traefik/plugins-storage:/plugins-storage:rw
      - $DOCKERDIR/certs:/certs:ro
    environment:
      - TZ=$TZ
    profiles:
      - core
    labels:
      - traefik.enable=true
      - traefik.http.routers.traefik-all.entrypoints=https,http
      - traefik.http.routers.traefik-all.rule=Host(`dash.$DOMAINNAME`)
      - traefik.http.routers.traefik-all.service=api@internal
  #    - traefik.http.routers.traefik-all.tls.certresolver=zerossl     # One at a time because using the same key 'traefik.http.routers.traefik-all.tls.certresolver'
  #    - traefik.http.routers.traefik-all.tls.certresolver=dns        # One at a time because using the same key 'traefik.http.routers.traefik-all.tls.certresolver'
      - traefik.http.routers.http-catchall.entrypoints=http
      - traefik.http.routers.http-catchall.rule=HostRegexp(`{host:.+}`)
      - traefik.http.routers.http-catchall.middlewares=redirect-to-https
      - traefik.http.middlewares.redirect-to-https.redirectscheme.scheme=https

traefik.yml

global:
  sendAnonymousUsage: false

api:
  dashboard: true

entryPoints:
  http:
    address: ":80"
    forwardedHeaders:
      trustedIPs:
        - 127.0.0.1/32
        - 192.168.0.0/16
        - 172.16.0.0/12
        - 10.0.0.0/8
        - 103.21.244.0/22
        - 103.22.200.0/22
        - 103.31.4.0/22
        - 141.101.64.0/18
        - 108.162.192.0/18
        - 190.93.240.0/20
        - 188.114.96.0/20
        - 197.234.240.0/22
        - 198.41.128.0/17
        - 162.158.0.0/15
        - 104.16.0.0/12
        - 172.64.0.0/13
        - 131.0.72.0/22
  https:
    address: ":443"
    http:
      tls: {}
    forwardedHeaders:
      trustedIPs:
        - <For me, Cloudflare IP>
    proxyProtocol:
      trustedIPs:
        - <For me, Cloudflare IP>

tls:
  stores:
    default: {}

providers:
  docker:
    watch: true
    endpoint: "tcp://<your socket proxy service name>:2375"
    exposedByDefault: false
    defaultRule: "Host(`{{ index .Labels \"com.docker.compose.service\"}}.example.com`)"
    # the docker provider's network option takes a single network name, not a list
    network: YourNetwork

  file:
    directory: /rules/
    watch: true

certificatesResolvers:
  http:
    acme:
      email: sysadmin@example.com
      storage: acme.json
      httpChallenge:
        entryPoint: http

  zerossl:
    acme:
      caServer: https://acme.zerossl.com/v2/DV90
      email: sysadmin@example.com
      storage: acme.json
      dnsChallenge:
        provider: cloudflare
        resolvers:
          - "1.1.1.1:53"
          - "1.0.0.1:53"
      eab:
        kid: <key>
        hmacEncoded: <key>

  dns:
    acme:
      email: sysadmin@example.com
      storage: acme.json
      dnsChallenge:
        provider: cloudflare
        resolvers:
          - "1.1.1.1:53"
          - "1.0.0.1:53"

Thought I'd drop these in, in case you fancied having a go with Traefik and to give some examples of some of the things you asked about. It's entirely possible to do it all with Traefik.

The argument over whether Caddy or Traefik is better is somewhat of a moot point I think; whichever you're most familiar with is going to be the best for you. I would say that more people are familiar with Traefik though, and Traefik has excellent Docker integration.