0xERR0R / blocky

Fast and lightweight DNS proxy as ad-blocker for local network with many features
https://0xERR0R.github.io/blocky/
Apache License 2.0

docker service name resolution #293

Closed kwitsch closed 2 years ago

kwitsch commented 3 years ago

Feature request:

Resolve bootstrap DNS queries which aren't FQDNs against 127.0.0.11 instead of the bootstrapDns configuration (enabled via configuration, disabled by default).

Background:

In docker stacks, service names are resolved through 127.0.0.11. Currently, 127.0.0.11 has to be set as the bootstrap DNS configuration if a service is used as an upstream resolver. With this workaround, the DNS resolution for list downloads is done through the host DNS configuration, which is a problem if blocky itself is set as the DNS resolver for the host. The proposed feature would eliminate this problem, as service names typically aren't FQDNs.

Possible FQDN check function: link
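As a rough sketch of such a check (an assumption for illustration, not blocky's actual implementation or the linked function), the heuristic could be that docker service names contain no dots, while FQDNs do:

```go
package main

import (
	"fmt"
	"strings"
)

// isFQDN is a sketch, not blocky's real code: it treats any name with an
// interior dot (e.g. "lists.example.com") as fully qualified, while bare
// docker service names like "unbound" are not.
func isFQDN(name string) bool {
	name = strings.TrimSuffix(name, ".") // ignore a trailing root dot
	return strings.Contains(name, ".")
}

func main() {
	fmt.Println(isFQDN("unbound"))           // false -> resolve via 127.0.0.11
	fmt.Println(isFQDN("lists.example.com")) // true  -> resolve via bootstrapDns
}
```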

0xERR0R commented 3 years ago

Can you share your test setup (e.g. docker-compose file), please?

kwitsch commented 3 years ago

Stripped-down version of my test stack: example-stack.yml.txt config.yml.txt

Tested with `docker stack deploy -c example-stack.yml DNS`

0xERR0R commented 3 years ago

Did you try to set the "dns" option in your stack/compose file and remove "bootstrapDns" from config.yml? I think it will work, and this is the preferred solution for a docker setup. In this case docker will manage your DNS configuration and internal docker name resolution will also work:

version: "3.8"

services:
  unbound:
    image: kwitsch/unbound:monthly
  blocky:
    image: spx01/blocky
    environment:
      TZ: Europe/Berlin
    ports:
      - "53:53/tcp"
      - "53:53/udp"
    volumes:
      - ./config.yml:/app/config.yml
    dns: 1.1.1.1
kwitsch commented 3 years ago

This would possibly work but isn't the solution I'm trying to achieve. Sorry, my text seems a bit confusing 😅

I'd like to resolve the upstream resolver through the docker name resolution and afterwards resolve all queries (for the lists) through the default upstream resolver.

My goal is to resolve as many queries as possible locally. But I do see the problem with my proposed solution.

Maybe I should download the lists through a separate service and blocky gets its lists from this service 🤔

0xERR0R commented 3 years ago

Ok, I understand what your goal is. I think this is not a common problem, but it could be useful under certain circumstances. Maybe we need something like a "bootstrap resolver with fallback":

  1. If "bootstrapDns" is defined, try it. If domain resolution is not possible, try 2.
  2. Try the system resolver (in your case docker). If domain resolution is not possible, try 3.
  3. Try the default upstream resolvers, if any are defined with only an IP address.
kwitsch commented 3 years ago

It seems a bit complicated for a very limited use case.

A more generic approach would be to use the bootstrapDns resolver to start up the default upstream resolvers and use those for all of blocky's DNS queries afterwards.

Is there a specific reason not to resolve the queries for the list downloads through the default upstream group?

0xERR0R commented 3 years ago

What if someone uses internal infrastructure to serve blocklists (e.g. a docker container)? In this case you need an internal resolver to resolve the download URL, and upstream resolvers are typically for external URLs only.

I'm trying to find a generic approach without many exceptions.

kwitsch commented 3 years ago

Ok, I get the problem (I'm planning on doing just that later 😅)

I would have tried to combat this problem with a conditional block like:

conditional:
  mapping:
    blacklist: 127.0.0.11
kwitsch commented 2 years ago

I built a solution appropriate for my desired configuration: https://github.com/kwitsch/docker-adblocklists (sorry, it isn't documented yet)

Example Stack: completestack-compose.yml.txt config.yml.txt

It works well but has a very rough startup. As blocky tries to resolve the service name before the service is actually started (and registered in the docker DNS resolver), it crashes on the first try.

As the current version of blocky has no mechanism to accommodate this problem, I'd like to propose three changes:

  1. list_cache.go line 203: make the maximum number of attempts configurable
  2. list_cache.go line 219: make the time between retries configurable
  3. list_cache.go line 208: reduce the attempt counter (attempt--) else if resp.StatusCode == http.StatusTooEarly

The first two proposals would make the blocky startup more customizable. The third proposal would let a list accumulator inform blocky that it is already up but still needs some time to provide a list.

0xERR0R commented 2 years ago

Hi,

Regarding your suggestions: it is a good idea to make some constants configurable to make blocky more robust. I'm not sure the HTTP status "Too Early" is commonly used to indicate that a server is starting...

With this configuration you can mitigate the effects, but this is IMHO not the right solution. I would recommend using a healthcheck with "depends_on" constructs from docker-compose (I'm not familiar with docker stack; I hope it works the same way as compose):
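A minimal sketch of that approach (the image name and healthcheck command below are assumptions for illustration, not taken from this thread):

```yaml
version: "3.8"

services:
  adblocklists:
    image: kwitsch/docker-adblocklists  # assumed image name for the list service
    healthcheck:
      # assumed probe; check whatever URL actually serves the compiled list
      test: ["CMD", "wget", "-q", "--spider", "http://localhost/blacklist"]
      interval: 5s
      timeout: 3s
      retries: 5
  blocky:
    image: spx01/blocky
    depends_on:
      adblocklists:
        condition: service_healthy
```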

With these changes, the blocky container will be started only once the "adblocklists" service is up and running.

kwitsch commented 2 years ago

Hi, Thanks for the feedback.

Suggestions 1 and 2 would help me a lot.

I get that the status TooEarly is not commonly used. I was trying to stop blocky from starting up with empty lists. Currently it will start even if lists are configured but all download attempts fail (0 entries). Maybe a configurable "minimal entries threshold" could solve this (the service would be terminated if the startup download delivers fewer entries).

The suggested solution utilizing depends_on won't work in a swarm environment, so it won't be suitable for me. Regarding service startup, docker compose and docker swarm mode differ a bit. By comparing both I got the impression that in swarm mode the service name resolution becomes available a bit later than in compose. To solve this problem I retry getting an address in a loop in my DnsCacheRefresh service.

0xERR0R commented 2 years ago

I created #307 and #308 as follow up issues.

Regarding the "minimal entries threshold": what if you define 5 links and only 1 is successful? In this case you will get some entries in your cache, but there is still a problem with some lists. Maybe it would be a case for external monitoring? For example, we could introduce a new prometheus metric (like "failed_download_count") and it would be possible to monitor this with Grafana or Alertmanager?

kwitsch commented 2 years ago

Thanks for the enhancement tickets 🙂

A "minimal entries threshold" isn't working with multiple entries... 🤔 failed_download_count would be great for monitoring purposes but wouldn't really solve the problem (maybe I'm overlooking something).

Maybe a configuration option like "initial download required" which overrides the attempts counter from #308 during the startup download? Like retrying until all lists could be loaded.

0xERR0R commented 2 years ago

Well, the idea behind "failed_download_count" is that a failed download is not a standard case. If an error occurs while downloading, something is wrong. For your case (blocky starts simultaneously with another service which provides blacklists) you can get rid of the problem by tuning the retry parameters.

I like the idea of "initial download required" and I think it should be the default behavior (so we don't need this parameter): blocky should fail to start if the initial list download fails after the configured retries. In most cases it will be a configuration error (network issue, wrong URL etc.). On a later refresh it is ok if a download fails: the cache will still have the entries from the previous load and the prometheus metric should indicate the error.

kwitsch commented 2 years ago

I like the idea of defining this as the default behavior. For most use cases a non-blocking blocky is indeed a fault state and should be treated as such.

0xERR0R commented 2 years ago

I also created #309 and #310 to track the tasks separately. I think this issue can be closed?