FORTH-ICS-INSPIRE / artemis

ARTEMIS: Real-Time Detection and Automatic Mitigation for BGP Prefix Hijacking. This is the main ARTEMIS repository that composes artemis-frontend, artemis-backend, artemis-monitor and other needed containers.
BSD 3-Clause "New" or "Revised" License

incompatible with podman-compose - alpine musl gethostbyname and gethostbyaddr only return single ip #672

Open coredump17 opened 1 year ago

coredump17 commented 1 year ago

Describe the bug

Podman 4 has DNS and 'should' be compatible with Docker. Running podman-compose up -d produces the error below for the configuration and autostarter containers.

    File "core/autostarter.py", line 269, in core.autostarter.AutostarterWorker.check_and_control_services
    File "/usr/local/lib/pyenv/versions/3.6.8/lib/python3.6/site-packages/artemis_utils/service.py", line 37, in service_to_ips_and_replicas_in_compose
      replica_name = "{}-{}".format(base_service_name, replica_name_match.group(1))
    AttributeError: 'NoneType' object has no attribute 'group'

Upon investigation, this appears to be caused by Podman publishing multiple PTR records for each IP (container name, service name, container ID). Alpine, which uses musl, returns only one host or IP per call. That is unexpected, because it means you would only ever find one replica. In Podman's case the name returned by the PTR lookup never matches the expected pattern, because the IP has multiple entries and only the container ID comes back. See below:

    bash-4.4# dig +short configuration
    10.89.0.97

    bash-4.4# dig +short -x 10.89.0.97
    artemis_configuration_1.
    configuration.
    5cf0e12b159a.    <---- this will always be returned

The name returned by the PTR lookup above does not match the regex below, which is used in the 'service_to_ips_and_replicas_in_compose' call, so replica_name_match is None.

    replica_name_match = re.match(
        r"^"
        + re.escape(COMPOSE_PROJECT_NAME)
        + r"[_|-]"
        + re.escape(base_service_name)
        + r"[_|-](\d+)",
        replica_host_by_addr,
    )
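
To illustrate the mismatch, here is a minimal sketch that applies the same pattern to the three PTR names from the dig output. The helper `match_replica` is hypothetical, and `COMPOSE_PROJECT_NAME = "artemis"` is an assumed value inferred from the `artemis_configuration_1.` record, not taken from the actual deployment.

```python
import re

# Assumed value for illustration; the real one comes from the deployment.
COMPOSE_PROJECT_NAME = "artemis"


def match_replica(base_service_name, ptr_name):
    """Return the replica index captured from a PTR name, or None if it does not match."""
    m = re.match(
        r"^"
        + re.escape(COMPOSE_PROJECT_NAME)
        + r"[_|-]"
        + re.escape(base_service_name)
        + r"[_|-](\d+)",
        ptr_name,
    )
    return m.group(1) if m else None


# The three PTR records shown in the dig output above.
ptr_names = ["artemis_configuration_1.", "configuration.", "5cf0e12b159a."]
results = {name: match_replica("configuration", name) for name in ptr_names}
for name, index in results.items():
    print(name, "->", index)
```

Only the compose-style name yields a replica index. If the resolver hands back just the container ID, the match is None and calling `.group(1)` on it raises the AttributeError from the traceback.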

If DNS calls made through the socket module return only one value each time, I believe this limits the platform to one replica.
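
As a rough sketch of why a single-valued answer is a problem, the snippet below gathers every name that `socket.gethostbyaddr` reports for an IP. `all_names_for_ip`, its `lookup` parameter, and the stub `single_name_lookup` are hypothetical and not part of artemis_utils. The stub only imitates the behavior reported in this issue so the example runs without a container network.

```python
import socket


def all_names_for_ip(ip, lookup=socket.gethostbyaddr):
    """Return the primary hostname plus any aliases that `lookup` reports for `ip`."""
    try:
        hostname, aliases, _addresses = lookup(ip)
    except OSError:
        # socket.herror and socket.gaierror are both subclasses of OSError
        return []
    return [hostname] + list(aliases)


def single_name_lookup(ip):
    """Stub imitating the reported behavior: one name, no aliases."""
    return ("5cf0e12b159a", [], [ip])


names = all_names_for_ip("10.89.0.97", lookup=single_name_lookup)
print(names)
```

With only one name available per lookup, no amount of filtering on the caller's side can recover the compose-style name, which is why the workaround queries DNS directly.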

Affected Component(s)

To Reproduce

Steps to reproduce the behavior:

  1. Use CentOS/Rocky/Red Hat/Alma 8+
  2. yum install podman
  3. Enable the EPEL repo
  4. yum install podman-compose
  5. Pull the repo, cd artemis ; podman-compose up -d
  6. The UI starts, but the admin/system page errors
  7. podman logs configuration or podman logs autostarter shows the error

Expected behavior

I would expect a DNS lookup to return all entries, not just the last one. I believe this affects replica sets if you wish to run more than one container of the same kind. It also makes the solution incompatible with Podman.

Screenshots

    File "core/autostarter.py", line 269, in core.autostarter.AutostarterWorker.check_and_control_services
    File "/usr/local/lib/pyenv/versions/3.6.8/lib/python3.6/site-packages/artemis_utils/service.py", line 37, in service_to_ips_and_replicas_in_compose
      replica_name = "{}-{}".format(base_service_name, replica_name_match.group(1))
    AttributeError: 'NoneType' object has no attribute 'group'

System (please complete the following information):

Additional context

Alpine uses musl, which behaves differently from glibc and appears to return only one DNS entry per lookup.

coredump17 commented 1 year ago

To work around the above issue, we can use dnspython to perform the DNS lookups.

Copy /usr/local/lib/pyenv/versions/3.6.8/lib/python3.6/site-packages/artemis_utils/service.py locally. The code below replaces 'service_to_ips_and_replicas_in_compose' with a dnspython-based definition.

artemis_utils will then require the dnspython module.

    # dnspython imports; re, log, get_local_ip and COMPOSE_PROJECT_NAME
    # are already available in service.py
    import dns.message
    import dns.query
    import dns.resolver
    import dns.reversename


    def resolve_dns(query: str, rtype: str = "A", timeout: int = 2) -> list:
        """Ask each configured nameserver in turn; return every record of the first answer."""
        rtype = rtype.upper()
        resolver = dns.resolver.Resolver()
        if rtype == "PTR":
            # turn the IP address into its reverse-lookup name
            query = dns.reversename.from_address(query)
        msg = dns.message.make_query(query, rtype)
        for dns_server in resolver.nameservers:
            try:
                resp = dns.query.udp(msg, dns_server, timeout=timeout)
                if resp.answer:
                    return [str(a) for a in resp.answer[0]]
            except Exception as e:
                log.error("DNS query to %s failed: %s", dns_server, e)
        return []


    def service_to_ips_and_replicas_in_compose(own_service_name, base_service_name):
        local_ip = get_local_ip()
        address_regexp = re.compile(r"\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}")
        service_to_ips_and_replicas_set = set()
        addr_infos = resolve_dns(base_service_name)
        for replica_ip in addr_infos:
            # skip answers that are not IPv4 addresses
            if not address_regexp.fullmatch(replica_ip):
                continue
            # do not include yourself
            if base_service_name == own_service_name and replica_ip == local_ip:
                continue
            # check every PTR record, not only the first one returned
            ptr = resolve_dns(replica_ip, "PTR")
            for replica_host_by_addr in ptr:
                replica_name_match = re.match(
                    r"^"
                    + re.escape(COMPOSE_PROJECT_NAME)
                    + r"[_|-]"
                    + re.escape(base_service_name)
                    + r"[_|-](\d+)",
                    replica_host_by_addr,
                )
                if replica_name_match:
                    replica_name = "{}-{}".format(
                        base_service_name, replica_name_match.group(1)
                    )
                    service_to_ips_and_replicas_set.add((replica_name, replica_ip))
        return service_to_ips_and_replicas_set
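
For readers unfamiliar with PTR lookups, the reverse-lookup step can be illustrated with the standard library alone. This is only a sketch of the name that ends up being queried. `ptr_query_name` is a hypothetical helper, and the workaround itself relies on `dns.reversename.from_address` for this conversion.

```python
import ipaddress


def ptr_query_name(ip):
    """Build the fully qualified reverse-lookup name for an IP address."""
    # reverse_pointer omits the trailing root dot, so append it here
    return ipaddress.ip_address(ip).reverse_pointer + "."


name = ptr_query_name("10.89.0.97")
print(name)
```

A PTR query for this name is what returns the list of container names that the regex is then matched against.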

vkotronis commented 1 year ago

@mooneym17 thanks for reporting this! Could you issue a PR with the potential fix? We will also update Artemis utils accordingly with the fix. Thank you!