clong / DetectionLab

Automate the creation of a lab environment complete with security tooling and logging best practices
MIT License

logger host cannot resolve DNS queries #791

Closed arnydo closed 2 years ago

arnydo commented 2 years ago

Please verify that you are building from an updated Master branch before filing an issue.

Description of the issue:

While building the logger host, I'm running into the following error message that causes the build to stop:

logger: ERROR: '~apt-fast' user or team does not exist.

After some digging, I determined that the logger host cannot resolve DNS queries. I confirmed that the logger host does have direct IP connectivity and can ping 8.8.8.8, etc. However, any attempt to resolve a DNS name returns a SERVFAIL error.

Here are the current networking configs:

vagrant@logger:~$ cat /etc/netplan/01-netcfg.yaml
network:
  version: 2
  ethernets:
    eth0:
      dhcp4: true
    eth1:
      dhcp4: true
      nameservers:
        addresses: [8.8.8.8,8.8.4.4]
vagrant@logger:~$ cat /etc/netplan/50-vagrant.yaml
---
network:
  version: 2
  renderer: networkd
  ethernets:
    eth1:
      addresses:
      - 192.168.56.105/24
vagrant@logger:~$ cat /etc/resolv.conf
# This file is managed by man:systemd-resolved(8). Do not edit.
#
# This is a dynamic resolv.conf file for connecting local clients to the
# internal DNS stub resolver of systemd-resolved. This file lists all
# configured search domains.
#
# Run "resolvectl status" to see details about the uplink DNS servers
# currently in use.
#
# Third party programs must not access this file directly, but only through the
# symlink at /etc/resolv.conf. To manage man:resolv.conf(5) in a different way,
# replace this symlink by a static file or a different symlink.
#
# See man:systemd-resolved.service(8) for details about the supported modes of
# operation for /etc/resolv.conf.

nameserver 8.8.8.8
options edns0 trust-ad

arnydo commented 2 years ago

Noticed this is the same error as seen previously via #765.

The network adapters seem to be working fine; the problem is that no DNS queries succeed.

I can ping 8.8.8.8, but dig google.com @8.8.8.8 results in SERVFAIL.

This is why the dependencies can't be installed.
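The symptom described above (ping succeeds, dig reports SERVFAIL) can be checked from a script by reading the status field of dig's header line. The following is a minimal sketch, not something taken from the DetectionLab repository; the function name dig_status and the sample header string are assumptions made for illustration.

```shell
#!/bin/sh
# Extract the response code (NOERROR, SERVFAIL, ...) from dig's header line.
# Reads dig output on stdin; prints nothing if no header line is present.
dig_status() {
  sed -n 's/.*->>HEADER<<-.*status: \([A-Z]*\),.*/\1/p' | head -n 1
}

# Hypothetical sample of the header line dig prints for a failed lookup.
sample=';; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 4242'
printf '%s\n' "$sample" | dig_status

# On the logger host itself (needs network access), the same check would be:
#   ping -c 1 8.8.8.8
#   dig google.com @8.8.8.8 | dig_status
```

A successful ping combined with a SERVFAIL status from dig matches what was reported here: IP connectivity is fine and only name resolution fails.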

rangerrkm commented 2 years ago

Hi arnydo,

I'm having the same issue. All the other VMs built without errors, but logger fails with the same error you are reporting.

My VMnet 2 Network is set as follows:

192.168.56.1 255.255.255.0 no GW

I even tried setting 8.8.8.8 as my Preferred DNS Server, but that did not help.

So, I will keep digging for a solution.

clong commented 2 years ago

Oh man, I've seen this problem appear sporadically and I've never been able to figure out the root cause. Let me see if I can reproduce it.

clong commented 2 years ago

No dice on reproducing.

[screenshot]

Here's my settings:

[three screenshots]

rangerrkm commented 2 years ago

Thanks, Chris, for looking into this issue. I can rule out a VirtualBox vs. VMware Workstation 16 difference: I'm using VMware Workstation 16 and seeing the same issue. I even deployed a test Ubuntu 20.04 VM, and it came right up with both a NAT and a VMnet2 NIC, with DNS resolution working. For as long as I have been using Detection Lab, since the beginning, I have not had issues with the logger VM; it has always been the Windows VMs.

I will keep digging on it and if I do figure it out will let everyone know.

arnydo commented 2 years ago

Thanks for the response, @clong! It looks like my net settings match yours. This is indeed odd.

arnydo commented 2 years ago

Here is another interesting observation. I can resolve DNS queries via resolvectl but not dig. Not sure what this indicates...

[screenshot]
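One general point that may help interpret this: resolvectl sends its queries through the systemd-resolved service, while dig reads the nameserver entries in /etc/resolv.conf and queries those servers itself. The two tools can therefore take different paths on the same host. The following is a minimal sketch for seeing which path a given resolv.conf implies; the function names resolv_nameservers and resolv_mode are made up for illustration, and it does not by itself explain the SERVFAIL.

```shell
#!/bin/sh
# Print the nameserver entries of a resolv.conf-style file, one per line.
resolv_nameservers() {
  awk '$1 == "nameserver" { print $2 }' "$1"
}

# Print "stub" if the file points at the systemd-resolved stub listener
# (127.0.0.53), otherwise "direct" (queries go straight to the listed servers).
resolv_mode() {
  if resolv_nameservers "$1" | grep -qx '127\.0\.0\.53'; then
    echo stub
  else
    echo direct
  fi
}

# Demonstrate on a scratch file that mirrors the logger's rewritten resolv.conf.
scratch=$(mktemp)
printf 'nameserver 8.8.8.8\noptions edns0 trust-ad\n' > "$scratch"
resolv_mode "$scratch"
rm -f "$scratch"

# On the logger host, compare the two paths (needs network access):
#   resolv_mode /etc/resolv.conf
#   resolvectl query google.com
#   dig google.com
```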

rangerrkm commented 2 years ago

That is very interesting, might be another lead into the problem.


rangerrkm commented 2 years ago

Hello all,

I have been working on this issue and think I have found the problem. Chris, can you confirm what I have found? Interestingly, not everyone seems to be having the issue, so here is what I have done as a workaround for now.

I tested with another Bento VM built from the same box image (bento/ubuntu-20.04). Here is its resolv.conf file without modification:

nameserver 127.0.0.53
options edns0 trust-ad
search localdomain

======================================================================

Here is what the resolv.conf file looked like after being deployed with the logger_bootstrap.sh script.

nameserver 8.8.8.8
options edns0 trust-ad

======================================================================

Workaround:

In the logger bootstrap script (logger_bootstrap.sh), I commented out lines 11-13:

if grep '127.0.0.53' /etc/resolv.conf; then
  sed -i 's/nameserver 127.0.0.53/nameserver 8.8.8.8/g' /etc/resolv.conf && chattr +i /etc/resolv.conf
fi
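To see what those bootstrap lines do without touching a real system, the same substitution can be run against a scratch copy. This is a sketch only: the function name rewrite_stub is hypothetical, the chattr +i step is left out because it needs root, and the in-place edit is done with a temporary file instead of sed -i so that it also runs where sed -i behaves differently.

```shell
#!/bin/sh
# Apply the bootstrap script's substitution to the file given as $1.
# The chattr +i step from the original lines is intentionally omitted.
rewrite_stub() {
  if grep -q '127.0.0.53' "$1"; then
    sed 's/nameserver 127.0.0.53/nameserver 8.8.8.8/g' "$1" > "$1.new" &&
      mv "$1.new" "$1"
  fi
}

# Scratch copy with the unmodified contents shown earlier in this comment.
scratch=$(mktemp)
printf 'nameserver 127.0.0.53\noptions edns0 trust-ad\nsearch localdomain\n' > "$scratch"
rewrite_stub "$scratch"
cat "$scratch"
rm -f "$scratch"

# On an already-provisioned logger, the immutable flag set by chattr +i can be
# inspected and cleared (as root) with:
#   lsattr /etc/resolv.conf
#   chattr -i /etc/resolv.conf
```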

Here is what /etc/resolv.conf looked like after modifying the logger_bootstrap.sh script:

nameserver 127.0.0.53
options edns0 trust-ad
search localdomain

I also needed to change the DNS server setting from 8.8.8.8 to 127.0.0.53 on line 7 of the Vagrantfile:

cfg.vm.network :private_network, ip: "192.168.56.105", gateway: "192.168.56.1", dns: "127.0.0.53"

Note: I recommend destroying the logger VM and starting over, as --provision did not seem to work for me.

vagrant destroy logger

And it works without issue for me. If arnydo can test and confirm what I have done as a workaround that would be great.

Please let me know if you were successful or not.

Thanks,

arnydo commented 2 years ago

Good morning! I can confirm that the changes you mentioned worked for me as well @rangerrkm.

rangerrkm commented 2 years ago

Glad to hear it. Chris, if you need me to explain more of my troubleshooting process, please let me know. I suspect it might have been a change in the Bento Ubuntu box or some Netplan change.

Thanks,

clong commented 2 years ago

Hey folks, I keep meaning to comment on this thread and forgetting. @arnydo and @rangerrkm - thanks for the debugging you did here! It basically looks like I tried to work around systemd-resolved incorrectly, and that's what can cause issues intermittently.

It seems I have two paths to take:

  1. Remove systemd-resolved and fall back to using good ol' resolv.conf as the source of truth
  2. Make sure systemd-resolved gets configured correctly

I'll try to make a decision and put in a fix in the coming days/weeks.
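For reference, the second path (configuring systemd-resolved rather than removing it) is usually done with a drop-in file under /etc/systemd/resolved.conf.d/ that sets DNS= in the [Resolve] section. The following is a minimal sketch of that idea, not what DetectionLab ended up doing; the function name write_resolved_dropin and the file name 10-detectionlab.conf are hypothetical.

```shell
#!/bin/sh
# Write a resolved.conf drop-in under the given root directory.
# Pass "" as the root to target the real /etc (requires root privileges).
# The drop-in file name is a made-up example.
write_resolved_dropin() {
  dropin_dir="$1/etc/systemd/resolved.conf.d"
  mkdir -p "$dropin_dir"
  printf '[Resolve]\nDNS=8.8.8.8 8.8.4.4\n' > "$dropin_dir/10-detectionlab.conf"
}

# Demonstrate against a scratch directory rather than the live system.
demo_root=$(mktemp -d)
write_resolved_dropin "$demo_root"
cat "$demo_root/etc/systemd/resolved.conf.d/10-detectionlab.conf"
rm -rf "$demo_root"

# On a real host (as root), the change takes effect after a restart:
#   write_resolved_dropin ""
#   systemctl restart systemd-resolved
#   resolvectl status
```

With this approach /etc/resolv.conf keeps pointing at 127.0.0.53 and the upstream servers are set inside systemd-resolved instead of being edited into resolv.conf.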

waltster commented 2 years ago

Hello all,

I was having a huge headache trying to get DetectionLab installed on Ubuntu 20.04.4 LTS today with the same issue and my logger instance would not get past the "Running apt-get update" stage. By changing the instances of "8.8.8.8" to "192.168.56.102", I was able to completely avoid this behavior and initialize the VM.

I am not sure if this helps. I am going to give this installation a few days and verify that everything is working correctly.
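The comment above does not say which files contain "8.8.8.8", so a recursive search is one way to locate them before changing anything. This is a small sketch; the function name find_hardcoded_dns is made up, and the scratch directory only stands in for a real checkout.

```shell
#!/bin/sh
# List every line under directory $1 that contains the literal string 8.8.8.8,
# prefixed with file name and line number.
find_hardcoded_dns() {
  grep -rnF '8.8.8.8' "$1"
}

# Demonstrate on a scratch directory standing in for a DetectionLab checkout.
demo=$(mktemp -d)
printf 'dns: "8.8.8.8"\n' > "$demo/Vagrantfile"
find_hardcoded_dns "$demo"
rm -rf "$demo"

# In a real checkout this would be run from the repository root, for example:
#   find_hardcoded_dns Vagrant/
```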

rangerrkm commented 2 years ago

Hi Chris,

Let me know if I can help more on this issue.


clong commented 2 years ago

@rangerrkm I think the problem should be solved. I resorted to just disabling systemd-resolved and using /etc/resolv.conf now: https://github.com/clong/DetectionLab/blob/master/Vagrant/logger_bootstrap.sh#L12-L18

clong commented 2 years ago

I'm going to mark this as closed since I think the commit referenced in the last comment fixed this. Whatever is in /etc/resolv.conf is now the source of truth for DNS, and systemd-resolved is disabled because I don't like useless layers of abstraction.
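The linked bootstrap lines are not reproduced in this thread, so the following is only a sketch of the general approach described (disable systemd-resolved, then make /etc/resolv.conf a plain static file). The function name write_static_resolv and the nameserver values are assumptions, not the contents of the actual commit.

```shell
#!/bin/sh
# Replace the resolv.conf at path $1 (possibly a symlink to the systemd-resolved
# stub file) with a static file listing the remaining arguments as nameservers.
write_static_resolv() {
  target=$1
  shift
  rm -f "$target"   # removes the symlink itself, not the file it points to
  for ns in "$@"; do
    printf 'nameserver %s\n' "$ns"
  done > "$target"
}

# Demonstrate on a scratch directory that mimics the symlinked layout.
demo=$(mktemp -d)
printf 'nameserver 127.0.0.53\n' > "$demo/stub-resolv.conf"
ln -s "$demo/stub-resolv.conf" "$demo/resolv.conf"
write_static_resolv "$demo/resolv.conf" 8.8.8.8 8.8.4.4
cat "$demo/resolv.conf"
rm -rf "$demo"

# On a real host (as root), the equivalent steps would be:
#   systemctl disable --now systemd-resolved
#   write_static_resolv /etc/resolv.conf 8.8.8.8 8.8.4.4
```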