linuxserver / docker-unifi-network-application

GNU General Public License v3.0

[BUG] Inform URL rejected #77

Closed mulbc closed 1 day ago

mulbc commented 3 months ago

Current Behavior

When trying to adopt an existing AP, I try to use the manual adoption method via:

set-inform http://{IP}:8080/inform

After issuing the command I execute info and see that the URL is Rejected:

UAP-AC-Pro-Gen2-BZ.6.6.55# info

Model:       UAP-AC-Pro-Gen2
Version:     6.6.55.15189
MAC Address: 78:8a:20:[xxx]
IP Address:  172.[xxx]
Hostname:    UAP-AC-Pro-Gen2
Uptime:      7914085 seconds
NTP:         Synchronized

Status:      Server Reject (http://[redacted]:8080/inform)
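
A minimal sketch, assuming the `info` output keeps the "Key: value" layout shown above (other firmware versions may differ), of how the status and inform URL could be pulled out in a script. The helper names `parse_info` and `parse_status` are hypothetical and not part of any UniFi tooling.

```python
import re
from typing import Dict, Optional, Tuple


def parse_info(text: str) -> Dict[str, str]:
    """Parse the output of `info` on a UniFi device shell into a dict."""
    fields = {}
    for line in text.splitlines():
        # Split on the first colon only, so MAC addresses and URLs stay intact.
        key, sep, value = line.partition(":")
        if sep and key.strip():
            fields[key.strip()] = value.strip()
    return fields


def parse_status(status: str) -> Tuple[str, Optional[str]]:
    """Split 'Server Reject (http://...)' into (state, inform_url)."""
    match = re.fullmatch(r"(.+?)\s*\((.+)\)", status)
    if match:
        return match.group(1), match.group(2)
    return status, None


sample = """Model:       UAP-AC-Pro-Gen2
Version:     6.6.55.15189
Status:      Server Reject (http://[redacted]:8080/inform)"""

state, url = parse_status(parse_info(sample)["Status"])
print(state, url)
```

This only reports what the device says. It does not explain why the controller rejected the inform.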

When I try curl, I get an HTTP 400 error:

UAP-AC-Pro-Gen2-BZ.6.6.55# curl -v http://{IP}:8080/inform
> GET /inform HTTP/1.1
> Host: 172.16.1.103:8080
> User-Agent: curl/8.4.0
> Accept: */*
> 
< HTTP/1.1 400 
< Content-Length: 0
< Date: Wed, 20 Mar 2024 16:28:48 GMT
< Connection: close
< 

I observe the same behavior on another computer and from within the container.

Needless to say, the AP is not discovered in the UniFi console.

Expected Behavior

The status does not show rejected and the console will allow adoption of the AP

Steps To Reproduce

  1. Default install - no config needed

Environment

- OS: Fedora 39
- How docker service was installed: Podman from Fedora

CPU architecture

x86-64

Docker creation

podman run \
  -d \
  --name=unifi-network-application \
  -e PUID=1000 \
  -e PGID=1000 \
  -e TZ=Europe/Berlin \
  -e MONGO_USER=unifi \
  -e MONGO_PASS=[redacted] \
  -e MONGO_HOST=unifi-db \
  -e MONGO_PORT=27017 \
  -e MONGO_DBNAME=unifi \
  -p 8443:8443 \
  -p 3478:3478/udp \
  -p 10001:10001/udp \
  -p 8080:8080 \
  -p 1900:1900/udp `#optional` \
  -p 8843:8843 `#optional` \
  -p 8880:8880 `#optional` \
  -p 6789:6789 `#optional` \
  -p 5514:5514/udp `#optional` \
  -v /srv/unifi:/config:Z \
  --restart unless-stopped \
  --label "io.containers.autoupdate=registry" \
  --network cblum_default \
  lscr.io/linuxserver/unifi-network-application:latest

Container logs

#  podman logs unifi-network-application                         
[migrations] started
[migrations] no migrations found
───────────────────────────────────────

      ██╗     ███████╗██╗ ██████╗
      ██║     ██╔════╝██║██╔═══██╗
      ██║     ███████╗██║██║   ██║
      ██║     ╚════██║██║██║   ██║
      ███████╗███████║██║╚██████╔╝
      ╚══════╝╚══════╝╚═╝ ╚═════╝

   Brought to you by linuxserver.io
───────────────────────────────────────

To support LSIO projects visit:
https://www.linuxserver.io/donate/

───────────────────────────────────────
GID/UID
───────────────────────────────────────

User UID:    1000
User GID:    1000
───────────────────────────────────────

[custom-init] No custom files found, skipping...
[ls.io-init] done.

github-actions[bot] commented 3 months ago

Thanks for opening your first issue here! Be sure to follow the relevant issue templates, or risk having this issue marked as invalid.

LinuxServer-CI commented 2 months ago

This issue has been automatically marked as stale because it has not had recent activity. This might be due to missing feedback from OP. It will be closed if no further activity occurs. Thank you for your contributions.

moritzbeck13 commented 2 months ago

Same here

drizuid commented 1 day ago

Often, assuming correct info and mapped ports, you just need a factory reset. However, in your case, based on

UAP-AC-Pro-Gen2-BZ.6.6.55# curl -v http://{IP}:8080/inform

GET /inform HTTP/1.1
Host: 172.16.1.103:8080

it looks like you are using the container IP in your inform address when it should be the Docker host IP. Please confirm that you are NOT using the Docker container IP, which your access point would not normally have knowledge of. Additionally, if you are using 172.16 as your LAN subnet, you will want to ensure your Docker bridge does not conflict with it.

Finally, we do not support or test podman and we definitely do not support rootless.

mulbc commented 1 day ago

  1. My LAN is 172.16.1.0/24, which does not conflict with my container IP space (10. something)
  2. The target IP used is the host IP with the forwarded port of the container. That's the standard way to reach services inside a container from outside the host
  3. I understand that you don't support Podman. Feel free to close the issue if you feel certain this is Podman-specific (I doubt it)
  4. The container in question runs "rootful" and I did not spot any SELinux issues

As far as I can see, the connection from the AP to inside the container actually works, but something inside the container failed. Maybe :8080 is the wrong port or we need some other path than /inform?

drizuid commented 1 day ago

This all sounds good, 8080/inform is correct

Have you tried backing up your current config (do you have ANY working devices?) and doing a fresh db+unifi install to test? I would first test without restoring any data: a brand-new, empty setup, with a factory reset on your devices before adoption. If that works, I would restore your data into a fresh setup, do a factory reset, and adopt. If that works too, I would keep that instance and blow away the other. Failing those tests, I would test with Docker (in a VM if needed) to rule out Podman. You are correct that Podman SHOULDN'T be the issue; however, it wouldn't be the first time we've seen it.

--Great note from moritzbeck. However, my parents and in-laws both use my UniFi controller and are NOT on my L3 subnet, while my local UniFi APs ARE on the same L3 subnet. The remote sites are on completely different subnets and reach my controller via site-to-site VPN; they have no issues, and neither do I. I will note that we do not support macvlan or ipvlan on our containers. Users can make it work, but we don't test that way and it's out of scope for us.

moritzbeck13 commented 1 day ago

Are your network devices on the same L3 subnet as your Docker host? I had an issue with that setup, because the network devices seemed to recognize this and only tried L2 adoption. That does not work with Docker NAT/bridge networking; you would need a MACVLAN network for it. After I put the Docker host on a different subnet, the UniFi devices tried L3 adoption and it worked.

mulbc commented 1 day ago

Note: I opened this issue in March and since there was no response, I adopted the AP by resetting it with a pin. That worked well and all my issues are gone. Nevertheless, I have the feeling that something in the container doesn't work and I'm open to helping you find what it is if you want.

The container host and the AP are in the same L2 network. Container network configuration is as follows:

$ podman network inspect cblum_default 
[
     {
          "name": "cblum_default",
          "id": "4cfd6ca8a7647aa4928b30fbcb4ffa3a5689ff4e566f828f8a2cfe35eee7a74d",
          "driver": "bridge",
          "network_interface": "podman1",
          "created": "2024-03-18T14:15:41.914554496+01:00",
          "subnets": [
               {
                    "subnet": "10.89.0.0/24",
                    "gateway": "10.89.0.1"
               }
          ],
          "ipv6_enabled": false,
          "internal": false,
          "dns_enabled": true,
          "labels": {
               "com.docker.compose.project": "cblum",
               "io.podman.compose.project": "cblum"
          },
          "ipam_options": {
               "driver": "host-local"
          }
     }
]

So maybe this is due to using the podman bridge network? Is there anywhere more information on the Docker issue with bridge/NAT?
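
A minimal sketch, assuming the JSON layout of the `podman network inspect` output shown above, of how the bridge subnet could be checked against the LAN subnet for overlap. The function names are hypothetical. `docker network inspect` uses a different JSON layout, so this would need adjusting there.

```python
import ipaddress
import json


def bridge_subnets(inspect_json: str) -> list:
    """Return the subnets listed in `podman network inspect` JSON output."""
    networks = json.loads(inspect_json)
    return [
        ipaddress.ip_network(entry["subnet"])
        for net in networks
        for entry in net.get("subnets", [])
    ]


def conflicts(lan_cidr: str, subnets: list) -> list:
    """Return every bridge subnet that overlaps the given LAN subnet."""
    lan = ipaddress.ip_network(lan_cidr)
    return [subnet for subnet in subnets if subnet.overlaps(lan)]


# Trimmed copy of the inspect output above; the LAN value is from earlier in the thread.
sample = """[{"name": "cblum_default",
              "subnets": [{"subnet": "10.89.0.0/24", "gateway": "10.89.0.1"}]}]"""

print(conflicts("172.16.1.0/24", bridge_subnets(sample)))  # prints []
```

An empty list here matches the statement that the LAN (172.16.1.0/24) and the container network (10.89.0.0/24) do not conflict.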

thespad commented 1 day ago

The container works fine with a bridge network, however the Unifi hardware is frequently wonky when it comes to adoption, especially if it's previously been adopted by a different controller or using a different inform address.

More than once I've had to factory reset an AP to get it to adopt where there's no good reason it shouldn't just work.

mulbc commented 1 day ago

Nevertheless, when I run curl -v http://{IP}:8080/inform, is it expected that I get an HTTP 400 return code? From the debugging I did, it felt like that was what blocked the AP adoption.

thespad commented 1 day ago

Yes, a 400 error code is expected if you're not sending a valid inform payload
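
A minimal sketch of a reachability probe built on that point: a plain GET without an inform payload is expected to return 400, so a 400 suggests the port mapping works and the controller is answering. The function names and the wording of the messages are hypothetical. Status codes other than 400 are not discussed in this thread, so the sketch does not try to interpret them.

```python
import urllib.error
import urllib.request


def probe_inform(url: str, timeout: float = 5.0):
    """GET the inform URL; return the HTTP status code, or None if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses still prove that something answered on that port.
        return err.code
    except (urllib.error.URLError, OSError):
        return None


def interpret(status) -> str:
    """Turn a probe result into a short, hedged diagnosis."""
    if status is None:
        return "unreachable: check the host IP, the port mapping and the firewall"
    if status == 400:
        return "reachable: the empty request was rejected, as expected"
    return f"status {status}: not covered by this thread"


# Usage (hypothetical host address):
# print(interpret(probe_inform("http://192.0.2.10:8080/inform")))
```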

mulbc commented 1 day ago

Can you give me an example of a valid inform payload?

moritzbeck13 commented 1 day ago

See https://ubntwiki.com/products/software/unifi-controller/api

moritzbeck13 commented 1 day ago

The container works fine with a bridge network

Yes, but as I said, in my experience (and apparently also as described in this thread) only on different L3 subnets.

moritzbeck13 commented 1 day ago

Is there anywhere more information on the Docker issue with bridge/NAT?

I found out using Wireshark that the network devices were sending some other protocol I can't remember in the L2 frame and not IP, so it couldn't traverse NAT.

thespad commented 1 day ago

I've not had any* issues with adoption of devices on the same subnet as the host, and I've been using variations of our container for the best part of 5 years. That said, I can't speak to anyone else's network configuration, so YMMV.

moritzbeck13 commented 1 day ago

@thespad Can you maybe share your Docker Compose file?

mulbc commented 1 day ago

You are making me curious with the network talk and I did a quick tcpdump experiment.

Experiment: Run tcpdump on the host with the controller while doing a set-inform on an AP. (The AP in question is happily adopted, but will try to do the inform with the host anyways)

Result:

# tcpdump -nn -i eno1 -vvv host unifi-ap    
dropped privs to tcpdump
tcpdump: listening on eno1, link-type EN10MB (Ethernet), snapshot length 262144 bytes

14:26:58.514216 IP (tos 0x0, ttl 64, id 63429, offset 0, flags [DF], proto TCP (6), length 52)
    unifi-ap.38134 > unifi-host.8080: Flags [S], cksum 0x5b7c (correct), seq 4033913308, win 29200, options [mss 1460,nop,nop,sackOK,nop,wscale 6], length 0
14:26:58.514527 IP (tos 0x0, ttl 63, id 0, offset 0, flags [DF], proto TCP (6), length 52)
    unifi-host.8080 > unifi-ap.38134: Flags [S.], cksum 0x5ad8 (incorrect -> 0x3157), seq 3465663136, ack 4033913309, win 64240, options [mss 1460,nop,nop,sackOK,nop,wscale 7], length 0
14:26:58.515339 IP (tos 0x0, ttl 64, id 63430, offset 0, flags [DF], proto TCP (6), length 40)
    unifi-ap.38134 > unifi-host.8080: Flags [.], cksum 0x6b51 (correct), seq 1, ack 1, win 457, length 0
14:26:58.515339 IP (tos 0x0, ttl 64, id 63431, offset 0, flags [DF], proto TCP (6), length 5911)
    unifi-ap.38134 > unifi-host.8080: Flags [P.], cksum 0x71bb (incorrect -> 0x3f02), seq 1:5872, ack 1, win 457, length 5871: HTTP, length: 5871
    POST /inform HTTP/1.1
    Host: unifi-host:8080
    Accept: */*
    User-Agent: AirControl Agent v1.0
    Content-Type: application/x-binary
    Content-Length: 5715

14:26:58.515645 IP (tos 0x0, ttl 63, id 65089, offset 0, flags [DF], proto TCP (6), length 40)
    unifi-host.8080 > unifi-ap.38134: Flags [.], cksum 0x5acc (incorrect -> 0x5436), seq 1, ack 5872, win 501, length 0
14:26:58.521689 IP (tos 0x0, ttl 63, id 65090, offset 0, flags [DF], proto TCP (6), length 498)
    unifi-host.8080 > unifi-ap.38134: Flags [P.], cksum 0x5c96 (incorrect -> 0x8b39), seq 1:459, ack 5872, win 501, length 458: HTTP, length: 458
    HTTP/1.1 200 
    Content-Type: application/x-binary
    Content-Length: 347
    Date: Tue, 25 Jun 2024 12:26:57 GMT

14:26:58.522080 IP (tos 0x0, ttl 64, id 63436, offset 0, flags [DF], proto TCP (6), length 40)
    unifi-ap.38134 > unifi-host.8080: Flags [.], cksum 0x5288 (correct), seq 5872, ack 459, win 473, length 0
14:26:58.522309 IP (tos 0x0, ttl 64, id 63437, offset 0, flags [DF], proto TCP (6), length 40)
    unifi-ap.38134 > unifi-host.8080: Flags [F.], cksum 0x5287 (correct), seq 5872, ack 459, win 473, length 0
14:26:58.522508 IP (tos 0x0, ttl 63, id 65091, offset 0, flags [DF], proto TCP (6), length 40)
    unifi-host.8080 > unifi-ap.38134: Flags [F.], cksum 0x5acc (incorrect -> 0x526a), seq 459, ack 5873, win 501, length 0
14:26:58.522839 IP (tos 0x0, ttl 64, id 63438, offset 0, flags [DF], proto TCP (6), length 40)
    unifi-ap.38134 > unifi-host.8080: Flags [.], cksum 0x5286 (correct), seq 5873, ack 460, win 473, length 0
14:27:05.603128 IP (tos 0xc0, ttl 1, id 0, offset 0, flags [DF], proto IGMP (2), length 48, options (RA))
    unifi-ap > 224.0.0.22: igmp v3 report, 2 group record(s) [gaddr 233.89.188.1 is_ex { }] [gaddr 239.254.127.63 is_ex { }]

So I do see an HTTP 200 response from unifi-host to the AP after the inform! Additionally, I see that the AP uses IGMP. That would most probably not work through NAT and is probably what was discussed previously.
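
A minimal sketch, assuming `tcpdump -vvv` text output in the format shown above, of how the HTTP request and response lines could be extracted from a capture automatically. The function name `http_exchanges` and the regular expressions are hypothetical. They only match the plain-text header lines tcpdump prints, not the binary inform body.

```python
import re

REQUEST = re.compile(r"^\s*(GET|POST|PUT|HEAD) (\S+) HTTP/\d\.\d\s*$")
RESPONSE = re.compile(r"^\s*HTTP/\d\.\d (\d{3})\b")


def http_exchanges(dump: str) -> list:
    """Return HTTP request lines and response status codes in capture order."""
    events = []
    for line in dump.splitlines():
        match = REQUEST.match(line)
        if match:
            events.append(("request", match.group(1), match.group(2)))
            continue
        match = RESPONSE.match(line)
        if match:
            events.append(("response", int(match.group(1))))
    return events


# A few lines copied from the capture above.
sample = """    unifi-ap.38134 > unifi-host.8080: Flags [P.], length 5871: HTTP, length: 5871
    POST /inform HTTP/1.1
    Host: unifi-host:8080
    unifi-host.8080 > unifi-ap.38134: Flags [P.], length 458: HTTP, length: 458
    HTTP/1.1 200
    Content-Type: application/x-binary"""

print(http_exchanges(sample))
```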

drizuid commented 1 day ago

To add to this, I use the compose listed in the container readme on this repo with path and uid changes. My devices adopt fine (as long as they have never been adopted before, in which case I factory reset them), on the same L3 network as my Docker host, and also from 2 networks accessible over site-to-site VPN. This is all on a custom bridge.

In terms of the multicast, this will not reach the controller because of Docker NAT, but it will not prevent adoption (as you noted after a factory reset). It will also not cross a layer 3 boundary, but that still poses no issue.

It's worth noting that UniFi is... bad, and they know they're bad, which is why even they suggest a factory reset if something doesn't work: https://help.ui.com/hc/en-us/articles/360012622613-UniFi-Device-Adoption
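
A minimal sketch of the multicast point, using the addresses from the tcpdump capture earlier in the thread. The IGMP report was sent to 224.0.0.22 with a TTL of 1, and 224.0.0.0/24 is the local network control block, which routers do not forward. The function name `describe` is hypothetical, and the sketch says nothing about which protocol UniFi devices actually use for L2 discovery.

```python
import ipaddress

# 224.0.0.0/24 is reserved for local network control traffic (never routed).
LOCAL_CONTROL = ipaddress.ip_network("224.0.0.0/24")


def describe(address: str) -> str:
    """Classify an IPv4 address seen in a capture."""
    ip = ipaddress.ip_address(address)
    if ip in LOCAL_CONTROL:
        return "link-local multicast"
    if ip.is_multicast:
        return "multicast"
    return "not multicast"


for addr in ("224.0.0.22", "233.89.188.1", "239.254.127.63", "172.16.1.103"):
    print(addr, "->", describe(addr))
```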

moritzbeck13 commented 1 day ago

So maybe this has more to do with switching between layer 2 and 3 adoption, because Unifi devices are dumb and won't change their behaviour?

drizuid commented 1 day ago

That very well could be the situation; in any case, I'm not sure this is a container issue. I think continued discussion is more appropriate in Discord, as the discussed issue is not something linuxserver.io can resolve.

mulbc commented 1 day ago

Closing, since this appears not to be related to the container and an easy workaround (resetting the AP) exists.