mbentley / docker-omada-controller

Docker image to run TP-Link Omada Controller

[Bug]: Omada Router fails to provision or adopt recently #490

Open seangreen2 opened 1 week ago

seangreen2 commented 1 week ago

Controller Version

5.14.32.2

Describe the Bug

Omada Router TL-R605 v1.0 (1.3.1 Build 20231207 Rel.61384) fails to provision or adopt.

Expected Behavior

Until a few days ago, the router would provision and adopt without issue after every container restart. Now adoption always fails and never completes.

Steps to Reproduce

The issue happens on container start; there are no further steps to reproduce.

How You're Launching the Container

services:

  omada-controller:

    command:
      - "/usr/bin/java"
      - "-server"
      - "-Xms128m"
      - "-Xmx1024m"
      - "-XX:MaxHeapFreeRatio=60"
      - "-XX:MinHeapFreeRatio=30"
      - "-XX:+HeapDumpOnOutOfMemoryError"
      - "-XX:HeapDumpPath=/opt/tplink/EAPController/logs/java_heapdump.hprof"
      - "-Djava.awt.headless=true"
      - "-cp"
      - "/opt/tplink/EAPController/lib/*::/opt/tplink/EAPController/properties:"
      - "com.tplink.smb.omada.starter.OmadaLinuxMain"

    container_name: "omada-controller"

    entrypoint:
      - "/entrypoint.sh"

    environment:
      - "MANAGE_HTTP_PORT=8088"
      - "MANAGE_HTTPS_PORT=8043"
      - "PORTAL_HTTP_PORT=8088"
      - "PORTAL_HTTPS_PORT=8043"
      - "SHOW_MONGODB_LOGS=false"
      - "SHOW_SERVER_LOGS=true"
      - "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"

    hostname: "omada-controller"

    image: "docker.io/mbentley/omada-controller:latest"

    ipc: "private"

    logging:
      driver: "journald"
      options: {}

    network_mode: "bridge"

    ports:
      - "27001:27001/udp"
      - "29810:29810/udp"
      - "29811:29811/tcp"
      - "29812:29812/tcp"
      - "29813:29813/tcp"
      - "29814:29814/tcp"
      - "8043:8043/tcp"
      - "8088:8088/tcp"

    restart: "always"

    volumes:
      - "${DOCKER_VOLUME_STORAGE:-/mnt/docker-volumes}/omada/work:/opt/tplink/EAPController/work"
      - "${DOCKER_VOLUME_STORAGE:-/mnt/docker-volumes}/omada/data:/opt/tplink/EAPController/data"
      - "${DOCKER_VOLUME_STORAGE:-/mnt/docker-volumes}/omada/logs:/opt/tplink/EAPController/logs"

    working_dir: "/opt/tplink/EAPController/lib"

Container Logs

10-18-2024 12:26:27.597 INFO [check-update-work-group-0] [] c.t.s.o.m.c.o(): Start sync Cloud Users OmadacId(1e949824271f495269cbaa64643afd0e).

10-18-2024 12:26:50.564 INFO [server-comm-pool-3] [] c.t.s.e.s.c.h.EcspV2DeviceContextHelper(): device of status ADOPT_SUCCESS time out for 90-9A-4A-CD-49-09

10-18-2024 12:26:50.568 INFO [manage-work-group-3] [] c.t.s.o.m.d.p.t.c(): Device: <90-9A-4A-CD-49-09 OmadacId(1e949824271f495269cbaa64643afd0e)> is AdoptSuccessTimeout.

10-18-2024 12:26:50.568 INFO [manage-work-group-3] [] c.t.s.o.m.d.a.a.a(): OmadacId OmadacId(1e949824271f495269cbaa64643afd0e) Device DeviceMac(90-9A-4A-CD-49-09) is AdoptSuccessTimeout.

10-18-2024 12:26:51.662 INFO [discovery-work-group-1] [] c.t.s.o.m.d.d.m.d.a(): MANAGED_BY_OWN Device 90-9A-4A-CD-49-09 on omadac 1e949824271f495269cbaa64643afd0e is discovered.

10-18-2024 12:26:55.273 INFO [email-log-event-pool-0] [] c.t.s.o.l.d.m.l.U(): omadacId=1e949824271f495269cbaa64643afd0e:no alert users, stop sending alert email

10-18-2024 12:26:55.275 INFO [email-log-event-pool-0] [] c.t.s.o.l.d.m.l.U(): omadacId=1e949824271f495269cbaa64643afd0e:no alert users, stop sending alert email

10-18-2024 12:26:55.303 INFO [adopt-work-group-6] [] c.t.s.o.m.d.d.m.a.a(): start watching full adopt result of omadacId:OmadacId(1e949824271f495269cbaa64643afd0e) mac:DeviceMac(90-9A-4A-CD-49-09), businessId:BusinessId(topic=smb_omada_business_f9d75875-1dc5-46ae-9907-1c550cf3031c, id=024a8abb-fe37-4480-be15-17ac4317592e)

10-18-2024 12:26:55.317 INFO [adopt-work-group-6] [] c.t.s.o.m.d.d.m.d.o.h(): Gateway OmadacId(1e949824271f495269cbaa64643afd0e) SiteId(Default) DeviceMac(90-9A-4A-CD-49-09) adopt[auto=true] ok

10-18-2024 12:26:55.436 INFO [adopt-work-group-6] [] c.t.s.o.m.d.d.m.a.e(): send mini setting to OmadacId(1e949824271f495269cbaa64643afd0e) DeviceMac(90-9A-4A-CD-49-09)

10-18-2024 12:26:55.436 INFO [manage-work-group-6] [] c.t.s.o.m.d.a.u.f.b.c(): receive adopt success event for TargetFirmwareOnDeviceAdoptSuccessSubscriber: TargetFirmwareEvent(super=AbstractDomainEvent(id=ee6e41bf3d414913bd0895d14b90f46c, createTime=Fri Oct 18 12:26:55 UTC 2024, type=null), omadacId=OmadacId(1e949824271f495269cbaa64643afd0e), siteId=SiteId(Default), deviceMac=DeviceMac(90-9A-4A-CD-49-09), compoundModel=TL-R605 v1.0, version=1.3.1, timeStamp=1729254415436)

10-18-2024 12:27:26.625 INFO [manager-topology-pool-1] [] c.t.s.o.m.t.p.s.t.TopologyTask(): site=Default topology calculate cost: 3ms

10-18-2024 12:27:26.639 INFO [manager-topology-pool-1] [] c.t.s.o.m.d.d.m.m.s.e(): send setMsg to OmadacId(1e949824271f495269cbaa64643afd0e) DeviceMac(90-9A-4A-CD-49-09), status is PROVISIONING, need sync full config, ignore

10-18-2024 12:27:55.419 INFO [discovery-work-group-0] [] c.t.s.o.m.d.d.m.d.a(): MANAGED_BY_OWN Device 90-9A-4A-CD-49-09 on omadac 1e949824271f495269cbaa64643afd0e is discovered.

10-18-2024 12:28:02.982 INFO [adopt-work-group-2] [] c.t.s.o.m.d.d.m.a.a(): start watching full adopt result of omadacId:OmadacId(1e949824271f495269cbaa64643afd0e) mac:DeviceMac(90-9A-4A-CD-49-09), businessId:BusinessId(topic=smb_omada_business_f9d75875-1dc5-46ae-9907-1c550cf3031c, id=47396a2b-c19c-419b-8245-998803c0eb66)

10-18-2024 12:28:02.999 INFO [adopt-work-group-3] [] c.t.s.o.m.d.d.m.d.o.h(): Gateway OmadacId(1e949824271f495269cbaa64643afd0e) SiteId(Default) DeviceMac(90-9A-4A-CD-49-09) adopt[auto=true] ok

10-18-2024 12:28:03.080 INFO [manage-work-group-10] [] c.t.s.o.m.d.a.u.f.b.c(): receive adopt success event for TargetFirmwareOnDeviceAdoptSuccessSubscriber: TargetFirmwareEvent(super=AbstractDomainEvent(id=c3b5bf1907c144a8b08e1dce7aba8749, createTime=Fri Oct 18 12:28:03 UTC 2024, type=null), omadacId=OmadacId(1e949824271f495269cbaa64643afd0e), siteId=SiteId(Default), deviceMac=DeviceMac(90-9A-4A-CD-49-09), compoundModel=TL-R605 v1.0, version=1.3.1, timeStamp=1729254483080)

10-18-2024 12:28:03.080 INFO [adopt-work-group-3] [] c.t.s.o.m.d.d.m.a.e(): send mini setting to OmadacId(1e949824271f495269cbaa64643afd0e) DeviceMac(90-9A-4A-CD-49-09)

10-18-2024 12:29:10.049 WARN [discovery-work-group-1] [] c.t.s.o.m.d.p.t.e(): disconnectDevice OmadacId(1e949824271f495269cbaa64643afd0e) DeviceMac(90-9A-4A-CD-49-09) failed, com.tplink.smb.ecsp.common.TransResult@77932a81[addressDTO=<null>,errCode=9500,msg=ERR_CLOSE_CHANNEL_FAILED,result=9500]

10-18-2024 12:29:10.055 INFO [discovery-work-group-1] [] c.t.s.o.m.d.d.m.d.a(): MANAGED_BY_OWN Device 90-9A-4A-CD-49-09 on omadac 1e949824271f495269cbaa64643afd0e is discovered.

10-18-2024 12:29:13.721 INFO [adopt-work-group-1] [] c.t.s.o.m.d.d.m.a.a(): start watching full adopt result of omadacId:OmadacId(1e949824271f495269cbaa64643afd0e) mac:DeviceMac(90-9A-4A-CD-49-09), businessId:BusinessId(topic=smb_omada_business_f9d75875-1dc5-46ae-9907-1c550cf3031c, id=f584cb2d-0279-471a-bccd-59a2b7fc6638)

10-18-2024 12:29:13.736 INFO [adopt-work-group-1] [] c.t.s.o.m.d.d.m.d.o.h(): Gateway OmadacId(1e949824271f495269cbaa64643afd0e) SiteId(Default) DeviceMac(90-9A-4A-CD-49-09) adopt[auto=true] ok

10-18-2024 12:29:13.807 INFO [adopt-work-group-1] [] c.t.s.o.m.d.d.m.a.e(): send mini setting to OmadacId(1e949824271f495269cbaa64643afd0e) DeviceMac(90-9A-4A-CD-49-09)

10-18-2024 12:29:13.807 INFO [manage-work-group-13] [] c.t.s.o.m.d.a.u.f.b.c(): receive adopt success event for TargetFirmwareOnDeviceAdoptSuccessSubscriber: TargetFirmwareEvent(super=AbstractDomainEvent(id=398be120118243e1aae9bd49ec374876, createTime=Fri Oct 18 12:29:13 UTC 2024, type=null), omadacId=OmadacId(1e949824271f495269cbaa64643afd0e), siteId=SiteId(Default), deviceMac=DeviceMac(90-9A-4A-CD-49-09), compoundModel=TL-R605 v1.0, version=1.3.1, timeStamp=1729254553807)

10-18-2024 12:29:26.591 INFO [manager-topology-pool-3] [] c.t.s.o.c.u.e.a(): list local interface macs: [02-42-AC-11-00-04]

10-18-2024 12:29:26.591 INFO [manager-topology-pool-3] [] c.t.s.o.m.t.p.s.t.TopologyTask(): site=Default topology calculate cost: 0ms

10-18-2024 12:29:26.597 INFO [manager-topology-pool-3] [] c.t.s.o.m.d.d.m.m.s.e(): send setMsg to OmadacId(1e949824271f495269cbaa64643afd0e) DeviceMac(90-9A-4A-CD-49-09), status is PROVISIONING, need sync full config, ignore

10-18-2024 12:30:16.696 INFO [discovery-work-group-0] [] c.t.s.o.m.d.d.m.d.a(): MANAGED_BY_OWN Device 90-9A-4A-CD-49-09 on omadac 1e949824271f495269cbaa64643afd0e is discovered.

10-18-2024 12:30:22.382 INFO [app-push-scheduler-0] [] c.t.s.o.l.c.a.a(): Start pushing connected devices.

10-18-2024 12:30:22.519 INFO [app-push-scheduler-0] [] c.t.s.o.l.c.a.a(): Start pushing disconnected devices.

10-18-2024 12:30:24.417 INFO [adopt-work-group-4] [] c.t.s.o.m.d.d.m.a.a(): start watching full adopt result of omadacId:OmadacId(1e949824271f495269cbaa64643afd0e) mac:DeviceMac(90-9A-4A-CD-49-09), businessId:BusinessId(topic=smb_omada_business_f9d75875-1dc5-46ae-9907-1c550cf3031c, id=09b1f3ed-47f7-4a2a-a99d-fbc16d497fb4)

10-18-2024 12:30:24.432 INFO [adopt-work-group-5] [] c.t.s.o.m.d.d.m.d.o.h(): Gateway OmadacId(1e949824271f495269cbaa64643afd0e) SiteId(Default) DeviceMac(90-9A-4A-CD-49-09) adopt[auto=true] ok

10-18-2024 12:30:24.501 INFO [manage-work-group-0] [] c.t.s.o.m.d.a.u.f.b.c(): receive adopt success event for TargetFirmwareOnDeviceAdoptSuccessSubscriber: TargetFirmwareEvent(super=AbstractDomainEvent(id=f864995db53d40cf90071b38bd44a0fe, createTime=Fri Oct 18 12:30:24 UTC 2024, type=null), omadacId=OmadacId(1e949824271f495269cbaa64643afd0e), siteId=SiteId(Default), deviceMac=DeviceMac(90-9A-4A-CD-49-09), compoundModel=TL-R605 v1.0, version=1.3.1, timeStamp=1729254624501)

10-18-2024 12:30:24.501 INFO [adopt-work-group-5] [] c.t.s.o.m.d.d.m.a.e(): send mini setting to OmadacId(1e949824271f495269cbaa64643afd0e) DeviceMac(90-9A-4A-CD-49-09)

10-18-2024 12:30:26.587 WARN [quartzScheduler_Worker-2] [] c.t.s.c.s.c.TaskExecutorService(): receive scheduled event with identity (log_device_status_push, null) but did not execute because corresponding handler log_device_status_push doesn't exist

10-18-2024 12:30:43.275 WARN [async-business-timeout-pool-0] [] c.t.s.o.m.d.d.m.a.a(): timeout wait adopt watch result of OmadacId(1e949824271f495269cbaa64643afd0e) DeviceMac(90-9A-4A-CD-49-09) BusinessId(topic=smb_omada_business_f9d75875-1dc5-46ae-9907-1c550cf3031c, id=cca51452-9757-4c83-abbc-5a2b1c6841c8), response:4

10-18-2024 12:30:56.600 INFO [comm-pool-14] [] c.t.s.o.m.d.a.u.f.c(): Target firmware check task run

10-18-2024 12:30:56.601 INFO [comm-pool-14] [] c.t.s.o.m.d.a.u.f.c(): no omadac need handle target firmware upgrade

MongoDB Logs

No response

Additional Context

No response

mbentley commented 1 week ago

I only have one suggestion: set stop_grace_period: 60s in your compose file. I don't think a forceful shutdown of the controller would cause this issue, since the grace period is mostly there to prevent database corruption, but it is worth trying. See this part of the README for more info.
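For reference, a minimal sketch of where that setting would sit in the compose file posted above. Only the stop_grace_period line is new; the service name and image are copied from the original file, and the remaining keys are assumed to stay unchanged.

```yaml
services:
  omada-controller:
    image: "docker.io/mbentley/omada-controller:latest"
    # Allow the controller up to 60 seconds to shut down cleanly when the
    # container is stopped or restarted before Docker sends SIGKILL
    # (the Compose default is 10 seconds).
    stop_grace_period: 60s
    # ...remaining keys (command, environment, ports, volumes) unchanged
```

After changing the file, the container has to be recreated (for example with docker compose up -d) for the new grace period to take effect.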

If that doesn't help, I would suggest creating a post on the community forums or filing a support case as this is most likely a problem with the software and not related to the packaging in a container.