greghesp / ha-bambulab

A Home Assistant Integration for Bambu Lab Printers

[Bug] Home Assistant general connection issue when this integration is installed #462

Closed jjvelar closed 5 months ago

jjvelar commented 8 months ago

Describe the bug

I have Home Assistant running on a dedicated NUC with HAOS, and a Bambu Lab A1 with AMS Lite. After installing the Bambu Lab integration, I get Home Assistant connection issues randomly after 6 to 24 hours: I can't reach HA from my local network, either at its IP address or at .local, yet Observer shows "healthy" and "connected" status, all green. When I restarted HA in safe mode, there were no connection issues for a couple of days. I then restarted in normal mode and started disabling custom integrations one at a time, but the connection issue kept showing up. Only when I reached the Bambu Lab integration and disabled it did the connection issues stop, with all the other custom integrations up and running.

To Reproduce

Just install Bambu Lab custom integration, connect your printer, and wait.

Expected Behaviour

No HA connection issues.

What device are you using?

A1

Diagnostic Output

{
  "home_assistant": {
    "installation_type": "Home Assistant OS",
    "version": "2024.2.0",
    "dev": false,
    "hassio": true,
    "virtualenv": false,
    "python_version": "3.12.1",
    "docker": true,
    "arch": "x86_64",
    "timezone": "Europe/Madrid",
    "os_name": "Linux",
    "os_version": "6.1.74-haos",
    "supervisor": "2024.01.1",
    "host_os": "Home Assistant OS 11.5",
    "docker_version": "24.0.7",
    "chassis": "embedded",
    "run_as_root": true
  },
  "custom_components": {
    "edata": {
      "version": "2023.06.3",
      "requirements": [
        "e-data==1.1.5",
        "python-dateutil>=2.8.2"
      ]
    },
    "anniversary": {
      "version": "0.3.0",
      "requirements": []
    },
    "cryptoinfo": {
      "version": "0.1.7",
      "requirements": []
    },
    "alexa_media": {
      "version": "4.9.0",
      "requirements": [
        "alexapy==1.27.10",
        "packaging>=20.3",
        "wrapt>=1.14.0"
      ]
    },
    "teamtracker": {
      "version": "0.1",
      "requirements": [
        "arrow",
        "aiofiles"
      ]
    },
    "personalcapital": {
      "version": "0.1.2",
      "requirements": [
        "personalcapital==1.0.1"
      ]
    },
    "openmediavault": {
      "version": "0.0.0",
      "requirements": []
    },
    "var": {
      "version": "0.15.0",
      "requirements": []
    },
    "kontomierz_sensor": {
      "version": "0.0.1",
      "requirements": []
    },
    "pvpc_hourly_pricing": {
      "version": "1.1.1",
      "requirements": [
        "aiopvpc>=4.2.1"
      ]
    },
    "localtuya": {
      "version": "5.2.1",
      "requirements": []
    },
    "temperature_feels_like": {
      "version": "0.3.8",
      "requirements": [
        "colorlog==6.7.0",
        "ruff==0.1.1"
      ]
    },
    "worlds_air_quality_index": {
      "version": "1.1.0",
      "requirements": []
    },
    "nodered": {
      "version": "3.1.3",
      "requirements": []
    },
    "tplink_deco": {
      "version": "3.6.0",
      "requirements": [
        "pycryptodome>=3.12.0"
      ]
    },
    "huawei_solar": {
      "version": "1.3.1",
      "requirements": [
        "huawei-solar==2.2.9"
      ]
    },
    "smartir": {
      "version": "1.17.9",
      "requirements": [
        "aiofiles>=0.6.0"
      ]
    },
    "scheduler": {
      "version": "v0.0.0",
      "requirements": []
    },
    "tapo_control": {
      "version": "5.4.13",
      "requirements": [
        "pytapo==3.3.18"
      ]
    },
    "yi_hack": {
      "version": "0.3.6",
      "requirements": []
    },
    "awox": {
      "version": "0.1.5",
      "requirements": [
        "pexpect>=4.6.0",
        "pycryptodome>=3.6.6",
        "pygatt[GATTTOOL]>=4.0.5"
      ]
    },
    "bambu_lab": {
      "version": "2.0.15",
      "requirements": []
    },
    "webrtc": {
      "version": "v3.5.1",
      "requirements": []
    },
    "cryptostate": {
      "version": "2.0.0",
      "requirements": []
    },
    "watchman": {
      "version": "0.5.1",
      "requirements": [
        "prettytable==3.0.0"
      ]
    },
    "gas_station_spain": {
      "version": "0.8.0",
      "requirements": []
    },
    "balance_neto": {
      "version": "0.1.0",
      "requirements": []
    },
    "govee": {
      "version": "2023.11.1",
      "requirements": [
        "govee-api-laggat==0.2.2",
        "dacite==1.8.0"
      ]
    },
    "sonoff": {
      "version": "3.5.4",
      "requirements": [
        "pycryptodome>=3.6.6"
      ]
    },
    "hacs": {
      "version": "1.34.0",
      "requirements": [
        "aiogithubapi>=22.10.1"
      ]
    },
    "frigate": {
      "version": "5.0.1",
      "requirements": [
        "pytz==2022.7"
      ]
    },
    "garmin_connect": {
      "version": "0.2.19",
      "requirements": [
        "garminconnect==0.2.12",
        "tzlocal"
      ]
    },
    "spotcast": {
      "version": "v3.6.30",
      "requirements": []
    },
    "delete": {
      "version": "1.8",
      "requirements": []
    },
    "freeds": {
      "version": "0.12.0",
      "requirements": [
        "websockets"
      ]
    },
    "floureon": {
      "version": "1.0.1",
      "requirements": [
        "pythoncrc",
        "broadlink>=0.18.3"
      ]
    }
  },
  "integration_manifest": {
    "domain": "bambu_lab",
    "name": "Bambu Lab",
    "codeowners": [
      "@greghesp",
      "@AdrianGarside"
    ],
    "config_flow": true,
    "dependencies": [
      "device_automation",
      "ffmpeg",
      "mqtt"
    ],
    "documentation": "https://github.com/greghesp/ha-bambulab",
    "iot_class": "local_push",
    "issue_tracker": "https://github.com/greghesp/ha-bambulab/issues",
    "ssdp": [
      {
        "st": "urn:bambulab-com:device:3dprinter:1"
      }
    ],
    "version": "2.0.15",
    "is_built_in": false
  },
  "data": {
    "config_entry": {
      "entry_id": "67ce294e50441fdead4213a030a0046b",
      "version": 2,
      "minor_version": 1,
      "domain": "bambu_lab",
      "title": "**REDACTED**",
      "data": {
        "device_type": "A1",
        "serial": "**REDACTED**"
      },
      "options": {
        "region": "Europe",
        "email": "**REDACTED**",
        "username": "**REDACTED**",
        "name": "Bambu Lab A1 Combo",
        "host": "192.168.1.88",
        "local_mqtt": false,
        "auth_token": "**REDACTED**",
        "access_code": "**REDACTED**",
        "usage_hours": 290.0047222222222
      },
      "pref_disable_new_entities": false,
      "pref_disable_polling": false,
      "source": "user",
      "unique_id": null,
      "disabled_by": null
    },
    "push_all": {
      "ipcam": {
        "ipcam_dev": "1",
        "ipcam_record": "enable",
        "timelapse": "disable",
        "resolution": "1080p",
        "tutk_server": "disable",
        "mode_bits": 3
      },
      "upload": {
        "status": "idle",
        "progress": 0,
        "message": ""
      },
      "nozzle_temper": 23.5625,
      "nozzle_target_temper": 0,
      "bed_temper": 23.40625,
      "bed_target_temper": 0,
      "chamber_temper": 5,
      "mc_print_stage": "1",
      "heatbreak_fan_speed": "0",
      "cooling_fan_speed": "0",
      "big_fan1_speed": "0",
      "big_fan2_speed": "0",
      "mc_percent": 100,
      "mc_remaining_time": 0,
      "ams_status": 0,
      "ams_rfid_status": 0,
      "hw_switch_state": 0,
      "spd_mag": 100,
      "spd_lvl": 2,
      "print_error": 0,
      "lifecycle": "product",
      "wifi_signal": "-33dBm",
      "gcode_state": "FINISH",
      "gcode_file_prepare_percent": "100",
      "queue_number": 0,
      "queue_total": 0,
      "queue_est": 0,
      "queue_sts": 0,
      "project_id": "49298284",
      "profile_id": "48508693",
      "task_id": "96471770",
      "subtask_id": "96471771",
      "subtask_name": "A1 Guia cable Eje X",
      "gcode_file": "",
      "stg": [],
      "stg_cur": 255,
      "print_type": "idle",
      "home_flag": 322454936,
      "mc_print_line_number": "81742",
      "mc_print_sub_stage": 0,
      "sdcard": true,
      "force_upgrade": false,
      "mess_production_state": "active",
      "layer_num": 444,
      "total_layer_num": 444,
      "s_obj": [],
      "filam_bak": [],
      "fan_gear": 0,
      "nozzle_diameter": "0.4",
      "nozzle_type": "stainless_steel",
      "upgrade_state": {
        "sequence_id": 0,
        "progress": "",
        "status": "",
        "consistency_request": false,
        "dis_state": 0,
        "err_code": 0,
        "force_upgrade": false,
        "message": "0%, 0B/s",
        "module": "",
        "new_version_state": 2,
        "cur_state_code": 0,
        "new_ver_list": []
      },
      "hms": [],
      "online": {
        "ahb": false,
        "rfid": false,
        "version": 942176693
      },
      "ams": {
        "ams": [
          {
            "id": "0",
            "humidity": "5",
            "temp": "0.0",
            "tray": [
              {
                "id": "0",
                "remain": 100,
                "k": 0.019999999552965164,
                "n": 1,
                "tag_uid": "0000000000000000",
                "tray_id_name": "",
                "tray_info_idx": "GFL99",
                "tray_type": "PLA",
                "tray_sub_brands": "",
                "tray_color": "010101FF",
                "tray_weight": "0",
                "tray_diameter": "0.00",
                "tray_temp": "0",
                "tray_time": "0",
                "bed_temp_type": "0",
                "bed_temp": "0",
                "nozzle_temp_max": "240",
                "nozzle_temp_min": "190",
                "xcam_info": "000000000000000000000000",
                "tray_uuid": "00000000000000000000000000000000"
              },
              {
                "id": "1",
                "remain": 100,
                "k": 0.019999999552965164,
                "n": 1,
                "tag_uid": "3277907000000100",
                "tray_id_name": "A00-W1",
                "tray_info_idx": "GFA00",
                "tray_type": "PLA",
                "tray_sub_brands": "PLA Basic",
                "tray_color": "FFFFFFFF",
                "tray_weight": "1000",
                "tray_diameter": "1.75",
                "tray_temp": "55",
                "tray_time": "8",
                "bed_temp_type": "1",
                "bed_temp": "35",
                "nozzle_temp_max": "230",
                "nozzle_temp_min": "190",
                "xcam_info": "34218813F401E8030000003F",
                "tray_uuid": "675F3FB8059E4887B3D7EB8ECF854B5D"
              },
              {
                "id": "2",
                "remain": 100,
                "k": 0.019999999552965164,
                "n": 1,
                "tag_uid": "5C5E973900000100",
                "tray_id_name": "A00-B8",
                "tray_info_idx": "GFA00",
                "tray_type": "PLA",
                "tray_sub_brands": "PLA Basic",
                "tray_color": "0086D6FF",
                "tray_weight": "1000",
                "tray_diameter": "1.75",
                "tray_temp": "55",
                "tray_time": "8",
                "bed_temp_type": "1",
                "bed_temp": "35",
                "nozzle_temp_max": "230",
                "nozzle_temp_min": "190",
                "xcam_info": "8813AC0DE803E8039A99193F",
                "tray_uuid": "797878CAF7DB4EA9A1AFB1B83C4E064D"
              },
              {
                "id": "3",
                "remain": 100,
                "k": 0.019999999552965164,
                "n": 1,
                "tag_uid": "0000000000000000",
                "tray_id_name": "",
                "tray_info_idx": "GFL99",
                "tray_type": "PLA",
                "tray_sub_brands": "",
                "tray_color": "C12E1EFF",
                "tray_weight": "0",
                "tray_diameter": "0.00",
                "tray_temp": "0",
                "tray_time": "0",
                "bed_temp_type": "0",
                "bed_temp": "0",
                "nozzle_temp_max": "240",
                "nozzle_temp_min": "190",
                "xcam_info": "000000000000000000000000",
                "tray_uuid": "00000000000000000000000000000000"
              }
            ]
          }
        ],
        "ams_exist_bits": "1",
        "tray_exist_bits": "f",
        "tray_is_bbl_bits": "f",
        "tray_tar": "255",
        "tray_now": "255",
        "tray_pre": "255",
        "tray_read_done_bits": "f",
        "tray_reading_bits": "0",
        "version": 20,
        "insert_flag": true,
        "power_on_flag": true
      },
      "xcam": {
        "buildplate_marker_detector": true
      },
      "vt_tray": {
        "id": "254",
        "tag_uid": "0000000000000000",
        "tray_id_name": "",
        "tray_info_idx": "GFL99",
        "tray_type": "PLA",
        "tray_sub_brands": "",
        "tray_color": "FFF734FF",
        "tray_weight": "0",
        "tray_diameter": "0.00",
        "tray_temp": "0",
        "tray_time": "0",
        "bed_temp_type": "0",
        "bed_temp": "0",
        "nozzle_temp_max": "240",
        "nozzle_temp_min": "190",
        "xcam_info": "000000000000000000000000",
        "tray_uuid": "00000000000000000000000000000000",
        "remain": 0,
        "k": 0.019999999552965164,
        "n": 1
      },
      "lights_report": [
        {
          "node": "chamber_light",
          "mode": "off"
        }
      ],
      "command": "push_status",
      "msg": 0,
      "sequence_id": "22242"
    },
    "get_version": {
      "command": "get_version",
      "sequence_id": "0",
      "module": [
        {
          "name": "ota",
          "project_name": "N2S",
          "sw_ver": "01.02.00.01",
          "hw_ver": "OTA",
          "sn": "**REDACTED**",
          "flag": 3
        },
        {
          "name": "esp32",
          "project_name": "N2S",
          "sw_ver": "01.08.25.64",
          "hw_ver": "AP05",
          "sn": "**REDACTED**",
          "flag": 0
        },
        {
          "name": "mc",
          "project_name": "N2S",
          "sw_ver": "00.00.23.46",
          "loader_ver": "00.00.00.32",
          "hw_ver": "MC02",
          "sn": "**REDACTED**",
          "flag": 0
        },
        {
          "name": "th",
          "project_name": "N2S",
          "sw_ver": "00.00.07.69",
          "loader_ver": "00.00.00.26",
          "hw_ver": "TH01",
          "sn": "**REDACTED**",
          "flag": 0
        },
        {
          "name": "ams_f1/0",
          "project_name": "",
          "sw_ver": "00.00.07.92",
          "loader_ver": "00.00.00.00",
          "ota_ver": "00.00.00.00",
          "hw_ver": "AMS_F102",
          "sn": "**REDACTED**",
          "flag": 0
        }
      ],
      "result": "success",
      "reason": ""
    }
  }
}

Log Extracts

No response

Other Information

No response

AdrianGarside commented 6 months ago

Wow, tons of errors in this log. I do see that Home Assistant, having hit some errors, starts to shut down, and the new code that reports and gracefully handles this runs and closes the connection to the printer cleanly, without any errors:

2024-03-16 00:14:01.927 INFO (MainThread) [homeassistant.components.automation.automation_123] Notificación - Home Assistant Shutdown: Running automation actions
2024-03-16 00:14:01.927 INFO (MainThread) [homeassistant.components.automation.automation_123] Notificación - Home Assistant Shutdown: Executing step call service
2024-03-16 00:14:01.933 DEBUG (MainThread) [custom_components.bambu_lab] HOME ASSISTANT IS SHUTTING DOWN
2024-03-16 00:14:01.933 DEBUG (MainThread) [custom_components.bambu_lab.pybambu] Disconnect: Client Disconnecting
2024-03-16 00:14:06.726 DEBUG (Thread-6 (mqtt_listen_thread)) [custom_components.bambu_lab.pybambu] Ended listen loop.
2024-03-16 00:14:06.727 INFO (Thread-6 (mqtt_listen_thread)) [custom_components.bambu_lab.pybambu] MQTT listener thread exited.

After that point, tons of errors hit the Home Assistant threads, but eventually it shuts down. Prior to the shutdown I see widespread connectivity problems seemingly hitting HA and all your integrations.

Prior to that shutdown I see the last log from the integration was some 9 minutes earlier:

2024-03-16 00:05:04.477 INFO (MainThread) [homeassistant.components.automation.notificacion_bambu_lab_a1_print_progress] Notificación - Bambu Lab A1 Print Progress: Choose at step 3: choice 3: Running automation actions
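A gap like this can be measured directly from the log rather than by eye. The following is a minimal sketch, assuming the standard Home Assistant log prefix seen in the excerpts (timestamp, level, thread in parentheses, logger in square brackets); `parse_entries`, `last_before` and `SAMPLE` are illustrative names, and the sample holds only a few entries copied from this thread.

```python
import re
from datetime import datetime

# Prefix of a Home Assistant log entry: timestamp, level, (thread), [logger].
ENTRY = re.compile(
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) "
    r"(DEBUG|INFO|WARNING|ERROR|CRITICAL) "
    r"\((.+?)\) \[([^\]]+)\] "
)


def parse_entries(text):
    """Split raw log text into (time, level, thread, logger, message) tuples."""
    matches = list(ENTRY.finditer(text))
    entries = []
    for i, m in enumerate(matches):
        # The message runs until the next entry prefix (or the end of the text).
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        stamp = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S.%f")
        entries.append((stamp, m.group(2), m.group(3), m.group(4), text[m.end():end].strip()))
    return entries


def last_before(entries, logger_prefix, marker):
    """Return (last entry from logger_prefix before the marker entry, marker entry)."""
    marker_idx = next(i for i, e in enumerate(entries) if marker in e[4])
    earlier = [e for e in entries[:marker_idx] if e[3].startswith(logger_prefix)]
    return (earlier[-1] if earlier else None), entries[marker_idx]


# A few entries copied from this thread, run together as they appear in a paste.
SAMPLE = (
    "2024-03-16 00:04:59.462 WARNING (Thread-6 (mqtt_listen_thread)) "
    "[custom_components.bambu_lab.pybambu] On Disconnect: Disconnected from Broker: 16 "
    "2024-03-16 00:05:00.135 INFO (Thread-8) "
    "[custom_components.bambu_lab.pybambu] A1: Chamber image thread exited. "
    "2024-03-16 00:05:04.477 INFO (MainThread) "
    "[homeassistant.components.automation.notificacion_bambu_lab_a1_print_progress] "
    "Notificación - Bambu Lab A1 Print Progress: Choose at step 3: choice 3: Running automation actions "
    "2024-03-16 00:14:01.933 DEBUG (MainThread) "
    "[custom_components.bambu_lab] HOME ASSISTANT IS SHUTTING DOWN"
)

entries = parse_entries(SAMPLE)
last, shutdown = last_before(entries, "custom_components.bambu_lab", "SHUTTING DOWN")
print("last integration entry:", last[0], "| shutdown:", shutdown[0], "| gap:", shutdown[0] - last[0])
```

Because entries are split on the prefix rather than on newlines, the same function works on logs that were pasted as a single run-on line.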

And before that it seems like some automation kicked in right as the integration lost connection to the printer and a force refresh executed (this isn't something the integration ever does by itself):

2024-03-16 00:04:59.462 WARNING (Thread-6 (mqtt_listen_thread)) [custom_components.bambu_lab.pybambu] On Disconnect: Disconnected from Broker: 16
2024-03-16 00:04:59.462 DEBUG (Thread-6 (mqtt_listen_thread)) [custom_components.bambu_lab] Manually updated bambu_lab data
2024-03-16 00:04:59.463 DEBUG (Thread-6 (mqtt_listen_thread)) [custom_components.bambu_lab.pybambu] PROPERTYCALL: get_hms_errors
2024-03-16 00:04:59.472 INFO (Thread-7) [custom_components.bambu_lab.pybambu] Watchdog thread exited.
2024-03-16 00:04:59.473 INFO (MainThread) [homeassistant.components.automation.notificacion_bambu_lab_a1_print_progress] Notificación - Bambu Lab A1 Print Progress: Running automation actions
2024-03-16 00:04:59.474 INFO (MainThread) [homeassistant.components.automation.notificacion_bambu_lab_a1_print_progress] Notificación - Bambu Lab A1 Print Progress: Executing step call service
2024-03-16 00:04:59.475 DEBUG (MainThread) [custom_components.bambu_lab.pybambu] Force Refresh: Getting Version Info
2024-03-16 00:04:59.475 ERROR (MainThread) [custom_components.bambu_lab.pybambu] Failed to send message to topic device/03919A3A2200275/request
2024-03-16 00:04:59.475 DEBUG (MainThread) [custom_components.bambu_lab.pybambu] Force Refresh: Request Push All
2024-03-16 00:04:59.475 ERROR (MainThread) [custom_components.bambu_lab.pybambu] Failed to send message to topic device/03919A3A2200275/request
2024-03-16 00:04:59.476 INFO (MainThread) [homeassistant.components.automation.notificacion_bambu_lab_a1_print_progress] Notificación - Bambu Lab A1 Print Progress: Executing step delay 0:00:05
2024-03-16 00:05:00.084 INFO (MainThread) [homeassistant.components.automation.inmp441_matrix_on_off] WLED INMP441 Matrix - ON at evening and OFF at night: Running automation actions
2024-03-16 00:05:00.135 INFO (Thread-8) [custom_components.bambu_lab.pybambu] A1: Chamber image thread exited.
2024-03-16 00:05:00.135 WARNING (Thread-6 (mqtt_listen_thread)) [custom_components.bambu_lab.pybambu] On Disconnect: Disconnected from Broker: 16
2024-03-16 00:05:00.387 INFO (MainThread) [homeassistant.components.automation.automation_120] OctoPrint - Nozzle & Bed Temperature Target Alignment: Running automation actions
2024-03-16 00:05:04.477 INFO (MainThread) [homeassistant.components.automation.notificacion_bambu_lab_a1_print_progress] Notificación - Bambu Lab A1 Print Progress: Choose at step 3: choice 3: Running automation actions
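The two "Failed to send message" errors land about 13 ms after the disconnect, which is consistent with a request being published while there was no broker connection. As a rough illustration only, here is a sketch of a client wrapper that tracks connection state and declines to publish while disconnected. `PrinterLink` is a hypothetical stand-in, not the pybambu API; the serial and the message payload are placeholders, and only the `device/<serial>/request` topic shape is taken from the log.

```python
import json


class PrinterLink:
    """Hypothetical stand-in for an MQTT-backed printer client (not the pybambu API)."""

    def __init__(self, serial, send):
        self.serial = serial
        self._send = send  # callable(topic, payload) supplied by the caller
        self.connected = False
        self.last_disconnect_reason = None

    def on_connect(self):
        self.connected = True

    def on_disconnect(self, reason_code):
        self.connected = False
        self.last_disconnect_reason = reason_code

    def publish(self, message):
        """Send a request only while connected; return False instead of erroring."""
        if not self.connected:
            return False
        self._send(f"device/{self.serial}/request", json.dumps(message))
        return True


sent = []
link = PrinterLink("SERIAL123", lambda topic, payload: sent.append((topic, payload)))

link.on_connect()
first = link.publish({"command": "placeholder_refresh"})   # goes out

link.on_disconnect(16)                                      # same code as in the log
second = link.publish({"command": "placeholder_refresh"})  # skipped, nothing sent

print(first, second, len(sent))
```

An automation that presses a refresh button on every state change would hit the skipped path whenever the state change is itself the disconnect, which matches the ordering in the log.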

And it's from that point onwards that it looks like all of your integrations start to have connectivity issues, and there's no more Bambu Lab integration logging until the HA shutdown starts.

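Some questions:

  • What does your step delay automation do that seems to fire right as things go south?

  • And what might have triggered the forced refresh?

  • And do you know why HA was shutting down in this repro - my bet is still on some kind of attempt to recover from the broken state it's getting into.

TL;DR: In this log I see widespread connectivity issues for all your integrations starting around 12:05am, resulting in tons of errors and then HA attempting to shut down around 12:14am and seemingly succeeding.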

AdrianGarside commented 6 months ago

I have disabled my Node-RED MQTT broker node and enabled again your integration. Let's see what happens. I will keep you updated.

We're still seeing the bambu lab entities being mqtt auto discovered - what's causing that? I'd expect that to add quite a bit of extra (and probably pointless) load to your system.

jjvelar commented 6 months ago

Hi, thanks a lot for taking the time to go to the bottom of this. Much appreciated. Answers to your comments/questions below.

Wow, tons of errors in this log. I do see that Home Assistant, having hit some errors, starts to shut down, and the new code that reports and gracefully handles this runs and closes the connection to the printer cleanly, without any errors:

2024-03-16 00:05:04.477 INFO (MainThread) [homeassistant.components.automation.notificacion_bambu_lab_a1_print_progress] Notificación - Bambu Lab A1 Print Progress: Choose at step 3: choice 3: Running automation actions

-- That's an automation that is triggered by status changes of the print status and print_progress sensors and sends me a Telegram notification.

And before that it seems like some automation kicked in right as the integration lost connection to the printer and a force refresh executed (this isn't something the integration ever does by itself):

-- That same automation presses the force_refresh_data button to make sure I get updated information right before sending the Telegram notification. I have now disabled it.

2024-03-16 00:04:59.462 WARNING (Thread-6 (mqtt_listen_thread)) [custom_components.bambu_lab.pybambu] On Disconnect: Disconnected from Broker: 16
2024-03-16 00:04:59.462 DEBUG (Thread-6 (mqtt_listen_thread)) [custom_components.bambu_lab] Manually updated bambu_lab data
2024-03-16 00:04:59.463 DEBUG (Thread-6 (mqtt_listen_thread)) [custom_components.bambu_lab.pybambu] PROPERTYCALL: get_hms_errors
2024-03-16 00:04:59.472 INFO (Thread-7) [custom_components.bambu_lab.pybambu] Watchdog thread exited.
2024-03-16 00:04:59.473 INFO (MainThread) [homeassistant.components.automation.notificacion_bambu_lab_a1_print_progress] Notificación - Bambu Lab A1 Print Progress: Running automation actions
2024-03-16 00:04:59.474 INFO (MainThread) [homeassistant.components.automation.notificacion_bambu_lab_a1_print_progress] Notificación - Bambu Lab A1 Print Progress: Executing step call service
2024-03-16 00:04:59.475 DEBUG (MainThread) [custom_components.bambu_lab.pybambu] Force Refresh: Getting Version Info
2024-03-16 00:04:59.475 ERROR (MainThread) [custom_components.bambu_lab.pybambu] Failed to send message to topic device/03919A3A2200275/request
2024-03-16 00:04:59.475 DEBUG (MainThread) [custom_components.bambu_lab.pybambu] Force Refresh: Request Push All
2024-03-16 00:04:59.475 ERROR (MainThread) [custom_components.bambu_lab.pybambu] Failed to send message to topic device/03919A3A2200275/request
2024-03-16 00:04:59.476 INFO (MainThread) [homeassistant.components.automation.notificacion_bambu_lab_a1_print_progress] Notificación - Bambu Lab A1 Print Progress: Executing step delay 0:00:05
2024-03-16 00:05:00.084 INFO (MainThread) [homeassistant.components.automation.inmp441_matrix_on_off] WLED INMP441 Matrix - ON at evening and OFF at night: Running automation actions
2024-03-16 00:05:00.135 INFO (Thread-8) [custom_components.bambu_lab.pybambu] A1: Chamber image thread exited.
2024-03-16 00:05:00.135 WARNING (Thread-6 (mqtt_listen_thread)) [custom_components.bambu_lab.pybambu] On Disconnect: Disconnected from Broker: 16
2024-03-16 00:05:00.387 INFO (MainThread) [homeassistant.components.automation.automation_120] OctoPrint - Nozzle & Bed Temperature Target Alignment: Running automation actions
2024-03-16 00:05:04.477 INFO (MainThread) [homeassistant.components.automation.notificacion_bambu_lab_a1_print_progress] Notificación - Bambu Lab A1 Print Progress: Choose at step 3: choice 3: Running automation actions

And it's from that point onwards that it looks like all of your integrations start to have connectivity issues, and there's no more Bambu Lab integration logging until the HA shutdown starts.

Some questions:

  • What does your step delay automation do that seems to fire right as things go south?

-- I believe you are referring to the automation that notifies me via Telegram.

  • And what might have triggered the forced refresh?

-- That same automation, but I have disabled it now.

  • And do you know why HA was shutting down in this repro - my bet is still on some kind of attempt to recover from the broken state it's getting into.

-- I am afraid I don't know.

TL;DR: In this log I see widespread connectivity issues for all your integrations starting around 12:05am, resulting in tons of errors and then HA attempting to shut down around 12:14am and seemingly succeeding.

-- Yesterday I had some connectivity issues and had to reboot the router twice, but on the previous days, when I had the same "connection refused" error, the local network and router were OK.

jjvelar commented 6 months ago

I have disabled my Node-RED MQTT broker node and enabled again your integration. Let's see what happens. I will keep you updated.

We're still seeing the bambu lab entities being mqtt auto discovered - what's causing that? I'd expect that to add quite a bit of extra (and probably pointless) load to your system.

I really don't understand where they are coming from, and I'm unsure why they keep showing up, as I disabled both the one for the A1 and the one for the AMS. Anyway, I have now deleted them.

I will re-enable your integration after the changes I have done (deleting MQTT auto-discovered A1 and AMS devices, and removing the force_refresh_data button press) and let's see how it goes...

AdrianGarside commented 6 months ago

In the latest logs it appears that the shutdown completed but the preceding errors could mean it only mostly succeeded:

  File "/usr/local/lib/python3.12/asyncio/base_events.py", line 544, in _check_default_executor
    raise RuntimeError('Executor shutdown has been called')
RuntimeError: Executor shutdown has been called
2024-03-16 00:17:14.548 ERROR (MainThread) [custom_components.tapo_control] Executor shutdown has been called

The 'mostly' part might be that it didn't restart automatically. Or maybe that would always be the case for this abnormal shutdown.
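For context on that traceback: asyncio raises this error when something asks the event loop's default executor to run work after the loop has begun shutting that executor down. The snippet below is a minimal standalone reproduction using only the standard library (Python 3.9 or later). It is a sketch of the mechanism, not a claim about how Home Assistant or tapo_control reach that state.

```python
import asyncio


async def use_executor_after_shutdown():
    loop = asyncio.get_running_loop()
    # Mark the default executor as shut down, as happens during loop teardown.
    await loop.shutdown_default_executor()
    try:
        # Any later request to run blocking work in the default executor fails.
        await loop.run_in_executor(None, sum, [1, 2, 3])
    except RuntimeError as err:
        return str(err)
    return "no error"


message = asyncio.run(use_executor_after_shutdown())
print(message)
```

Seeing this from an integration during shutdown therefore suggests that the integration was still scheduling blocking calls after teardown had started.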

The force refresh button press isn't needed when you have the configuration running normally (haven't opted into the manual refresh mode). Removing it won't degrade your automation data quality.

jjvelar commented 6 months ago

In the latest logs it appears that the shutdown completed but the preceding errors could mean it only mostly succeeded:

  File "/usr/local/lib/python3.12/asyncio/base_events.py", line 544, in _check_default_executor
    raise RuntimeError('Executor shutdown has been called')
RuntimeError: Executor shutdown has been called
2024-03-16 00:17:14.548 ERROR (MainThread) [custom_components.tapo_control] Executor shutdown has been called

The 'mostly' part might be that it didn't restart automatically. Or maybe that would always be the case for this abnormal shutdown.

The force refresh button press isn't needed when you have the configuration running normally (haven't opted into the manual refresh mode). Removing it won't degrade your automation data quality.

all this "default_executor" sounds like Klingon to me :-)

AdrianGarside commented 6 months ago

Re-reading that part of the log I think my initial take was wrong - shutdown has started but I don't think it ever completes.

From the logs I can't see any evidence for the bambu integration causing either the shutdown to start or potentially causing it to fail to complete. But if it is indeed the case that this problem doesn't materialize without the integration installed (vs perhaps just hitting it sooner), then I'd love to know if the shutdown is happening regardless but it successfully auto-restarts and you'd never noticed.

AdrianGarside commented 6 months ago

I have disabled my Node-RED MQTT broker node and enabled again your integration. Let's see what happens. I will keep you updated.

We're still seeing the bambu lab entities being mqtt auto discovered - what's causing that? I'd expect that to add quite a bit of extra (and probably pointless) load to your system.

I really don't understand where they are coming from, and unsure why they keep showing up as I disabled both, the one for the A1 and the one for the AMS. Anyway, I have now deleted them.

I will re-enable your integration after the changes I have done (deleting MQTT auto-discovered A1 and AMS devices, and removing the force_refresh_data button press) and let's see how it goes...

They are from the node red integration. From the developer they will persist until deleted manually: "they can just delete in mqtt explorer everything under A1_A1 and AMS_0_A1"
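The reason they persist is that discovery messages are typically published as retained MQTT messages, and a broker keeps a retained message until an empty retained payload is published to the same topic. As a sketch of the cleanup outside MQTT Explorer, the helper below only builds the list of clearing messages; the topic names in `seen_topics` are made up for illustration (only the `A1_A1` and `AMS_0_A1` node names come from the quote above), and actually sending them is left to whichever MQTT client is in use.

```python
def retained_clear_messages(retained_topics, stale_roots):
    """Return (topic, payload, retain) tuples that would clear stale retained topics."""
    clears = []
    for topic in sorted(retained_topics):
        levels = topic.split("/")
        if any(root in levels for root in stale_roots):
            # An empty payload with retain=True removes the broker's retained copy.
            clears.append((topic, b"", True))
    return clears


# Hypothetical retained topics as they might appear on the broker.
seen_topics = [
    "homeassistant/sensor/A1_A1/nozzle_temperature/config",
    "homeassistant/sensor/AMS_0_A1/humidity/config",
    "homeassistant/sensor/other_device/temperature/config",
]

for topic, payload, retain in retained_clear_messages(seen_topics, ["A1_A1", "AMS_0_A1"]):
    print("would publish", repr(payload), "retain =", retain, "to", topic)
```

Matching on whole topic levels rather than substrings avoids clearing unrelated topics whose names merely contain `A1_A1`.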

At this point I don't think that's involved in your issues but it would be useful to do the cleanup to eliminate it as a possibility.

jjvelar commented 6 months ago

I would always have noticed, as I have an Uptime Kuma service on my NAS and a Healthcheck service for Home Assistant, and both notify me of any reboot or disconnection.

On Sat, 16 Mar 2024 at 20:32, AdrianGarside @.***> wrote:

Re-reading that part of the log I think my initial take was wrong - shutdown has started but I don't think it ever completes.

From the logs I can't see any evidence for the bambu integration causing either the shutdown to start or potentially causing it to fail to complete. But if it is indeed the case that this problem doesn't materialize without the integration installed (vs perhaps just hitting it sooner), then I'd love to know if the shutdown is happening regardless but it successfully auto-restarts and you'd never noticed.

jjvelar commented 6 months ago

This has already been done.

On Sat, 16 Mar 2024 at 20:40, AdrianGarside @.***> wrote:

I have disabled my Node-RED MQTT broker node and enabled again your integration. Let's see what happens. I will keep you updated.

We're still seeing the bambu lab entities being mqtt auto discovered - what's causing that? I'd expect that to add quite a bit of extra (and probably pointless) load to your system.

I really don't understand where they are coming from, and unsure why they keep showing up as I disabled both, the one for the A1 and the one for the AMS. Anyway, I have now deleted them.

I will re-enable your integration after the changes I have done (deleting MQTT auto-discovered A1 and AMS devices, and removing the force_refresh_data button press) and let's see how it goes...

They are from the node red integration. From the developer they will persist until deleted manually: "they can just delete in mqtt explorer everything under A1_A1 and AMS_0_A1"

At this point I don't think that's involved in your issues but it would be useful to do the cleanup to eliminate it as a possibility.

AdrianGarside commented 6 months ago

Comparing your shutdown to a clean shutdown locally, I do see that in your case the watchdog and chamber image threads never exit. Those exits are triggered by the MQTT client reacting to the MQTT disconnect via a callback that would run on the main thread:

Yours:

2024-03-16 00:14:01.933 DEBUG (MainThread) [custom_components.bambu_lab] HOME ASSISTANT IS SHUTTING DOWN
2024-03-16 00:14:01.933 DEBUG (MainThread) [custom_components.bambu_lab.pybambu] Disconnect: Client Disconnecting
2024-03-16 00:14:06.726 DEBUG (Thread-6 (mqtt_listen_thread)) [custom_components.bambu_lab.pybambu] Ended listen loop.
2024-03-16 00:14:06.727 INFO (Thread-6 (mqtt_listen_thread)) [custom_components.bambu_lab.pybambu] MQTT listener thread exited.

Mine (extra printer type logging added since I have 3):

2024-03-16 15:06:01.613 DEBUG (MainThread) [custom_components.bambu_lab] HOME ASSISTANT IS SHUTTING DOWN
2024-03-16 15:06:01.613 DEBUG (MainThread) [custom_components.bambu_lab.pybambu] P1S Disconnect: Client Disconnecting
2024-03-16 15:06:01.614 WARNING (MainThread) [custom_components.bambu_lab.pybambu] P1S On Disconnect: Disconnected from Broker: 0
2024-03-16 15:06:01.615 INFO (Thread-8) [custom_components.bambu_lab.pybambu] P1S Watchdog thread exited.
2024-03-16 15:06:01.630 INFO (Thread-9) [custom_components.bambu_lab.pybambu] P1S: Chamber image thread exited.
2024-03-16 15:06:01.632 WARNING (Thread-4 (mqtt_listen_thread)) [custom_components.bambu_lab.pybambu] P1S On Disconnect: Disconnected from Broker: 0
2024-03-16 15:06:01.632 DEBUG (Thread-4 (mqtt_listen_thread)) [custom_components.bambu_lab.pybambu] P1S Ended listen loop.
2024-03-16 15:06:01.632 INFO (Thread-4 (mqtt_listen_thread)) [custom_components.bambu_lab.pybambu] P1S MQTT listener thread exited.

So that suggests the callback into the main thread couldn't complete. The Main Thread isn't completely stuck as there's plenty of activity - but almost entirely errors.

Ah no, that extra activity already happened in your case before things went completely south so it couldn't repeat - there was no active connection to the printer at the point the shutdown occurred.

2024-03-16 00:04:59.462 WARNING (Thread-6 (mqtt_listen_thread)) [custom_components.bambu_lab.pybambu] On Disconnect: Disconnected from Broker: 16
2024-03-16 00:04:59.472 INFO (Thread-7) [custom_components.bambu_lab.pybambu] Watchdog thread exited.
2024-03-16 00:04:59.473 INFO (MainThread) [homeassistant.components.automation.notificacion_bambu_lab_a1_print_progress] Notificación - Bambu Lab A1 Print Progress: Running automation actions
2024-03-16 00:04:59.474 INFO (MainThread) [homeassistant.components.automation.notificacion_bambu_lab_a1_print_progress] Notificación - Bambu Lab A1 Print Progress: Executing step call service
2024-03-16 00:04:59.475 DEBUG (MainThread) [custom_components.bambu_lab.pybambu] Force Refresh: Getting Version Info
2024-03-16 00:04:59.475 ERROR (MainThread) [custom_components.bambu_lab.pybambu] Failed to send message to topic device/03919A3A2200275/request
2024-03-16 00:04:59.475 DEBUG (MainThread) [custom_components.bambu_lab.pybambu] Force Refresh: Request Push All
2024-03-16 00:04:59.475 ERROR (MainThread) [custom_components.bambu_lab.pybambu] Failed to send message to topic device/03919A3A2200275/request
2024-03-16 00:04:59.476 INFO (MainThread) [homeassistant.components.automation.notificacion_bambu_lab_a1_print_progress] Notificación - Bambu Lab A1 Print Progress: Executing step delay 0:00:05
2024-03-16 00:05:00.084 INFO (MainThread) [homeassistant.components.automation.inmp441_matrix_on_off] WLED INMP441 Matrix - ON at evening and OFF at night: Running automation actions
2024-03-16 00:05:00.135 INFO (Thread-8) [custom_components.bambu_lab.pybambu] A1: Chamber image thread exited.
2024-03-16 00:05:00.135 WARNING (Thread-6 (mqtt_listen_thread)) [custom_components.bambu_lab.pybambu] On Disconnect: Disconnected from Broker: 16
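
For anyone following along, the ordering in these log lines (disconnect callback first, then the watchdog and chamber image threads exiting) is what you would expect from a "signal the workers from the disconnect callback" design. Below is a minimal sketch of that pattern using only the Python standard library. It is not the integration's actual code: `WorkerSet`, `on_disconnect` and the thread names are hypothetical and chosen only to mirror the log messages.

```python
import threading


class WorkerSet:
    """Hypothetical stand-in for a client that owns background threads."""

    def __init__(self):
        self._stop = threading.Event()
        self._lock = threading.Lock()
        self.exited = []  # names of workers that left their loop
        self._threads = [
            threading.Thread(target=self._run, args=(name,), daemon=True)
            for name in ("watchdog", "chamber_image")
        ]

    def start(self):
        for t in self._threads:
            t.start()

    def _run(self, name):
        # Wake up periodically; leave as soon as the stop event is set.
        while not self._stop.wait(timeout=0.05):
            pass
        with self._lock:
            self.exited.append(name)

    def on_disconnect(self):
        # Assumed to be invoked from the MQTT client's disconnect callback.
        # If that callback never runs, the workers never exit.
        self._stop.set()

    def join(self, timeout=1.0):
        # Returns True only if every worker has finished.
        for t in self._threads:
            t.join(timeout)
        return all(not t.is_alive() for t in self._threads)
```

With this shape, a shutdown where `on_disconnect` is never called leaves both workers running, which is the symptom in the "Yours" shutdown log above.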

If your automation just does a force refresh button press, that wouldn't explain the initial disconnect. The force refresh attempt failed in this case because the connection to the printer had just dropped right before that automation fired.

An observation, no idea if related in any way, but all of this started just after a database backup completed:

2024-03-16 00:04:43.489 INFO (MainThread) [homeassistant.components.recorder.backup] Backup end notification, releasing write lock 2024-03-16 00:04:43.489 INFO (Recorder) [homeassistant.components.recorder.core] Database queue backlog reached 2428 entries during backup
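
If you want to repeat the shutdown comparison from earlier in this comment on your own logs, something like the following rough sketch may help. It lists every "thread exited" message logged after the shutdown marker. The line format in the regex is assumed from the excerpts pasted in this thread, and `exits_after_shutdown` is a hypothetical helper, not part of the integration.

```python
import re

# Assumed line shape, based on the excerpts in this thread:
# "<date> <time> <LEVEL> (<thread>) [<logger>] <message>"
LINE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) "
    r"(\w+) \((.+?)\) \[([^\]]+)\] (.*)$"
)

SHUTDOWN_MARKER = "HOME ASSISTANT IS SHUTTING DOWN"


def exits_after_shutdown(lines):
    """Return (thread, message) pairs for 'thread exited' messages
    that appear after the shutdown marker."""
    seen_shutdown = False
    exits = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # skip tracebacks and other unparsed lines
        _ts, _level, thread, _logger, message = m.groups()
        if SHUTDOWN_MARKER in message:
            seen_shutdown = True
        elif seen_shutdown and "thread exited" in message:
            exits.append((thread, message))
    return exits
```

Usage would be along the lines of `exits_after_shutdown(open("home-assistant.log", encoding="utf-8"))`. On a clean shutdown I'd expect the watchdog, chamber image and MQTT listener threads to all show up in the result.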

AdrianGarside commented 6 months ago

So I'm still stumped. I can't find anything that suggests the integration is involved in causing the connectivity problems or in causing HA to start shutting down. There's also no evidence that the integration fails to fully clean up when that shutdown occurs, and no evidence that it's preventing the shutdown from completing.

AdrianGarside commented 5 months ago

Please try this fix: https://github.com/greghesp/ha-bambulab/releases/tag/v2.0.16-dev4

AdrianGarside commented 5 months ago

Closing this as resolved. Please open a new issue (with debug logs) if you still hit any further instability, so I can keep repros from before this fix separate from those after it.