thingsboard / thingsboard-edge

[Bug] After migrating from v3.5.1.1 to 3.6.0 edge fails to start #77

Open AndreMaz opened 10 months ago

AndreMaz commented 10 months ago

Describe the bug

After migrating TB-Cloud to v3.6.0 (went fine, no errors in the logs) and then migrating TB-Edge to v3.6.0, I started to see the following error in the logs:

 2023-09-24 14:04:05,945 [SpringApplicationShutdownHook] INFO  o.t.s.s.q.DefaultTbRuleEngineConsumerService - [SequentialByOriginator] Removing consumer for topic: TopicPartitionInfo(topic=tb_rule_engine.sq, tenantId=Optional[13814000-1dd2-11b2-8080], partition=Optional[3], fullTopicName=tb_rule_engine.sq.3, myPartition=true)
Exception in thread "ts-service-ts-callback-25-thread-1" java.util.concurrent.RejectedExecutionException
    at java.base/java.util.concurrent.ForkJoinPool.externalPush(ForkJoinPool.java:1880)
    at java.base/java.util.concurrent.ForkJoinPool.externalSubmit(ForkJoinPool.java:1921)
    at java.base/java.util.concurrent.ForkJoinPool.execute(ForkJoinPool.java:2453)
    at org.thingsboard.server.actors.TbActorMailbox.tryProcessQueue(TbActorMailbox.java:150)
    at org.thingsboard.server.actors.TbActorMailbox.enqueue(TbActorMailbox.java:128)
    at org.thingsboard.server.actors.TbActorMailbox.tell(TbActorMailbox.java:265)
    at org.thingsboard.server.actors.ruleChain.DefaultTbContext.tellNext(DefaultTbContext.java:193)
    at org.thingsboard.server.actors.ruleChain.DefaultTbContext.tellSuccess(DefaultTbContext.java:175)
    at org.thingsboard.rule.engine.telemetry.TelemetryNodeCallback.onSuccess(TelemetryNodeCallback.java:50)
    at org.thingsboard.rule.engine.telemetry.TelemetryNodeCallback.onSuccess(TelemetryNodeCallback.java:43)
    at org.thingsboard.server.service.telemetry.DefaultTelemetrySubscriptionService$4.onSuccess(DefaultTelemetrySubscriptionService.java:441)
    at com.google.common.util.concurrent.Futures$CallbackListener.run(Futures.java:1138)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base/java.lang.Thread.run(Thread.java:829)
2023-09-24 14:04:07,463 [SpringApplicationShutdownHook] INFO  o.t.s.a.service.DefaultActorService - Actor system stopped.
2023-09-24 14:04:07,498 [sql-queue-0-cloud events-23-thread-1] INFO  o.t.s.dao.sql.TbSqlBlockingQueue - [Cloud Events] Queue polling was interrupted

The migration logs show that everything is fine:

Starting ThingsBoard Edge upgrade ...
  ______    __      _                              ____                               __
 /_  __/   / /_    (_)   ____    ____ _   _____   / __ )  ____   ____ _   _____  ____/ /
  / /     / __ \  / /   / __ \  / __ `/  / ___/  / __  | / __ \ / __ `/  / ___/ / __  /
 / /     / / / / / /   / / / / / /_/ /  (__  )  / /_/ / / /_/ // /_/ /  / /    / /_/ /
/_/     /_/ /_/ /_/   /_/ /_/  \__, /  /____/  /_____/  \____/ \__,_/  /_/     \__,_/
                              /____/

 ===================================================
 :: ThingsBoard Edge PE ::       (v3.6.0EDGEPE)
 ===================================================

Starting ThingsBoard Edge Upgrade from version 3.5.1 ...
Upgrading ThingsBoard from version 3.5.1 to 3.6.0 ...
Updating schema ...
relation "idx_edge_event_id" already exists, skipping
Schema updated to version 3.6.0.
Updating data from version 3.5.1 to 3.6.0 ...
Integration rate limits updater: 0 total entities updated.
Starting edge events migration - adding seq_id column. Can be skipped with TB_SKIP_EDGE_EVENTS_MIGRATION env variable set to true
Tenants edge full sync required updater: 1 total entities updated.
Updating schema ...
relation "entity_group" already exists, skipping
relation "converter" already exists, skipping
relation "integration" already exists, skipping
relation "scheduler_event" already exists, skipping
relation "blob_entity" already exists, skipping
relation "role" already exists, skipping
relation "group_permission" already exists, skipping
relation "device_group_ota_package" already exists, skipping
relation "converter_debug_event" already exists, skipping
relation "integration_debug_event" already exists, skipping
relation "raw_data_event" already exists, skipping
relation "white_labeling" already exists, skipping
relation "idx_entity_group_by_type_name_and_owner_id" already exists, skipping
relation "idx_converter_external_id" already exists, skipping
relation "idx_integration_external_id" already exists, skipping
relation "idx_role_external_id" already exists, skipping
relation "idx_entity_group_external_id" already exists, skipping
relation "idx_converter_debug_event_main" already exists, skipping
relation "idx_integration_debug_event_main" already exists, skipping
relation "idx_raw_data_event_main" already exists, skipping
Schema updated.
Installing SQL DataBase schema views and functions: schema-views-and-functions.sql
Successfully executed query: DROP VIEW IF EXISTS device_info_view CASCADE;
Successfully executed query: CREATE OR REPLACE VIEW device_info_view AS SELECT * FROM device_info_active_attribute_view;
Updating data ...
Upgrade finished successfully!

volodymyr-babak commented 10 months ago

@AndreMaz

Hello, could you please attach the complete TB Edge log file, starting from application startup? Additionally, could you please check the PostgreSQL logs for any errors?

AndreMaz commented 10 months ago

Hi @volodymyr-babak, sorry for the delay.

Checked the logs again and found this:

2023-09-27 08:25:40,443 [grpc-default-executor-0] ERROR o.t.license.client.TbLicenseClient - License Error: ACTIVE_INSTANCES_CAPACITY_EXCEEDED(104) - Active instances capacity exceeded!
2023-09-27 08:25:40,443 [grpc-default-executor-0] ERROR o.t.license.client.TbLicenseClient - Failed to initialize ThingsBoard License Client!
2023-09-27 08:25:40,448 [grpc-default-executor-0] ERROR o.t.s.d.s.BasicSubscriptionService - Failed to init license client
org.thingsboard.license.shared.exception.LicenseException: Active instances capacity exceeded!
2023-09-27 08:25:40,450 [Shutdown Thread] INFO  o.t.s.d.s.BasicSubscriptionService - Terminating application due to critical License Error ACTIVE_INSTANCES_CAPACITY_EXCEEDED(104), exit code [-1]...
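
For anyone hitting the same symptom: the `RejectedExecutionException` in the first trace is only shutdown noise, and the real cause shows up as ERROR lines much earlier in the log. A minimal sketch for pulling those lines out of a long log file, assuming the log layout seen in this thread (`timestamp [thread] LEVEL logger - message`). The `find_errors` helper and the regular expression are illustrative only and are not part of ThingsBoard.

```python
import re

# Assumed layout, taken from the log lines quoted in this thread:
# "2023-09-27 08:25:40,443 [thread-name] ERROR logger.name - message"
LOG_LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"\[(?P<thread>[^\]]+)\]\s+(?P<level>[A-Z]+)\s+"
    r"(?P<logger>\S+) - (?P<msg>.*)$"
)


def find_errors(log_text):
    """Return (timestamp, logger, message) for every ERROR-level line."""
    errors = []
    for line in log_text.splitlines():
        match = LOG_LINE.match(line.strip())
        # Stack-trace and exception lines do not match the layout and are skipped.
        if match and match.group("level") == "ERROR":
            errors.append(
                (match.group("ts"), match.group("logger"), match.group("msg"))
            )
    return errors


# Sample input copied from the log excerpt in this comment.
sample = """\
2023-09-27 08:25:40,443 [grpc-default-executor-0] ERROR o.t.license.client.TbLicenseClient - License Error: ACTIVE_INSTANCES_CAPACITY_EXCEEDED(104) - Active instances capacity exceeded!
2023-09-27 08:25:40,443 [grpc-default-executor-0] ERROR o.t.license.client.TbLicenseClient - Failed to initialize ThingsBoard License Client!
2023-09-27 08:25:40,448 [grpc-default-executor-0] ERROR o.t.s.d.s.BasicSubscriptionService - Failed to init license client
org.thingsboard.license.shared.exception.LicenseException: Active instances capacity exceeded!
2023-09-27 08:25:40,450 [Shutdown Thread] INFO  o.t.s.d.s.BasicSubscriptionService - Terminating application due to critical License Error ACTIVE_INSTANCES_CAPACITY_EXCEEDED(104), exit code [-1]...
"""

for ts, logger, msg in find_errors(sample):
    print(ts, logger, msg)
```

To run it against a real file, read the file contents into a string and pass that to `find_errors`; the log file path depends on the installation and is not shown here.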

The weird part is that I literally only have one TB-Edge instance, so I don't really get how I can exceed the active instances capacity.

In an attempt to fix this, I deactivated the previous instance, removed the instance-edge-license.data file, and then started tb-edge again. It created a new instance (shown as active in the image below).

image

But TB still complains about exceeding the capacity.

volodymyr-babak commented 10 months ago

@AndreMaz,

Do you have an account on https://thingsboard-portal.atlassian.net/servicedesk/customer/portals? If so, could you please create a ticket in the TB Service Desk system so we can continue our discussion there? I would like to obtain your license and possibly other information to troubleshoot this problem.

AndreMaz commented 10 months ago

Yep, I have.

Which topic should I choose? Tech Support? image

volodymyr-babak commented 10 months ago

Yes, Tech Support should be fine.

AndreMaz commented 10 months ago

Done, it's CP-10857.