eclipse-kura / kura

Eclipse Kura™ is a versatile framework to supercharge your edge devices, streamlining the process of configuring your gateway, connecting sensors, and IoT devices to seamlessly collect, process, and send data to the cloud.
https://eclipse.dev/kura/
Eclipse Public License 2.0
500 stars 309 forks source link

DATA-LIMIT-001 fails #2009

Closed ctron closed 6 years ago

ctron commented 6 years ago

Setting the period to 1000 ms causes the publisher to spit out messages way faster:

2018-04-03 14:46:04,226 [pool-11-thread-1] INFO  o.e.k.c.d.DataServiceImpl - Storing message on topic :#account-name/#client-id/EXAMPLE_PUBLISHER/data/metrics, priority: 5
2018-04-03 14:46:04,231 [pool-11-thread-1] INFO  o.e.k.c.d.DataServiceImpl - Stored message on topic :#account-name/#client-id/EXAMPLE_PUBLISHER/data/metrics, priority: 5
2018-04-03 14:46:04,231 [pool-11-thread-1] INFO  o.e.k.e.p.ExamplePublisher - Published to data/metrics message: org.eclipse.kura.message.KuraPayload@1e142d6 with ID: 2236
2018-04-03 14:46:04,234 [DataServiceImpl:Submit] INFO  o.e.k.c.d.t.m.MqttDataTransport - Publishing message on topic: account-name/B8:27:EB:97:D8:D5/EXAMPLE_PUBLISHER/data/metrics with QoS: 0
2018-04-03 14:46:04,239 [DataServiceImpl:Submit] INFO  o.e.k.e.p.ExamplePublisher - Published message with ID: 2236 on application topic: data/metrics
2018-04-03 14:46:04,240 [DataServiceImpl:Submit] INFO  o.e.k.e.p.ExamplePublisher - Published message with ID: 2236 on application topic: data/metrics
2018-04-03 14:46:04,240 [DataServiceImpl:Submit] INFO  o.e.k.e.p.ExamplePublisher - Published message with ID: 2236 on application topic: data/metrics
2018-04-03 14:46:04,240 [DataServiceImpl:Submit] INFO  o.e.k.e.p.ExamplePublisher - Published message with ID: 2236 on application topic: data/metrics
2018-04-03 14:46:04,241 [DataServiceImpl:Submit] INFO  o.e.k.e.p.ExamplePublisher - Published message with ID: 2236 on application topic: data/metrics
2018-04-03 14:46:04,242 [DataServiceImpl:Submit] INFO  o.e.k.e.p.ExamplePublisher - Published message with ID: 2236 on application topic: data/metrics
2018-04-03 14:46:04,274 [pool-12-thread-1] INFO  o.e.k.c.d.DataServiceImpl - Storing message on topic :#account-name/#client-id/EXAMPLE_PUBLISHER/data/metrics, priority: 5
2018-04-03 14:46:04,278 [pool-12-thread-1] INFO  o.e.k.c.d.DataServiceImpl - Stored message on topic :#account-name/#client-id/EXAMPLE_PUBLISHER/data/metrics, priority: 5
2018-04-03 14:46:04,279 [pool-12-thread-1] INFO  o.e.k.e.p.ExamplePublisher - Published to data/metrics message: org.eclipse.kura.message.KuraPayload@1782dc1 with ID: 2237
2018-04-03 14:46:04,281 [DataServiceImpl:Submit] INFO  o.e.k.c.d.t.m.MqttDataTransport - Publishing message on topic: account-name/B8:27:EB:97:D8:D5/EXAMPLE_PUBLISHER/data/metrics with QoS: 0
2018-04-03 14:46:04,285 [DataServiceImpl:Submit] INFO  o.e.k.e.p.ExamplePublisher - Published message with ID: 2237 on application topic: data/metrics
2018-04-03 14:46:04,286 [DataServiceImpl:Submit] INFO  o.e.k.e.p.ExamplePublisher - Published message with ID: 2237 on application topic: data/metrics
2018-04-03 14:46:04,286 [DataServiceImpl:Submit] INFO  o.e.k.e.p.ExamplePublisher - Published message with ID: 2237 on application topic: data/metrics
2018-04-03 14:46:04,286 [DataServiceImpl:Submit] INFO  o.e.k.e.p.ExamplePublisher - Published message with ID: 2237 on application topic: data/metrics
2018-04-03 14:46:04,287 [DataServiceImpl:Submit] INFO  o.e.k.e.p.ExamplePublisher - Published message with ID: 2237 on application topic: data/metrics
2018-04-03 14:46:04,288 [DataServiceImpl:Submit] INFO  o.e.k.e.p.ExamplePublisher - Published message with ID: 2237 on application topic: data/metrics
2018-04-03 14:46:04,337 [pool-13-thread-1] INFO  o.e.k.c.d.DataServiceImpl - Storing message on topic :#account-name/#client-id/EXAMPLE_PUBLISHER/data/metrics, priority: 5
2018-04-03 14:46:04,340 [pool-13-thread-1] INFO  o.e.k.c.d.DataServiceImpl - Stored message on topic :#account-name/#client-id/EXAMPLE_PUBLISHER/data/metrics, priority: 5
2018-04-03 14:46:04,341 [pool-13-thread-1] INFO  o.e.k.e.p.ExamplePublisher - Published to data/metrics message: org.eclipse.kura.message.KuraPayload@a45855 with ID: 2238
2018-04-03 14:46:04,344 [DataServiceImpl:Submit] INFO  o.e.k.c.d.t.m.MqttDataTransport - Publishing message on topic: account-name/B8:27:EB:97:D8:D5/EXAMPLE_PUBLISHER/data/metrics with QoS: 0
2018-04-03 14:46:04,347 [DataServiceImpl:Submit] INFO  o.e.k.e.p.ExamplePublisher - Published message with ID: 2238 on application topic: data/metrics
2018-04-03 14:46:04,348 [DataServiceImpl:Submit] INFO  o.e.k.e.p.ExamplePublisher - Published message with ID: 2238 on application topic: data/metrics
2018-04-03 14:46:04,348 [DataServiceImpl:Submit] INFO  o.e.k.e.p.ExamplePublisher - Published message with ID: 2238 on application topic: data/metrics
2018-04-03 14:46:04,349 [DataServiceImpl:Submit] INFO  o.e.k.e.p.ExamplePublisher - Published message with ID: 2238 on application topic: data/metrics
2018-04-03 14:46:04,349 [DataServiceImpl:Submit] INFO  o.e.k.e.p.ExamplePublisher - Published message with ID: 2238 on application topic: data/metrics
2018-04-03 14:46:04,350 [DataServiceImpl:Submit] INFO  o.e.k.e.p.ExamplePublisher - Published message with ID: 2238 on application topic: data/metrics
cdealti commented 6 years ago

@ctron what was the ExamplePublisher rate before setting it back to 1000ms? I don't see anything in the code explaining this behavior.

ctron commented 6 years ago

I think it was 1500ms

cdealti commented 6 years ago

@ctron BTW, did you forget to mention that you have created four configurations of ExamplePublisher with the same EXAMPLE_PUBLISHER app ID?

At least this is my guess looking at the short log excerpt above? The full log should show the log portion above repeated every second.

ctron commented 6 years ago

No, I only created one. But I did change the app ID a few times. Maybe there is a cleanup issue in Kura.

cdealti commented 6 years ago

@markoer, can you please recreate this?

markoer commented 6 years ago

@cdealti I sent you the logs on 22 March.

cdealti commented 6 years ago

@markoer OK, we didn't look at them. Do you have an idea of what's going on? Try to recreate this again and:

  1. From the OSGi console inspect the number of component configurations of ExamplePublisher (there should be only one)
  2. Using kill -3 <pidof Java> check the number of threads running the ExamplePublisher. The output of the command goes to kura-console.log (there should be only one thread).

I expect that for some reason the ExamplePublisher thread is not properly terminated and we end up with many of them.

markoer commented 6 years ago
osgi> comp 82
        Component[
        name = org.eclipse.kura.example.publisher.ExamplePublisher
        activate = activate
        deactivate = deactivate
        modified = updated
        configuration-policy = require
        factory = null
        autoenable = true
        immediate = true
        implementation = org.eclipse.kura.example.publisher.ExamplePublisher
        state = Unsatisfied
        properties = {service.pid=org.eclipse.kura.example.publisher.ExamplePublisher}
        serviceFactory = false
        serviceInterface = [org.eclipse.kura.configuration.ConfigurableComponent]
        references = null
        located in bundle = org.eclipse.kura.example.publisher_1.0.300 [137]
]
Dynamic information :
  The component is satisfied
  All component references are satisfied
  Component configurations :
    Configuration properties:
      metric.char = a
      metric.byte = 119
      metric.short = 32759
      publish.appTopic = data/metrics
      metric.integer = 2147483599
      component.id = 61
      publish.rate = 1000
      subscribe.topic = inbound/#
      metric.string.array = String[string.value1,string.value2,string.value3]
      metric.string.oneof = string.value.option.1
      cloud.service.pid = org.eclipse.kura.cloud.CloudService
      component.name = org.eclipse.kura.example.publisher.ExamplePublisher
      metric.temperature.initial = 10.0
      metric.short.array = Object[10,20,30]
      objectClass = String[org.eclipse.kura.configuration.ConfigurableComponent]
      metric.double = 4.29496729599998E9
      publish.retain = false
      metric.temperature.increment = 0.1
      metric.string = string.value
      metric.password = xea2sebrvKJQEW1YRDEEGg==
      service.factoryPid = org.eclipse.kura.example.publisher.ExamplePublisher
      publish.qos = 0
      service.pid = org.eclipse.kura.example.publisher.ExamplePublisher-1522915095875-1
      app.id = EXAMPLE_PUBLISHER
      metric.boolean = false
      kura.service.pid = expub
      metric.float = 32766.98
      metric.integer.fixed = 101
      metric.long = 9223372036854774999
    Instances:
      org.eclipse.equinox.internal.ds.impl.ComponentInstanceImpl@1659e05

In my latest testing 7 threads are sending messages:

2018-04-05 09:58:28,953 [pool-7-thread-1] INFO  o.e.k.e.p.ExamplePublisher - Published to data/metrics message: org.eclipse.kura.message.KuraPayload@1ea972a with ID: 17
2018-04-05 09:58:28,995 [pool-8-thread-1] INFO  o.e.k.e.p.ExamplePublisher - Published to data/metrics message: org.eclipse.kura.message.KuraPayload@1dfa4db with ID: 18
2018-04-05 09:58:29,028 [pool-9-thread-1] INFO  o.e.k.e.p.ExamplePublisher - Published to data/metrics message: org.eclipse.kura.message.KuraPayload@9eadf3 with ID: 19
2018-04-05 09:58:29,061 [pool-10-thread-1] INFO  o.e.k.e.p.ExamplePublisher - Published to data/metrics message: org.eclipse.kura.message.KuraPayload@1e3edb4 with ID: 20
2018-04-05 09:58:29,084 [pool-11-thread-1] INFO  o.e.k.e.p.ExamplePublisher - Published to data/metrics message: org.eclipse.kura.message.KuraPayload@6036c3 with ID: 21
2018-04-05 09:58:29,120 [pool-12-thread-1] INFO  o.e.k.e.p.ExamplePublisher - Published to data/metrics message: org.eclipse.kura.message.KuraPayload@f17ce5 with ID: 22
2018-04-05 09:58:29,441 [pool-23-thread-1] INFO  o.e.k.e.p.ExamplePublisher - Published to data/metrics message: org.eclipse.kura.message.KuraPayload@14b8646 with ID: 23

All of them can be found in kura-consol.log:

"pool-7-thread-1" #43 prio=5 os_prio=0 tid=0x00e87800 nid=0x3e0 waiting on condition [0x4f5dd000]
   java.lang.Thread.State: TIMED_WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x607dda18> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
        at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1093)
        at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:809)
        at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1067)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)

Procedure followed to reproduce:

I can see from the log that ExamplePublisher's activate() is called 8 times, its configuration is removed from configuration service once and updated once (I'm attaching the logs in a separate mail).

ctron commented 6 years ago

What you reported exactly matches what I experienced as well.

I do think there are two issues here. One is the fact that Kura, during the startup, re-activates components. I have seen this with Kura 3.1 already.

The other one is that the example publisher doesn't properly clean up after itself. So it keeps running although being deactivated.

I do think that the threads are actually irrelevant, as a scheduled executor is being used. Which has a thread pool (or a single thread) and might share that thread pool between different jobs. So 10 jobs might still be running on a single thread.

cdealti commented 6 years ago

@markoer

I'm attaching the logs in a separate mail You can drag and drop a zip file with the logs here in the comment ;-).

Is there any exception in kura-console.log?

I do think there are two issues here. One is the fact that Kura, during the startup, re-activates components. I have seen this with Kura 3.1 already.

We all know there are unnecessary activations/updates. However the application must be able to perform the necessary cleanup. Also note that the component deactivation is logged with DEBUG level so activation and deactivation log lines are not balanced.

Marko, can you please debug this problem and provide a fix?

markoer commented 6 years ago

You can drag and drop a zip file with the logs here in the comment ;-).

In this case, here they are. kura-console_2009.log log_2009.txt

There are plenty of exceptions there.

I will check if I can find something.

cdealti commented 6 years ago

I believe that exception will shed some light on this issue and all the others involving the ExamplePublisher.

markoer commented 6 years ago

@cdealti Good hint. It seems that when component activation failed with an exception, framework tried to activate it again and again until it gave up with 6-7 publishers running.

MMaiero commented 6 years ago

It seems to be verified on the supported platforms. Let's close for now