eclipse-leshan / leshan

Java Library for LWM2M
https://www.eclipse.org/leshan/
BSD 3-Clause "New" or "Revised" License

Congestion control for LWM2M request received #827

Closed madhushreegc closed 4 years ago

madhushreegc commented 4 years ago

Hi,

Leshan version M10

I want to limit the number of requests received on the LWM2M server side.

Is there any feature that Leshan exposes to control the number of requests received?

For example, the LWM2M server should accept only 10 registration requests per second; the remaining requests should be discarded.

BR, Madhu.

sbernard31 commented 4 years ago

There is nothing out of the box to do that.

On the CoAP/Californium side:

Leshan is based on Californium (a CoAP implementation). I know that CoAP should support some basic congestion control (see RFC 7252 §4.7, Congestion Control), but this is not exactly what you described. Regarding CoAP congestion control, I know there is experimental code (not activated by default) for it in Californium (look at CongestionControlLayer). @boaks, do you know the state of this code/feature in Californium?

On the LWM2M/Leshan side:

For example, the LWM2M server should accept only 10 registration requests per second; the remaining requests should be discarded.

You could implement your own Authorizer and stop authorizing devices if the Authorizer is called more than 10 times per second. This will not discard the request, but will answer with a forbidden response. (You could also have a look at the default implementation, DefaultAuthorizer.)

(Keep in mind this is at the LWM2M level, so it will not prevent any DTLS handshake.)
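
The "more than 10 times per second" check could be sketched as a small fixed-window counter. This is only an illustration: `FixedWindowRateLimiter` is a hypothetical helper, not part of Leshan or Californium, and the wiring into a custom Authorizer is left out because the Authorizer method signature depends on the Leshan version. The idea is that the Authorizer would call `tryAcquire()` for each registration request and refuse to authorize when it returns false.

```java
import java.util.concurrent.TimeUnit;

/**
 * Fixed-window counter: allows at most maxPerWindow calls per window.
 * Hypothetical helper, not part of Leshan or Californium.
 */
final class FixedWindowRateLimiter {
    private final int maxPerWindow;
    private final long windowNanos;
    private long windowStart;
    private int count;
    private boolean started;

    FixedWindowRateLimiter(int maxPerWindow, long window, TimeUnit unit) {
        if (maxPerWindow <= 0 || window <= 0) {
            throw new IllegalArgumentException("limit and window must be positive");
        }
        this.maxPerWindow = maxPerWindow;
        this.windowNanos = unit.toNanos(window);
    }

    /** Returns true if the caller is still within the limit for the current window. */
    boolean tryAcquire() {
        return tryAcquire(System.nanoTime());
    }

    /** Same as tryAcquire(), with an explicit clock value so it can be tested. */
    synchronized boolean tryAcquire(long nowNanos) {
        // Open a new window on the first call or once the current one has elapsed.
        if (!started || nowNanos - windowStart >= windowNanos) {
            started = true;
            windowStart = nowNanos;
            count = 0;
        }
        if (count < maxPerWindow) {
            count++;
            return true;
        }
        return false;
    }
}
```

For the limit discussed here, the limiter would be created as `new FixedWindowRateLimiter(10, 1, TimeUnit.SECONDS)`. A fixed window can let through up to twice the limit across a window boundary; a sliding window or token bucket would be stricter.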

HTH

boaks commented 4 years ago

CoAP's "Congestion Control" is a "cooperative" one: it describes how compliant clients and servers should act (e.g. increasing the backoff time on failure). AFAIK, the implementation in Californium still has some multi-threading issues (see "Congestion Control not working for Server").

I would also distinguish between limits on devices (maybe for commercial reasons) and limits caused by the server's infrastructure. This issue seems to target the first one.

For example, the LWM2M server should accept only 10 registration requests per second; the remaining requests should be discarded.

Detecting that a request is a "registration" would at least require limiting requests per resource. The only functionality that may be used for that is Californium's MessageInterceptor, but I don't know of an "out of the box" solution; it's just the base on which to build your custom solution.

For the other limit, the server's infrastructure, it's important to keep the heap size, the number of messages, and the exchange lifetime in balance, but that is specific to every system/installation. Usually the default exchange_lifetime of 247s requires too much heap; 60s will mostly work similarly well and uses only about 25% of the heap needed for 247s. But just find your own best match.
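
The "about 25%" figure can be reproduced with a rough estimate, assuming the number of exchanges kept in memory at steady state is roughly the request rate multiplied by the exchange lifetime. The class below and the 100 requests per second used in the usage note are hypothetical and only illustrate the arithmetic; real heap usage also depends on message sizes and other settings.

```java
/** Back-of-the-envelope estimate; hypothetical helper, not a Californium API. */
final class ExchangeLifetimeEstimate {

    /** Approximate number of exchanges retained at steady state (rate x lifetime). */
    static long retainedExchanges(double requestsPerSecond, double lifetimeSeconds) {
        return Math.round(requestsPerSecond * lifetimeSeconds);
    }

    /** Fraction of retained exchanges when the lifetime is shortened. */
    static double ratio(double shorterLifetimeSeconds, double longerLifetimeSeconds) {
        return shorterLifetimeSeconds / longerLifetimeSeconds;
    }
}
```

At an assumed 100 requests per second, `retainedExchanges(100, 247)` gives 24700 entries and `retainedExchanges(100, 60)` gives 6000, and `ratio(60, 247)` is a little over 0.24, which matches the rough 25% above.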

madhushreegc commented 4 years ago

When sending more than 6 registration requests at once, I get the error below and a restart of the server is required.

03:44:18,843 ERROR [SendableResponse] Exception while calling the reponse sent callback java.lang.NullPointerException: null

My PROTOCOL_STAGE_THREAD_COUNT is set to 4.

I want to avoid this.

Thank you Simon and Boaks for your inputs.

boaks commented 4 years ago

I would guess this is more a multi-threading issue than a congestion issue. Maybe @sbernard31 can spend some time investigating the scope/root cause of that.

sbernard31 commented 4 years ago

I agree with @boaks, I would not guess at a congestion issue here. I am not even sure this is a multi-threading issue.

Please share the full stacktrace. This exception happens when there is an exception raised in "sent" callback of SendableResponse. So 2 solutions :

The stacktrace should give some hints.

madhushreegc commented 4 years ago

The NPE is in my RegistrationListener, but the server stops processing because of the low number of threads.

14:06:39,843 DEBUG [BaseMatcher] registering observe request CON-GET MID= -1, Token=null, OptionSet={"Observe":0, "Uri-Path":"26247", "Accept":"application/vnd.oma.lwm2m+json"}, no payload
14:06:39,843 ERROR [SendableResponse] Exception while calling the reponse sent callback
java.lang.NullPointerException: null
    at com.iot.dc.lwm2m.south.service.leshan.listener.CustomRegistrationListenerImpl.registered(CustomRegistrationListenerImpl.java:138) ~[classes:?]
    at org.eclipse.leshan.server.impl.RegistrationServiceImpl.fireRegistered(RegistrationServiceImpl.java:80) ~[leshan-server-core-1.0.0-M10.jar:?]
    at org.eclipse.leshan.server.registration.RegistrationHandler$1.run(RegistrationHandler.java:84) ~[leshan-server-core-1.0.0-M10.jar:?]
    at org.eclipse.leshan.core.response.SendableResponse.sent(SendableResponse.java:47) [leshan-core-1.0.0-M10.jar:?]
    at org.eclipse.leshan.server.californium.impl.RegisterResource.handleRegister(RegisterResource.java:200) [leshan-server-cf-1.0.0-M10.jar:?]
    at org.eclipse.leshan.server.californium.impl.RegisterResource.handlePOST(RegisterResource.java:117) [leshan-server-cf-1.0.0-M10.jar:?]
    at org.eclipse.californium.core.CoapResource.handleRequest(CoapResource.java:219) [californium-core-2.0.0-M12.jar:?]
    at org.eclipse.leshan.server.californium.impl.RegisterResource.handleRequest(RegisterResource.java:85) [leshan-server-cf-1.0.0-M10.jar:?]
    at org.eclipse.californium.core.server.ServerMessageDeliverer.deliverRequest(ServerMessageDeliverer.java:108) [californium-core-2.0.0-M12.jar:?]
    at org.eclipse.californium.core.network.stack.BaseCoapStack$StackTopAdapter.receiveRequest(BaseCoapStack.java:158) [californium-core-2.0.0-M12.jar:?]
    at org.eclipse.californium.core.network.stack.AbstractLayer.receiveRequest(AbstractLayer.java:81) [californium-core-2.0.0-M12.jar:?]
    at org.eclipse.californium.core.network.stack.AbstractLayer.receiveRequest(AbstractLayer.java:81) [californium-core-2.0.0-M12.jar:?]
    at org.eclipse.californium.core.network.stack.BlockwiseLayer.receiveRequest(BlockwiseLayer.java:372) [californium-core-2.0.0-M12.jar:?]
    at org.eclipse.californium.core.network.stack.ReliabilityLayer.receiveRequest(ReliabilityLayer.java:261) [californium-core-2.0.0-M12.jar:?]
    at org.eclipse.californium.core.network.stack.AbstractLayer.receiveRequest(AbstractLayer.java:81) [californium-core-2.0.0-M12.jar:?]
    at org.eclipse.californium.core.network.stack.BaseCoapStack.receiveRequest(BaseCoapStack.java:98) [californium-core-2.0.0-M12.jar:?]
    at org.eclipse.californium.core.network.CoapEndpoint$InboxImpl$2.runStriped(CoapEndpoint.java:921) [californium-core-2.0.0-M12.jar:?]
    at org.eclipse.californium.core.network.StripedExchangeJob.run(StripedExchangeJob.java:65) [californium-core-2.0.0-M12.jar:?]
    at eu.javaspecialists.tjsn.concurrency.stripedexecutor.StripedExecutorService$SerialJob.run(StripedExecutorService.java:548) [element-connector-2.0.0-M12.jar:?]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_131]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_131]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) [?:1.8.0_131]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [?:1.8.0_131]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_131]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_131]
    at java.lang.Thread.run(Thread.java:748) [?:1.8.0_131]
14:06:40,111 DEBUG [UDPConnector] [UDP-Receiver-/15.213.51.145:17683[0]] UDPConnector (/15.213.51.145:17683) received 179 bytes from /15.213.51.145:42369
14:06:40,228 DEBUG [UDPConnector] [UDP-Receiver-/15.213.51.145:17683[0]] UDPConnector (/15.213.51.145:17683) received 179 bytes from /15.213.51.145:47863
14:06:40,551 DEBUG [UDPConnector] [UDP-Receiver-/15.213.51.145:17683[0]] UDPConnector (/15.213.51.145:17683) received 179 bytes from /15.213.51.145:34638
14:06:40,619 DEBUG [UDPConnector] [UDP-Receiver-/15.213.51.145:17683[0]] UDPConnector (/15.213.51.145:17683) received 179 bytes from /15.213.51.145:48444
14:06:40,832 DEBUG [UDPConnector] [UDP-Receiver-/15.213.51.145:17683[0]] UDPConnector (/15.213.51.145:17683) received 179 bytes from /15.213.51.145:56324
14:06:40,988 DEBUG [UDPConnector] [UDP-Receiver-/15.213.51.145:17683[0]] UDPConnector (/15.213.51.145:17683) received 179 bytes from /15.213.51.145:31450
14:06:41,135 DEBUG [UDPConnector] [UDP-Receiver-/15.213.51.145:17683[0]] UDPConnector (/15.213.51.145:17683) received 179 bytes from /15.213.51.145:21928
14:06:41,504 DEBUG [UDPConnector] [UDP-Receiver-/15.213.51.145:17683[0]] UDPConnector (/15.213.51.145:17683) received 179 bytes from /15.213.51.145:58293
14:06:41,530 TRACE [SweepDeduplicator] [Deduplicator1] Start Mark-And-Sweep with 575 entries
14:06:41,530 DEBUG [SweepDeduplicator] [Deduplicator1] Sweep run took 0ms
14:06:41,540 TRACE [SweepDeduplicator] [Deduplicator1] Start Mark-And-Sweep with 0 entries
14:06:41,553 TRACE [SweepDeduplicator] [Deduplicator1] Start Mark-And-Sweep with 0 entries
14:06:41,554 TRACE [SweepDeduplicator] [Deduplicator1] Start Mark-And-Sweep with 0 entries
14:06:41,673 DEBUG [UDPConnector] [UDP-Receiver-/15.213.51.145:17683[0]] UDPConnector (/15.213.51.145:17683) received 179 bytes from /15.213.51.145:21384
14:06:42,931 DEBUG [UDPConnector] [UDP-Receiver-/15.213.51.145:17683[0]] UDPConnector (/15.213.51.145:17683) received 179 bytes from /15.213.51.145:34321
14:06:42,963 DEBUG [UDPConnector] [UDP-Receiver-/15.213.51.145:17683[0]] UDPConnector (/15.213.51.145:17683) received 179 bytes from /15.213.51.145:42369
14:06:43,274 DEBUG [UDPConnector] [UDP-Receiver-/15.213.51.145:17683[0]] UDPConnector (/15.213.51.145:17683) received 179 bytes from /15.213.51.145:39520
14:06:43,399 DEBUG [UDPConnector] [UDP-Receiver-/15.213.51.145:17683[0]] UDPConnector (/15.213.51.145:17683) received 179 bytes from /15.213.51.145:31450
14:06:43,460 DEBUG [UDPConnector] [UDP-Receiver-/15.213.51.145:17683[0]] UDPConnector (/15.213.51.145:17683) received 179 bytes from /15.213.51.145:38900
14:06:43,917 DEBUG [UDPConnector] [UDP-Receiver-/15.213.51.145:17683[0]] UDPConnector (/15.213.51.145:17683) received 179 bytes from /15.213.51.145:65389
14:06:44,755 DEBUG [UDPConnector] [UDP-Receiver-/15.213.51.145:17683[0]] UDPConnector (/15.213.51.145:17683) received 179 bytes from /15.213.51.145:65268
14:06:44,882 DEBUG [UDPConnector] [UDP-Receiver-/15.213.51.145:17683[0]] UDPConnector (/15.213.51.145:17683) received 179 bytes from /15.213.51.145:48749

sbernard31 commented 4 years ago

The NPE is in my RegistrationListener, but the server stops processing because of the low number of threads.

What makes you think that?

To understand the NPE I need to see the CustomRegistrationListenerImpl code.

madhushreegc commented 4 years ago

I used a synchronous observe call inside the RegistrationListener, and since the device was not responding, the thread was blocked. I changed that to an async observe request.

I am not seeing this issue now after the change.

@sbernard31 I will not face this issue again what ever may be load correct when threads are not blocked?

boaks commented 4 years ago

I will not face this issue again what ever may be load correct when threads are not blocked?

I'm not sure what you mean. Generally, in every programming language and in every system, it is never a good idea to block a thread in a callback.

sbernard31 commented 4 years ago

The RegistrationListener javadoc says :

/**
 * Listen for client registration events.
 * <p>
 * Those methods are called by the protocol stage thread pool,
 * this means that execution MUST be done in a short delay,
 * if you need to do long time processing use a dedicated thread pool.
 */
public interface RegistrationListener {

So it is clearly a better idea to use the async send API. But for me, the NPE is not related to that.
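
The javadoc's advice to use a dedicated thread pool for long processing could look roughly like the sketch below. `OffloadingListener` is a hypothetical stand-in and does not implement the real RegistrationListener interface (whose method signatures depend on the Leshan version); only the threading pattern is the point: the callback hands the slow work to its own executor and returns immediately.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/** Hypothetical stand-in for a registration listener; only the threading pattern matters. */
final class OffloadingListener {
    private final ExecutorService workers;
    private final Consumer<String> slowWork;

    OffloadingListener(ExecutorService workers, Consumer<String> slowWork) {
        this.workers = workers;
        this.slowWork = slowWork;
    }

    /** Would be called on a protocol stage thread: queues the work and returns at once. */
    void registered(String endpoint) {
        workers.execute(() -> slowWork.accept(endpoint));
    }

    /** Stops accepting work and waits for queued work to finish; false on timeout. */
    boolean shutdownAndAwait(long timeoutMillis) {
        workers.shutdown();
        try {
            return workers.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

The pool passed in (for example `Executors.newFixedThreadPool(4)`) should be separate from the protocol stage pool, so a slow or unresponsive device can only block worker threads.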

I will not face this issue again what ever may be load correct when threads are not blocked?

I do not understand either.

boaks commented 4 years ago

Wild guess.

So, do you wait for the response? In vain? Therefore NPE?

sbernard31 commented 4 years ago

So, the NPE happens because the code which calls the sync send does not handle the case where the response is null (which is a normal use case, because null means timeout with the sync API).

/**
 * ....
 * @return the LWM2M response. The response can be <code>null</code> if the timeout expires (see
 *         https://github.com/eclipse/leshan/wiki/Request-Timeout).
 * ....
 */

And the timeout happens because all protocol stage threads are blocked and so are not able to send or receive messages.

But @madhushreegc, you force us to make assumptions because you don't share the code where the NPE happens...
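
This self-inflicted timeout can be reproduced in isolation. The sketch below is a simplified model, not Leshan code: a single-thread executor stands in for the protocol stage pool, and the "callback" running on it waits for a "response" that only the same pool could deliver. The names are hypothetical.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Simplified model of a callback blocking the only thread that could deliver its response. */
final class StarvationDemo {

    /** Returns the response seen by the blocking callback, or null if its wait timed out. */
    static String blockingCallbackResult(long timeoutMillis) {
        ExecutorService protocolStage = Executors.newSingleThreadExecutor();
        try {
            Future<String> callback = protocolStage.submit(() -> {
                // The response is queued behind this very task, so it cannot run yet.
                Future<String> response = protocolStage.submit(() -> "response");
                try {
                    return response.get(timeoutMillis, TimeUnit.MILLISECONDS);
                } catch (TimeoutException e) {
                    return null; // same convention as a sync send that times out
                }
            });
            return callback.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            protocolStage.shutdownNow();
        }
    }
}
```

With one thread, `blockingCallbackResult(200)` returns null after the wait expires. With a pool of 4 threads the same thing happens as soon as 4 callbacks block at the same time.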

boaks commented 4 years ago

Some are simply not allowed to share their code. That is not too uncommon. But maybe, now with these ideas to look at, it may get fixed.

sbernard31 commented 4 years ago

Not allowed to share 5 lines of code? Hmm, I doubt it, and if this is the case I guess that the "police" will not know.

If you are really afraid about that, there is still the possibility to rewrite the code a bit.

madhushreegc commented 4 years ago

I am so sorry.

I missed attaching the code.

Below was the code inside the registered method of the RegistrationListener.

LeshanServer leshanServer = leshanServerServiceBean.getLeshanServer();
ObserveRequest observeRequest = new ObserveRequest(contentFormat, resourcePath);
ObserveResponse observeResponse = leshanServer.send(registration, observeRequest, SEND_TIMEOUT);

Yes, your assumption is right. The timeout happened, so the response returned was null, and since that was not handled it threw an NPE. All the threads were blocked. I was testing with eDRX devices.
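
For code that still uses a synchronous send somewhere outside the protocol stage threads, the null result can be made explicit. A minimal sketch, assuming a sync call that returns null on timeout and may throw InterruptedException; `SyncSender` and `NullSafeSend` are hypothetical names, not Leshan types.

```java
import java.util.Optional;

/** Hypothetical stand-in for a synchronous send that returns null on timeout. */
interface SyncSender<T> {
    T send(long timeoutMillis) throws InterruptedException;
}

final class NullSafeSend {

    /** Wraps the result so a timeout (null) has to be handled by the caller. */
    static <T> Optional<T> sendOrEmpty(SyncSender<T> sender, long timeoutMillis) {
        try {
            return Optional.ofNullable(sender.send(timeoutMillis));
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // keep the interrupt flag for the caller
            return Optional.empty();
        }
    }
}
```

The caller then checks `isPresent()` (or uses `ifPresent`) before touching the response, instead of dereferencing a value that may be null.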

sbernard31 commented 4 years ago

Thx @madhushreegc for the detailed information. :pray:

So using the async API is the right way to go.

Could we close this issue, or do you still have some concerns about it?

madhushreegc commented 4 years ago

Yes. We can close the case. Thank you very much for all the clarifications.