Com-AugustCellars / CoAP-CSharp

CoAP Implementation in C#
Other
41 stars 19 forks source link

Observer turns into polling #6

Closed jimsch closed 7 years ago

jimsch commented 7 years ago

Following the guidance of the last paragraph of section 3.3.1, if an observation is setup and it goes stale, the code currently assumes that it has lost the observer at the server and attempts to re-create the same relationship with the same token. While this is a good idea in concept, the fact that it is done w/o any input from the end-user is a problem.

The current code does the refresh after Max-Age plus a constant number of sections taken from the configuration (defaults to 2). This means that it will occur every 62 seconds by default.

The following changes are needed:

  1. A way for the client to either opt-in or opt-out of the behavior
  2. A limit on the number of retries to be done before giving up.
  3. A better randomization on the back-off rate rather than use a constant from the configuration.
dominionish commented 7 years ago

Yes, this is exactly what happen. But can we ignore MaxAge and get observer data from server unlimit time, while observer not cancel? As I know client shouldnt recreate relationship with server. This should do border router or server. On some time server send ack and if it success recourse saved othervise resource canceled.

dominionish commented 7 years ago

As example you can see Copper (extention for Mozilla for working with coap). It work perfect. And always get observer data even if server unaccessable a couple hourds. Unfortunally it's just javaScript extention, but not C# dll)) Anyway thank you for your work. Hope you will solve all this problems. I can test your code on my coap devices

jimsch commented 7 years ago

This is now turned off by default

dominionish commented 7 years ago

Yes, I checked this version. And found something:

  1. _message.FireRetransmitting(); on ReliabilityLayer still calls, so after 5 calls timeoutError occured. So I change code and it fixed:

    ...............
                else if (failedCount <= (_message.MaxRetransmit != 0 ? _message.MaxRetransmit : _config.MaxRetransmit) && (_message.Observe!=0 || _exchange.Request.ObserveReconnect))
                {
                    if (log.IsDebugEnabled)
                        log.Debug("Timeout: retransmit message, failed: " + failedCount + ", message: " + _message);
    
                    _message.FireRetransmitting();
    
                    // message might have canceled
                    if (!_message.IsCancelled)
                        _retransmit(this);
                }
                else if (_message.Observe != 0 || (_message.Observe == 0 && failedCount > (_message.MaxRetransmit != 0 ? _message.MaxRetransmit : _config.MaxRetransmit) || _exchange.Request.ObserveReconnect))
                {
                    if (log.IsDebugEnabled)
                        log.Debug("Timeout: retransmission limit reached, exchange failed, message: " + _message);
                    _exchange.TimedOut = true;
                    _message.IsTimedOut = true;
                    _exchange.Remove(TransmissionContextKey);
                    Cancel();
                }
    ...............

    So FireRetransmitting doesnt call if message observable. But if _exchange.Request.ObserveReconnect = true it become callable.

  2. After changing on 1 point i found new bug: If server not available and we trying to observe resource, observer callback will never occured, because timer calls only 1 time, so on ReliabilityLayer on TransmissionContext constructor i changed code:
                if (_message.Observe==0)
                    _timer.AutoReset = true;
                else _timer.AutoReset = false;
  3. And the last: I observer resourse. Then server become sleep. There is no polling, as you fixed it, and after 60 sec (Max-Age) there is no timeout, as you fixed, but when server wake up and send observer update - data doesnt not come. If server wakes up just in 1-60 seconds - all is ok - observer data comes, but if wakes up in 61 - more seconds - observer data doesn't come. It seems there is some more logic about Max-Age. May be after Max-Age seconds UDP closed if server is not available? I didnt find something like this, but the problem exists. Or may be this is because of my changes above...
dominionish commented 7 years ago

1 problem with timeout fixed after I builded project using C# 7.0 Compiler. But the problem with no observer data didnt solved

jimsch commented 7 years ago

Lets start with the basics. The first set of code you modified deals with the fact that UDP is not a reliable transport. This means that if one sends a UDP message it may not get delivered. The code allows for a small set of pre-configured retries in this case. This needs to happen independently if it being an observer request or a normal request.

The problem with trying to make sure that a confirmable message is delivered to a device which might currently be asleep is that if the device either is not there or never wakes up then you are using a fair amount of network traffic until such a time as the program would be killed.

There is an easier way to get the "I do not want to re-try" by using the MessageType.Non value. This can be set either in the constructor or by setting Request.Type. Doing this says that I don't care about making sure the message gets delivered to the destination.

You should write code in the application to deal with a device you want to talk to and it may not be awake at any given time. Although it is generally better to put some type of proxy agent closer to the device which does this if you need to talk to it from multiple devices.

I would suggest doing code in the application along the lines of

BEGIN CODE

Request request = new Request(Method.GET) { URI = new URI(....) }; request.MarkObserve();

Response response = null;

while (response == null) { response = request.WaitForResponse(1000*60); // Wait sixty seconds or a response; }; Console.WriteLine(Utils.ToString(response)); /// Print so that we can look at it.

if (!response.HasOption(OptionType.Observe)) Console.WriteLine("Did not get an observe response!");

END CODE

While the device is asleep, it would send the message 5 times every minute until it gets a response.

One thing to remember is that if you go through a router to talk to the device, the routing may get dropped at some point if insufficient traffic goes through it.

jgmdavies commented 7 years ago

Hi Jim,

Couldn’t it be in that while loop forever, if there’s no response?

            Thanks,

            Jim

From: Jim Schaad [mailto:notifications@github.com] Sent: 29 May 2017 19:51 To: jimsch/CoAP-CSharp CoAP-CSharp@noreply.github.com Cc: Subscribed subscribed@noreply.github.com Subject: Re: [jimsch/CoAP-CSharp] Observer turns into polling (#6)

Lets start with the basics. The first set of code you modified deals with the fact that UDP is not a reliable transport. This means that if one sends a UDP message it may not get delivered. The code allows for a small set of pre-configured retries in this case. This needs to happen independently if it being an observer request or a normal request.

The problem with trying to make sure that a confirmable message is delivered to a device which might currently be asleep is that if the device either is not there or never wakes up then you are using a fair amount of network traffic until such a time as the program would be killed.

There is an easier way to get the "I do not want to re-try" by using the MessageType.Non value. This can be set either in the constructor or by setting Request.Type. Doing this says that I don't care about making sure the message gets delivered to the destination.

You should write code in the application to deal with a device you want to talk to and it may not be awake at any given time. Although it is generally better to put some type of proxy agent closer to the device which does this if you need to talk to it from multiple devices.

I would suggest doing code in the application along the lines of

BEGIN CODE

Request request = new Request(Method.GET) { URI = new URI(....) }; request.MarkObserve();

Response response = null;

while (response == null) { response = request.WaitForResponse(1000*60); // Wait sixty seconds or a response; }; Console.WriteLine(Utils.ToString(response)); /// Print so that we can look at it.

if (!response.HasOption(OptionType.Observe)) Console.WriteLine("Did not get an observe response!");

END CODE

While the device is asleep, it would send the message 5 times every minute until it gets a response.

One thing to remember is that if you go through a router to talk to the device, the routing may get dropped at some point if insufficient traffic goes through it.

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/jimsch/CoAP-CSharp/issues/6#issuecomment-304714720 , or mute the thread https://github.com/notifications/unsubscribe-auth/ADmYN-ZPo1VWf9Ro1Fofh_Zb7kOGCqZDks5r-xOEgaJpZM4Nlbt4 . https://github.com/notifications/beacon/ADmYNz1Y6XWAORvKATLLdtKTfGOtqcbCks5r-xOEgaJpZM4Nlbt4.gif

jimsch commented 7 years ago

Yes - this could be in an infinite loop, but that was going to be true for what he was doing to. The assumption however is that either the app can be killed or the device really comes alive at some point. It would be good to put an exit if it goes too long

jgmdavies commented 7 years ago

Ah, thanks Jim. So it’s the same as having just one call (not in a loop), with an infinite timeout?

From: Jim Schaad [mailto:notifications@github.com] Sent: 29 May 2017 23:35 To: jimsch/CoAP-CSharp CoAP-CSharp@noreply.github.com Cc: jgmdavies J.Davies@jacobus.co.uk; Comment comment@noreply.github.com Subject: Re: [jimsch/CoAP-CSharp] Observer turns into polling (#6)

Yes - this could be in an infinite loop, but that was going to be true for what he was doing to. The assumption however is that either the app can be killed or the device really comes alive at some point. It would be good to put an exit if it goes too long

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/jimsch/CoAP-CSharp/issues/6#issuecomment-304739831 , or mute the thread https://github.com/notifications/unsubscribe-auth/ADmYNwApwAiJUmimYtnoP4oZd09enoVrks5r-0gkgaJpZM4Nlbt4 . https://github.com/notifications/beacon/ADmYN9m51hYU-LkJs62vAgaJg4N9gUYGks5r-0gkgaJpZM4Nlbt4.gif

jimsch commented 7 years ago

No, this is not quite the same as having a single call with an infinite timeout. It will produce more network traffic as it will send out a burst of UDP messages every minute. The single call will produce only a single burst of UDP messages.

jgmdavies commented 7 years ago

Of course, silly me!

I suppose the behaviour could be built into ‘WaitForResponse’, with an extra parameter or two, but this pattern is more open and controllable.

Many thanks for all your work on CoAP-CSharp – I’m looking forward to getting to grips with it.

Is this the best forum, and is there a working COAP test server around at the moment please?

            Many thanks,

            Jim

From: Jim Schaad [mailto:notifications@github.com] Sent: 30 May 2017 08:46 To: jimsch/CoAP-CSharp CoAP-CSharp@noreply.github.com Cc: jgmdavies J.Davies@jacobus.co.uk; Comment comment@noreply.github.com Subject: Re: [jimsch/CoAP-CSharp] Observer turns into polling (#6)

No, this is not quite the same as having a single call with an infinite timeout. It will produce more network traffic as it will send out a burst of UDP messages every minute. The single call will produce only a single burst of UDP messages.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/jimsch/CoAP-CSharp/issues/6#issuecomment-304801014 , or mute the thread https://github.com/notifications/unsubscribe-auth/ADmYNxY2N_yabEqDMi5FjI99N3D55Ce5ks5r-8lJgaJpZM4Nlbt4 . https://github.com/notifications/beacon/ADmYN19Q-g--yyy4aJwSGljI_U-2gP6uks5r-8lJgaJpZM4Nlbt4.gif

jimsch commented 7 years ago

There is a test server as part of the project, it does not get distributed via nuget but you can build it and play. You can also look at the website http://coap.me/ which has a pointer to server and client resources that you can look at. Additionally, there is always the JAVA version on github called californium. I will however respond to things here as best I can as well.

dominionish commented 7 years ago

Hello, Jim! Thank you for you answer!

  1. About proxy agent you absolutely right. We are going to place it near sensortag (server - device, wich sleep 60 minutes and not sleep only 20 seconds) and it (proxy) will be online always. But now we need to implement solution without proxy agent.
  2. About "routing may get dropped at some point if insufficient traffic goes through it" you right too. We use OS Contiki and it can enable probing. This mean that it talk to border router periodically and so routing may not get dropped. This is not problem in our case.
  3. About request.WaitForResponse. Now to get data from Sensortag I use request.Respond += (s, e) => myMethod(s, e); So what the principal difference? Can I manage to use timeout in this case like request.WaitForResponse(1000*60);? Or how can I use request.Respond to get data always without polling even if sensortag sleep long time.
  4. About basics. I didnt get it at all. So can you explain me please it in my example: I manage sensortag to sleep 60 seconds and not sleep 15 seconds. I started my client application, using your coap repository, and started wireshark(application to see all trafic like sniffer. Here I can see if something come from sensortag to my computer). SO I observer to one resourse and every 60 seconds a message come from sensortag to my application client to myMethod function. And i can see this traffic in wireshark: image This is ideal situation. I need the same behavior when sensortag sleep more than 60 seconds. So lets try to sleep 120 seconds and see what will happen: image As you can see there is a trafic from sensortag to my computer on the right hand of screenshot (Wireshark). But there is no message came to my application to myMethod function. Why exactly this happen? Something like timeout happen? How can I know when this timeout occured and how can I restore connection? Using _retransmit? If yes, how can I do it? And as the example I want to show you Copper. The same situation but instead my application I use Coap client Copper(Firefox extention): Sleep - 120 seconds, not sleep -15 seconds: image As you can see the same trafic in wireshark, but data came to Copper client (double data in wireshark because first - to application, second - to Copper) How do you think what the reason can be?
jimsch commented 7 years ago

Lots of good information -thanks. Can you capture console output or have you ever used the fancier Common.Logging captures - specifically NLog? I want to see if I can get a capture of the internal logging information for this.

jimsch commented 7 years ago

Are you placing the request object into a global memory location or is it a local that goes out of scope?

dominionish commented 7 years ago

I had enabled log on another repository, but I forgot enable it using yours. I'll repeat all this actions, but with enabled log in 7-8 hours and will post here

dominionish commented 7 years ago

I save request object into a global memory location to have a posibility to cancel Observer when I need.

dominionish commented 7 years ago

image SensorUI.txt At 2017-05-31 21:20:44.760133 message didnt come to SensorUI application as you can see on screenshot. Hope this log will help you

jimsch commented 7 years ago

It looks like the file got truncated and there is no request w/ an observe

dominionish commented 7 years ago

Hm... And there is no time in log. How can I swith on full log?

jimsch commented 7 years ago

What are you using for logging - just the default console logging or one of the common.logging modules? It is going to very depending on what is being used.

jimsch commented 7 years ago

I am not really worried about times, but the following might help - it has a date format in the layout.

dominionish commented 7 years ago

image logUI.txt used log4net Hope now it full

jimsch commented 7 years ago

The good news is that I have figured out a lot of places where the problem isn't. This means that things have narrowed down. I should be releasing an update in the next day or two with additional tracing in new places to see if I can narrow the problem down farther. I think it may be a restart issue that I fixed in a different place.

dominionish commented 7 years ago

Ok, realy good news. Tell me when you will push changes, I will give you new log.

jimsch commented 7 years ago

Ok - 1.1.4 is now pushed out

dominionish commented 7 years ago

image logUI.txt Hello! Here is new logs information

jimsch commented 7 years ago

I really cannot see how it can possibly be failing at this point. Basically, I am not seeing a UDP message being received. Three possibilities - no message to receive, message is not being read by me, message is getting lost between hardware and my program. I cannot see how to deal with the last.

Are you using 4.5 or 4.0 for the framework? What OS are you running on?

dominionish commented 7 years ago

Good day! About port number I didnt check. You think that it could be changed by sensortag? I'll check this. I use CoAP.NET40 project, so I use framework 4.0. OS Windows 7 Maximum. The fact is - Copper work and receive message. I'll check it again today and will receive you a sceenshot. Ma y be Copper get messages from any ports, even if it changes - I dont know. I will send you sensortag log too. May be UDPConnection has something like timeout = 60 secs. And if you will not recheck it - it will close.

dominionish commented 7 years ago

image logUI.txt

putty.txt

Port didnt changed during observer period. Copper work, UI - dont. All logs include sensortag log included. The only help i can get you is get you remote access to my computer by TeamViewer if you need debug online

jimsch commented 7 years ago

Yeah, I didn't really think it was a port problem, but I am starting to grasp at straws.

I have used this to observe a resource that fires an event every 3 minutes without any problems. I don't know if I want it to succeed or fail.

dominionish commented 7 years ago

Hello! Sory for long waiting! I checked your examples in branch master and here are results:

  1. CoAPClient -e OBSERVE coap://[fd00::212:4b00:c4a:5c80]:5683/rest/counter/total - dont work at all. No one message came. I dont know why: image
  2. CoAPClient -l OBSERVE coap://[fd00::212:4b00:c4a:5c80]:5683/rest/counter/total work, but only if sensortg sleep less than 60 sec. If it sleeps more than 60 seconds, message doesnt come as before. So this is the same problem in you client example as before in mine. image

And one more. I enabled polling to make my application work. And observe on 2 resourse. Sensortag sleeps 1 hour, then not Sleep 30 seconds. In this case messages came and all worked about 6 hours, then messages didnt came and in log i found this: Timeout: message already acknowledged, cancel retransmission of CON-GET ID=22551, Token=44FDC510, Options=[URI-Path=rest, counter, total, Observe=0], [no payload] What does it mean? Why new ack message stops retransmission? Here is full log: logUI.txt.2017-06-09(important).txt

jimsch commented 7 years ago

Just to double check - are you using a build version or a nuget version of the library? I am still not seeing some logging that I expect to see.

The message you are worried about is expected. If I send a message and one gets back an ACK on for it. Then it will not be re-transmitted - that is what this is saying. I haven't looked yet hard enough to see which message it is being done for.

dominionish commented 7 years ago

I cloned the last version from you repository from master branch, built it and ran. Not NuGet Version

And check please how to make observer go on even after this message (ACK)

jimsch commented 7 years ago

If you look at the file UDPChannel.NET40.cs on line 28, there should be log message. Please make sure that it is enabled and logging so that I can see what is happening in the UDP code itself.

dominionish commented 7 years ago

image Where did you implement LOG_UDP_CHANNEL? I didnt find any implementation _Log variable was not implemented, so if you'll delete #if LOG_UDP_CHANNEL and #endif, There will be error: image May be I dont understand something

jimsch commented 7 years ago

You will find the _Log in UDPSocket - these two files together implement the class. The define should be on for debug compiles.

jimsch commented 7 years ago

I got it to reproduce - and we have a winner. Now to figure out how to get around the winner.

The firewall is blocking the messages from being delivered to the application. This is why it worked just fine for me because I was running the server and client on the same machine so the firewall did not come into effect. My timeout seems to be around 4 minutes rather than your really short one. If you can see if disabling your client firewall fixes the problem

jimsch commented 7 years ago

Ok - I have done several things that validate that problem appears to be the firewall. It appears that the timeout period on windows is highly dependent on both the OS and hardware but starts in as little as 64 seconds (See https://stackoverflow.com/questions/14856639/udp-hole-punching-timeout for a way to look for the value.) This value is hard coded into the OS and cannot be changed programmatically or by a configuration change to the best of my knowledge. In essence there is a router sitting on the edge of the machine to help mess things up.

The ways that I have found to work around this:

  1. Have the program send a lot of messages to the destination to keep the route open. This is not really a good idea.
  2. Open an inbound port in the firewall and change the client program to always use that port when it executes. I have verified that this works on my machine. The port can either be configured as opened either by manual configuration or programmatically. In either event there is a requirement for administrative privileged. At this point I do not believe that it would make sense to provide a function in the CoAP library to perform this operation, but I could be convinced.
dominionish commented 7 years ago

Yes, you right. The problem was in the firewall. I test some issue and one issue solved my problem. I did this: image So when you set port number in this case on start application firewall ask you whould you loke to get access to this application. You click yes button and all will work. But when you use port 0 - UDPConnection choose it by itselph and there is no firewall message about access. So as variant can you please create Config parameter like "port" and use this parameter as in screenshot above. And When user didnt set this port only in this case use port 0. So we will be able to choose static port as we wish. I'll test this solution a week may be it has some problems. I'll let you know

jimsch commented 7 years ago

There is an example of creating and defining and endpoint in the example code for dealing with coaps already.

The code for you looks like below where you create a specific endpoint and then set the endpoint on the request.

       CoAPEndPoint ep = null;
            ep = new CoAPEndPoint(40001);
            ep.Start();
            request.EndPoint = ep;

        request.URI = uri;
dominionish commented 7 years ago

Cool! I didnt know. Thank you!

dominionish commented 7 years ago

And I found another problem.

  1. I started sensortag
  2. Make Get request - all is fine. I receive response
  3. I switch off sensortag or make him sleeping
  4. Make Get request - nothing happen and in some seconds unlimit polling come! As the result I get unlimit timeout errors: image

It should be polling only 4 times and ther should only 1 timeout error and thats all.

jimsch commented 7 years ago

It would be easier for me if you attached the wireshark log rather than the picture so that I can look at all of the messages and see what is tied to what My first guess is going to be that the freshness code got turned back on but I am not sure that the timeouts look correct for that.

dominionish commented 7 years ago

Here is all kind of logs from wireshark: I didnt know what format do you need so i created 3 different logs: https://drive.google.com/file/d/0B322t_kof_jkUGc3UkhUMzJyOEE/view?usp=sharing

dominionish commented 7 years ago

Sory, it's my fault. Bug was in my code. All is working fine!

jimsch commented 7 years ago

I think this is all fixed