PelionIoT / mbed-cloud-sdk-javascript

Pelion Device Management SDK for JavaScript
Apache License 2.0
9 stars 10 forks source link

Can't get a resource value when a cached value is returned. #234

Closed coisme closed 6 years ago

coisme commented 6 years ago

Issue

Can't get a resource value using getResourceValue() right after the resource value changed and cached value is returned.

Steps to reproduce

Tested Environment

* K64F
* macOS High Sierra Version 10.13.6
* node v8.9.4
* mbed-cloud-sdk@2.0.0

Here are steps to reproduce this issue.

API Key

Get an API key from Pelion Portal.

Set up a board running QuickStart firmware

Prepare a board which runs the firmware provided in QuickStart.

Get a device ID (endpoint name) from device's serial output or Device List in Pelion Portal.

Test program

I created a test program to reproduce this issue. Please download and set it up. https://github.com/coisme/Pelion-NodeJS-Test-1

This test program calls getResourceValue() every 5 seconds and get the resource value.

Example Program Output

The button on the board was pushed once between 12:07:48 and 12:07:52. Data btnCnt is missing for a while after this.

$ node index.js
12:07:32, Call, Num=1
12:07:34,  Ret, Num=1, btnCnt=15
12:07:37, Call, Num=2
12:07:38,  Ret, Num=2, btnCnt=15
12:07:42, Call, Num=3
12:07:43,  Ret, Num=3, btnCnt=15
12:07:47, Call, Num=4
12:07:48,  Ret, Num=4, btnCnt=15
12:07:52, Call, Num=5
12:07:53,  Ret, Num=5, btnCnt={}
12:07:57, Call, Num=6
12:07:58,  Ret, Num=6, btnCnt={}
12:08:02, Call, Num=7
12:08:03,  Ret, Num=7, btnCnt={}
12:08:07, Call, Num=8
12:08:08,  Ret, Num=8, btnCnt={}
12:08:12, Call, Num=9
12:08:13,  Ret, Num=9, btnCnt={}
12:08:17, Call, Num=10
12:08:18,  Ret, Num=10, btnCnt={}
12:08:22, Call, Num=11
12:08:23,  Ret, Num=11, btnCnt={}
12:08:27, Call, Num=12
12:08:28,  Ret, Num=12, btnCnt={}
12:08:32, Call, Num=13
12:08:33,  Ret, Num=13, btnCnt={}
12:08:37, Call, Num=14
12:08:38,  Ret, Num=14, btnCnt={}
12:08:42, Call, Num=15
12:08:43,  Ret, Num=15, btnCnt={}
12:08:47, Call, Num=16
12:08:48,  Ret, Num=16, btnCnt={}
12:08:52, Call, Num=17
12:08:53,  Ret, Num=17, btnCnt=16
12:08:57, Call, Num=18
12:08:58,  Ret, Num=18, btnCnt=16
12:09:02, Call, Num=19

Additional Information

Python SDK's behavior

I created the same program with the latest Python SDK. This program works well even if the button is pushed. https://github.com/coisme/Pelion-Python-Test-1

Example Output

The button on the board was pushed three times between 12:33:03 and 12:33:08 but the return value is handled correctly.

$ python pull-notification.py
12:32:53 btn=16
12:32:58 btn=16
12:33:03 btn=16
12:33:08 btn=19
12:33:13 btn=19
12:33:18 btn=19

Calls Service API directly

In addition, I tried using Service API directly. Here's my script: https://github.com/coisme/Perion-API-Test-1 (Sorry, this repo is deleted.) https://github.com/coisme/Pelion-ConnectAPI-Test-1

According to my experiment with this script, when in stable state, GET /v2/endpoints/{device-id}/{resourcePath} returns status code 202 and AsyncId. We can get the resource value from notification channel. However, when the button on the board is pushed, the API returns status code 200 and cached value for a while. The cached value is returned by the response's body and no data is coming from notification channel. If you call GET /v2/notification/pull in this context, it will get timeout.

Is this handled in Javascript SDK properly?

Javascript SDK v1.2.2 works well

If I use Javascript SDK v1.2.2, which is used by quick start, this issue doesn't happen. https://github.com/coisme/Pelion-NodeJS-Test-1/tree/use-v1.2.2 If I update the module from v1.2.2 to v1.2.3, this issue occurs. It indicates something was injected between v1.2.2 and v1.2.3...

coisme commented 6 years ago

Related to JIRA IOTSTOOL-1661

alexl0gan commented 6 years ago

The fact that your example worked in 1.2.2 and not 1.2.3 must have been a fluke because the caching issue was still unknown during this period, and was addressed in 1.2.5. No changes were made to getResourceValue between these releases.

Looking at the logs you've provided it suggests that the JS SDK isn't handling timeouts correctly and resolving the promise for getResourceValue with no data. I will try to reproduce fully as soon as I can.

madchutney commented 6 years ago

Thank you for the analysis. Your observation on the async response was a problem we had with earlier versions of the SDKs as the behaviour wasn't originally documented and only discovered by accident during our own testing. From v1.2.6.852 of the Python SDK and v 1.2.7 of the JavaScript SDK the SDKs use a newer endpoint /v2/device-requests/{device-id} to obtain resource values.

coisme commented 6 years ago

@madchutney Thank you for your comment. In the current master branch of JS SDK, ConnectAPI.getResourceValue() calls ResourcesAPI.getResourceValue(). In this function, it seems that /v2/endpoints/{device-id}/{resourcePath} is used. https://github.com/ARMmbed/mbed-cloud-sdk-javascript/blob/f05738b5e6aa9151041d00ab4c66b6937c293e1d/src/_api/mds.ts#L753-L754 I guess that to use /v2/device-requests/{device-id}, DeviceRequestsAPI.createAsyncRequest() should be called. However, it seems not called, when I checked my program step by step with debugger.

alexl0gan commented 6 years ago

I ran your example with one of our devices and got the following output

09:41:58, Call, Num=1
09:41:59,  Ret, Num=1, btnCnt=null
09:42:03, Call, Num=2
09:42:04,  Ret, Num=2, btnCnt=null
09:42:08, Call, Num=3
09:42:09,  Ret, Num=3, btnCnt=null
09:42:13, Call, Num=4
09:42:14,  Ret, Num=4, btnCnt=null
09:42:18, Call, Num=5
09:42:19,  Ret, Num=5, btnCnt=null
09:42:23, Call, Num=6
09:42:24,  Ret, Num=6, btnCnt=null
09:42:28, Call, Num=7
09:42:29,  Ret, Num=7, btnCnt=null
09:42:33, Call, Num=8
09:42:34,  Ret, Num=8, btnCnt=null
09:42:38, Call, Num=9
09:42:39,  Ret, Num=9, btnCnt=null
09:42:43, Call, Num=10
09:42:44,  Ret, Num=10, btnCnt=null
09:42:48, Call, Num=11
09:42:49,  Ret, Num=11, btnCnt=null
09:42:53, Call, Num=12
09:42:54,  Ret, Num=12, btnCnt=null
09:42:58, Call, Num=13
09:42:59,  Ret, Num=13, btnCnt=1
09:43:03, Call, Num=14
09:43:04,  Ret, Num=14, btnCnt=1
09:43:08, Call, Num=15
09:43:09,  Ret, Num=15, btnCnt=1
09:43:13, Call, Num=16
09:43:14,  Ret, Num=16, btnCnt=1
09:43:18, Call, Num=17
09:43:19,  Ret, Num=17, btnCnt=1
09:43:23, Call, Num=18
09:43:24,  Ret, Num=18, btnCnt=1
09:43:28, Call, Num=19
09:43:29,  Ret, Num=19, btnCnt=1
09:43:33, Call, Num=20
09:43:34,  Ret, Num=20, btnCnt=1
09:43:38, Call, Num=21
09:43:39,  Ret, Num=21, btnCnt=1
09:43:43, Call, Num=22
09:43:44,  Ret, Num=22, btnCnt=1
09:43:48, Call, Num=23
09:43:49,  Ret, Num=23, btnCnt=2
09:43:53, Call, Num=24
09:43:54,  Ret, Num=24, btnCnt=2
09:43:58, Call, Num=25
09:44:00,  Ret, Num=25, btnCnt=2
09:44:05, Call, Num=26
09:44:05,  Ret, Num=26, btnCnt=2
09:44:10, Call, Num=27
09:44:10,  Ret, Num=27, btnCnt=2
09:44:15, Call, Num=28
09:44:15,  Ret, Num=28, btnCnt=2
09:44:20, Call, Num=29
09:44:20,  Ret, Num=29, btnCnt=2
09:44:25, Call, Num=30
09:44:25,  Ret, Num=30, btnCnt=2
09:44:30, Call, Num=31
09:44:30,  Ret, Num=31, btnCnt=2
09:44:35, Call, Num=32
09:44:35,  Ret, Num=32, btnCnt=2
09:44:40, Call, Num=33
09:44:40,  Ret, Num=33, btnCnt=2
09:44:45, Call, Num=34
09:44:45,  Ret, Num=34, btnCnt=2
09:44:50, Call, Num=35
09:44:50,  Ret, Num=35, btnCnt=2
09:44:55, Call, Num=36
09:44:55,  Ret, Num=36, btnCnt=2
09:45:00, Call, Num=37
09:45:00,  Ret, Num=37, btnCnt=2
09:45:05, Call, Num=38
09:45:05,  Ret, Num=38, btnCnt=2
09:45:10, Call, Num=39
09:45:10,  Ret, Num=39, btnCnt=6
09:45:15, Call, Num=40
09:45:15,  Ret, Num=40, btnCnt=6
09:45:20, Call, Num=41
09:45:20,  Ret, Num=41, btnCnt=6
09:45:25, Call, Num=42
09:45:25,  Ret, Num=42, btnCnt=6
09:45:30, Call, Num=43
09:45:30,  Ret, Num=43, btnCnt=6
09:45:35, Call, Num=44
09:45:35,  Ret, Num=44, btnCnt=6
09:45:40, Call, Num=45
09:45:40,  Ret, Num=45, btnCnt=6
09:45:45, Call, Num=46
09:45:45,  Ret, Num=46, btnCnt=6
09:45:50, Call, Num=47
09:45:50,  Ret, Num=47, btnCnt=6
09:45:55, Call, Num=48
09:45:55,  Ret, Num=48, btnCnt=6
09:46:00, Call, Num=49
09:46:00,  Ret, Num=49, btnCnt=6
09:46:05, Call, Num=50
09:46:05,  Ret, Num=50, btnCnt=6
09:46:10, Call, Num=51
09:46:10,  Ret, Num=51, btnCnt=6
09:46:15, Call, Num=52
09:46:15,  Ret, Num=52, btnCnt=6
09:46:20, Call, Num=53
09:46:20,  Ret, Num=53, btnCnt=6
09:46:25, Call, Num=54
09:46:25,  Ret, Num=54, btnCnt=6
09:46:30, Call, Num=55
09:46:30,  Ret, Num=55, btnCnt=6
09:46:35, Call, Num=56
09:46:35,  Ret, Num=56, btnCnt=6
09:46:40, Call, Num=57
09:46:40,  Ret, Num=57, btnCnt=6
09:46:45, Call, Num=58
09:46:45,  Ret, Num=58, btnCnt=6
09:46:50, Call, Num=59
09:46:50,  Ret, Num=59, btnCnt=6
09:46:55, Call, Num=60
09:46:55,  Ret, Num=60, btnCnt=6
09:47:00, Call, Num=61
09:47:00,  Ret, Num=61, btnCnt=6
09:47:05, Call, Num=62
09:47:05,  Ret, Num=62, btnCnt=6
09:47:10, Call, Num=63
09:47:10,  Ret, Num=63, btnCnt=6
09:47:15, Call, Num=64
09:47:15,  Ret, Num=64, btnCnt=6
09:47:20, Call, Num=65
09:47:20,  Ret, Num=65, btnCnt=6
09:47:25, Call, Num=66
09:47:25,  Ret, Num=66, btnCnt=6
09:47:30, Call, Num=67
09:47:30,  Ret, Num=67, btnCnt=6
09:47:35, Call, Num=68
09:47:35,  Ret, Num=68, btnCnt=6
09:47:40, Call, Num=69
09:47:40,  Ret, Num=69, btnCnt=6
09:47:45, Call, Num=70
09:47:45,  Ret, Num=70, btnCnt=8
09:47:50, Call, Num=71
09:47:50,  Ret, Num=71, btnCnt=8
09:47:55, Call, Num=72
09:47:55,  Ret, Num=72, btnCnt=8
09:48:00, Call, Num=73
09:48:00,  Ret, Num=73, btnCnt=8
09:48:05, Call, Num=74
09:48:05,  Ret, Num=74, btnCnt=8
09:48:10, Call, Num=75
09:48:10,  Ret, Num=75, btnCnt=8
09:48:15, Call, Num=76
09:48:15,  Ret, Num=76, btnCnt=8
09:48:20, Call, Num=77
09:48:20,  Ret, Num=77, btnCnt=8

@coisme Are you still able to reproduce this?

coisme commented 6 years ago

Hi @alexl0gan , yes, it still reproduces on my environment.

$ node index.js
19:19:58, Call, Num=1
19:20:01,  Ret, Num=1, btnCnt=0
19:20:03, Call, Num=2
19:20:05,  Ret, Num=2, btnCnt=0
19:20:08, Call, Num=3
19:20:11,  Ret, Num=3, btnCnt=0
19:20:13, Call, Num=4
19:20:15,  Ret, Num=4, btnCnt=0
19:20:19, Call, Num=5
19:20:20,  Ret, Num=5, btnCnt=0
19:20:24, Call, Num=6
19:20:25,  Ret, Num=6, btnCnt=0
19:20:29, Call, Num=7
19:20:30,  Ret, Num=7, btnCnt={}
19:20:34, Call, Num=8
19:20:35,  Ret, Num=8, btnCnt={}
19:20:39, Call, Num=9
19:20:40,  Ret, Num=9, btnCnt={}
19:20:44, Call, Num=10
19:20:45,  Ret, Num=10, btnCnt={}
19:20:49, Call, Num=11
19:20:50,  Ret, Num=11, btnCnt={}
19:20:54, Call, Num=12
19:20:55,  Ret, Num=12, btnCnt={}
19:20:59, Call, Num=13
19:21:00,  Ret, Num=13, btnCnt={}
19:21:04, Call, Num=14
19:21:05,  Ret, Num=14, btnCnt={}
19:21:09, Call, Num=15
19:21:10,  Ret, Num=15, btnCnt={}
19:21:14, Call, Num=16
19:21:15,  Ret, Num=16, btnCnt={}
19:21:19, Call, Num=17
19:21:20,  Ret, Num=17, btnCnt={}
19:21:24, Call, Num=18
19:21:25,  Ret, Num=18, btnCnt={}
19:21:29, Call, Num=19
19:21:30,  Ret, Num=19, btnCnt=1
19:21:34, Call, Num=20
19:21:35,  Ret, Num=20, btnCnt=1
19:21:39, Call, Num=21
19:21:40,  Ret, Num=21, btnCnt=1
^C

Hmm.. What target board did you use? I use K64F...

alexl0gan commented 6 years ago

K64F as well.

Could you try capturing the http response that produces the btnCnt={}.

If I can mock the response that causes the {}, then we may find an issue. If we can't reproduce this then I'm struggling to see how else we can help. The SDK is only base64 decoding the response it gets from the notification channel.

coisme commented 6 years ago

Hi @alexl0gan , I used a debugger and put the breakpoint at line 22 in index.js. https://github.com/coisme/Pelion-NodeJS-Test-1/blob/f3840209f2e852a1da2388307616dd3b28073082/index.js#L22

When the button count is got successfully, ConnectAPI._pollRequest contains Response object and it has text field (highlighted blue in the picture below), which is JSON format result from the server.

screen shot 2018-09-05 at 21 10 47

On the other hand, right after the button on the board is pushed, i.e. shows{}, ConnectAPI._pollRequest doesn't contain Response object.

screen shot 2018-09-05 at 21 07 02

So getResourceValue() returns with no data.

If you need more info, let me know!

alexl0gan commented 6 years ago

Could you put a breakpoint in node_modules/mbed-cloud-sdk/lib/connect/connectApi.js line 259 so you can see what the response from the long poll call is.

coisme commented 6 years ago

The duration of {} is always about 1 minute. This is very close to the time length which server returns status code 200.

Here's the script I used, directly calls Service API: https://github.com/coisme/Pelion-ConnectAPI-Test-1

Output:

$ ./pull-notification.sh
09:42:58 btnCnt=7
09:43:07 btnCnt=7
09:43:14 btnCnt=7
09:43:22 btnCnt=7
09:43:29 btnCnt=8  (cached)
09:43:35 btnCnt=8  (cached)
09:43:42 btnCnt=8  (cached)
09:43:48 btnCnt=8  (cached)
09:43:55 btnCnt=8  (cached)
09:44:01 btnCnt=8  (cached)
09:44:09 btnCnt=8  (cached)
09:44:15 btnCnt=8  (cached)
09:44:21 btnCnt=8  (cached)
09:44:28 btnCnt=8
09:44:35^C
coisme commented 6 years ago

Hi @alexl0gan ,

I put a breakpoint at line 259 in connectApi.js, somehow it moved to line 341 in node_modules/mbed-cloud-sdk/src/connect/connectApi.ts.

Anyway, I could see the data object.

When the btnCnt is got normally, the data contains JSON format object as expected.

screen shot 2018-09-05 at 22 13 52

On the other hand, when we got {}, the object contains nothing.

screen shot 2018-09-05 at 22 15 16
coisme commented 6 years ago

See the internal bug tracker IOTSTOOL-1661. It seems that we shouldn’t use GET /v2/endpoints/\{device-id}/\{resourcePath}...

coisme commented 6 years ago

The issue resolved in 347cead57236325f318ec5978f0506201e19b3a1 ! Thank you, @alexl0gan