telefonicaid / fiware-orion

Context Broker and CEF building block for context data management, providing NGSI interfaces.
https://fiware-orion.rtfd.io/
GNU Affero General Public License v3.0
212 stars 265 forks source link

[ISSUE] particular case of queryContext request. #3119

Closed saurabhjangir closed 6 years ago

saurabhjangir commented 6 years ago

When a queryContext request which including utf-8 "\0" code is sent, a Orion process is downed. A sample of sent queryContext is below.

{
  "entitites": [
    {
      "type": "Street",
      "isPattern": "true",
      "id": "\u0000"
    }
  ],
  "attributes": [
    "temperature"
  ]
}

Is this an expected behavior ?

fgalan commented 6 years ago

What do you mean by "downed"? Orion crashes?

saurabhjangir commented 6 years ago

Yes. Please find the request and response below. Program is crashing after this request.

[root@localhost ~]# (curl localhost:1026/v1/queryContext -s -S --header 'Content-Type: application/json' \
>     --header 'Accept: application/json' -d @- | python -mjson.tool) <<EOF
> {
>     "entities": [
>         {
>             "type": "Room",
>             "isPattern": "true",
>             "id": "\u0000"
>         }
>     ]
> }
> EOF
curl: (52) Empty reply from server
No JSON object could be decoded

In case of isPattern is false in the same request we are getting appropriate error message as below,

[root@localhost fiware-orion]# (curl localhost:1026/v1/queryContext -s -S --header 'Content-Type: application/json' \
>     --header 'Accept: application/json' -d @- | python -mjson.tool) <<EOF
> {
>     "entities": [
>         {
>             "type": "Room",
>             "isPattern": "false",
>             "id": "\u0000"
>         }
>     ]
> }
> EOF
{
    "errorCode": {
        "code": "404",
        "reasonPhrase": "No context element found"
    }
}
fgalan commented 6 years ago

So, it seems to be a bug... :)

I would be great if you could point out the point of code in which Orion crashes. Set core genreation (ulimit -c unlimited), then make it crash to get the core, then use the core with gbd and do a backtrace with bt will provide that information.

kzangeli commented 6 years ago

Seems clear we have a problem :-)

Now, a few doubts, The first payload contains the tag "entitites" ... is this a typo only in the issue? (entitites ... an extra 't' there).

Second doubt, in the two payloads with id '0, the first one is "id": "\u0000" and the second one is without the 'u': "\0000". Is this how it should be? Typo in issue? Is it really the isPattern that makes one crash and the other one not? Or is the the difference in the value of "id" ?

I believe the json parser detects \0 and flags it as invalid char in json. \u0000 on the other hand we should probably accepot as a ZERO ... Right?

fgalan commented 6 years ago

In case of isPattern is false in the same request we are getting appropriate error message as below,

Are really the same request? Your post shows "id": "\u0000" and "id": "\0000" in the other.

I'm asking because I'm not sure if that is a typo or an actual difference...

saurabhjangir commented 6 years ago

Let me share my investigation regarding this issue. When we make isPattern true, It will look for a regex pattern in ID parameter and according to our input we are providing null bytes in regex which i think is not handled in the code. So, through gdb, I got the Issue as [New Thread 0x7f9906ffd700 (LWP 22870)] terminate called after throwing an instance of 'mongo::UserException' what(): regex cannot contain null bytes which clearly indicates that the issue is because we're using null bytes in regex. Please find the bt of the request.

Program received signal SIGABRT, Aborted.
[Switching to Thread 0x7f2d9a08f700 (LWP 23344)]
0x00007f2da2c001f7 in raise () from /lib64/libc.so.6
(gdb) bt
#0  0x00007f2da2c001f7 in raise () from /lib64/libc.so.6
#1  0x00007f2da2c018e8 in abort () from /lib64/libc.so.6
#2  0x00007f2da3506ac5 in __gnu_cxx::__verbose_terminate_handler() () from /lib64/libstdc++.so.6
#3  0x00007f2da3504a36 in ?? () from /lib64/libstdc++.so.6
#4  0x00007f2da3504a63 in std::terminate() () from /lib64/libstdc++.so.6
#5  0x00007f2da3504c83 in __cxa_throw () from /lib64/libstdc++.so.6
#6  0x00000000008cf807 in mongo::uasserted (msgid=0, msg=0x94cbf0 "regex cannot contain null bytes") at src/mongo/util/assert_util.cpp:139
#7  0x00000000007a1089 in mongo::BSONObjBuilder::appendRegex (this=0x7f2d9a084c00, fieldName=..., regex=..., options=...)
    at /usr/local/include/mongo/bson/bsonobjbuilder.h:386
#8  0x00000000007945b7 in fillQueryEntity (baP=0x7f2d9a085160, enP=0x7f2d90001070) at /root/fiware-orion/src/lib/mongoBackend/MongoGlobal.cpp:789
#9  0x0000000000798aad in entitiesQuery (enV=..., attrL=..., metadataList=..., res=..., cerV=0x7f2d9a0853e0, err=0x7f2d9a085330, includeEmpty=true,
    tenant="", servicePath=std::vector of length 1, capacity 1 = {...}, offset=0, limit=20, limitReached=0x7f2d9a0852f5, countP=0x0,
    sortOrderList="", apiVersion=V1) at /root/fiware-orion/src/lib/mongoBackend/MongoGlobal.cpp:1242
#10 0x00000000007a959b in mongoQueryContext (requestP=0x7f2d9a085c18, responseP=0x7f2d9a086198, tenant="",
    servicePathV=std::vector of length 1, capacity 1 = {...}, uriParams=std::map with 7 elements = {...}, options=std::map with 2 elements = {...},
    countP=0x0, apiVersion=V1) at /root/fiware-orion/src/lib/mongoBackend/mongoQueryContext.cpp:379
#11 0x00000000006648e4 in postQueryContext (ciP=0x7f2d90000960, components=2, compV=std::vector of length 2, capacity 2 = {...},
    parseDataP=0x7f2d9a085a20) at /root/fiware-orion/src/lib/serviceRoutines/postQueryContext.cpp:359
#12 0x00000000006ba493 in restService (ciP=0x7f2d90000960, serviceV=0xc2d920 <restServiceV>) at /root/fiware-orion/src/lib/rest/RestService.cpp:623
#13 0x00000000006ae28b in serve (ciP=0x7f2d90000960) at /root/fiware-orion/src/lib/rest/rest.cpp:564
#14 0x00000000006b275a in connectionTreat (cls=0x0, connection=0x7f2d94000d60, url=0x7f2da516a005 "/v1/queryContext", method=0x7f2da516a000 "POST",
    version=0x7f2da516a016 "HTTP/1.1", upload_data=0x0, upload_data_size=0x7f2d9a086998, con_cls=0x7f2d94000db8)
    at /root/fiware-orion/src/lib/rest/rest.cpp:1574
#15 0x0000000000849129 in call_connection_handler (connection=connection@entry=0x7f2d94000d60) at connection.c:1585
#16 0x000000000084a148 in MHD_connection_handle_idle (connection=0x7f2d94000d60) at connection.c:2624
#17 0x000000000084c9fd in MHD_handle_connection (data=0x7f2d94000d60) at daemon.c:998
#18 0x00007f2da3eade25 in start_thread () from /lib64/libpthread.so.0
#19 0x00007f2da2cc334d in clone () from /lib64/libc.so.6

Please provide your inputs regarding the same.

saurabhjangir commented 6 years ago

@fgalan and @kzangeli, That was not "\0000" but "\u0000". I have updated my previous comment, please check it again to check updated response to that request.

fgalan commented 6 years ago

@saurabhjangir thanks for the investigation work. The regex cannot contain null bytes message is a very good clue.

Do you want to fix this (as you did in the past in a similar case)? The one who discover the bug is the first one that could claim for the honor and glory of solving it ;)

saurabhjangir commented 6 years ago

@fgalan Thank you :) It would be my pleasure to solve this issue. I'll do it and provide my patch as soon as possible.

saurabhjangir commented 6 years ago

There is a part of code,

    /** Append a regular expression value
        @param regex the regular expression pattern
        @param regex options such as "i" or "g"
    */
    BSONObjBuilder& appendRegex(const StringData& fieldName,
                                const StringData& regex,
                                const StringData& options = "") {
        checkFieldName(fieldName);
        uassert(0, "regex cannot contain null bytes", regex.find('\0') == std::string::npos);
        _b.appendNum((char)RegEx);
        _b.appendStr(fieldName);
        _b.appendStr(regex);
        _b.appendStr(options);
        return *this;
    }

Could you please tell me how uassert works ? I think this statement is the reason of program termination, can we use anything else instead of it ?

fgalan commented 6 years ago

Where do that code comes from? Mongo driver?

Not sure about uassert() but I think its purpose is to check a given condition, in particular this one:

regex.find('\0') == std::string::npos

If the condition is not true, then uassert() makes the program to end (and this is the reason for the crash in Orion). It is typically used to prevent a worse problem if the program would continue.

However, thit's a good hint to create a fix in the Orion context broker: just implement that checking in the parsing layer for the affected case (regex sensible fields in the input request payloads) and the problem would be fixed.

saurabhjangir commented 6 years ago

I have gone through the code and below is my complete investigation.

  1. This issue is occuring due the line uassert(0, "regex cannot contain null bytes", regex.find('\0') == std::string::npos); in mongodb library code. Point is the assertion is raised from mongodb code and not from fiware orion code. File name: /usr/local/include/mongo/bson/bsonobjbuilder.h, line number 385.

  2. Assertion means that program will terminate if something wrong has happened. and uassert means - uassert is user assertion which means that if program asserts, then user did something wrong, not the code. The above is clearly mentioned in the code , you can find it at "/usr/local/include/mongo/util/assert_util.h" line number 355 which is again a part of mongodb library. /* "user assert". if asserts, user did something wrong, not our code */

so from this I've concluded that there is nothing to change from fiware orion code and also user assertion is raised from mongo code. User need to take care of inputs.

kzangeli commented 6 years ago

Well, we cannot let users send messages (even malign ones) to the broker resulting in a crash of the broker. The broker needs to protect itself against this situation. This is done by doing validity checks of the input before accepting it, just as @fgalan proposes. Should be fairly easy ...

fgalan commented 6 years ago

Fixed by PR #3124

saurabhjangir commented 6 years ago

@fgalan please track its fix at PR #3125

saurabhjangir commented 6 years ago

Fixed by PR #3125

fgalan commented 6 years ago

PR #3125 was reverted and this issue re-opened. No worries about that, sometimes it happens :)

For the next attemp, please have a look to the failing cases in https://github.com/telefonicaid/fiware-orion/pull/3125#issuecomment-372637077. Maybe the cause of the problem is that condition added to check for entity id in the case of patterns is to strict and it is detecting some false positive.

As additional hints: ensure the contextBroker binary has been compiled from your branch and that the binary is in the path (so testHarness.sh can find it). Then, do a full pass of the regression suite (the ~990 tests), not only the one created/touched by your PR.

fgalan commented 6 years ago

New attemp (lest' hope final :)) to fix the issue in PR https://github.com/telefonicaid/fiware-orion/pull/3131

fgalan commented 6 years ago

...and a final little fix in PR https://github.com/telefonicaid/fiware-orion/pull/3133

Closing

kzangeli commented 6 years ago

So, I finally dedicated some time to this "enigma" ...

I added some debugging to src/lib/ngsi/EntityId.cpp:

+    char* idString = (char*) id.c_str();
+    int   slen     = strlen(idString);
+
+    LM_TMP(("id: '%s', sLen: %d. c0: 0x%x, c1: 0x%x, c2: 0x%x, c3: 0x%x", idString, slen,
+            idString[0] & 0xFF,
+            idString[1] & 0xFF,
+            idString[2] & 0xFF,
+            idString[3] & 0xFF));

Now, the functest with "id": "\u0000" gives this log output:

 op=EntityId.cpp[169]:check  msg=id: '', sLen: 0. c0: 0x0, c1: 0x0, c2: 0x0, c3: 0x0

sLen == 0 (not 5, as I expected ... I also expected those five chars to be 21-48-48-48-48),

So, somewhere, the \u0000 is translated into an empty string. By curl? mhd? Not a clue ...

I changed

  if ((id.find('\0') != std::string::npos) || (regcomp(&re, id.c_str(), REG_EXTENDED) != 0))

for

if ((idString[0] == 0) || (regcomp(&re, id.c_str(), REG_EXTENDED) != 0))

And it works, of course. The cryptic "lookup zero in string" is removed and ... in 3 asm instructions this condition is taken care of. The id,find() ... probably a hundred times more

First of all, using "\u0000" in a func test doesn't make any sense, "" should be used instead, to at least not introduce unnecessary noise.

Secondly, it would be good to be able to actually send a 5 char string `"\u0000" to the broker and see what happens ... This would have to be investigated.

fgalan commented 6 years ago

So, somewhere, the \u0000 is translated into an empty string

Maybe is related with Unicode in some way, but I don't know how... Anyway, if (idString[0] == 0) does the trick and full ft regression works with that change we can use it. However, let's be sure a very good explanatory comment is included in the place of the code where that is used... we need that kind of comments when cryptic stuff comes :)

@saurabhjangir what do you think?

saurabhjangir commented 6 years ago

@fgalan @kzangeli Hi :) I have gone through the approach used by @kzangeli. It's an good approach and I really appreciate the way you thought. :) But It will not clear all the inputs, like what if user will give "id": "R\u0000". According to if ((idString[0] == 0) || (regcomp(&re, id.c_str(), REG_EXTENDED) != 0)) check will fail and the user assert will be raised at mongodb library. Below is the output of such request after making these changes in code:

[root@localhost fiware-orion]# (curl localhost:1026/v1/queryContext -s -S --header 'Content-Type: application/json' \
>     --header 'Accept: application/json' -d @- | python -mjson.tool) <<EOF
> {
>     "entities": [
>         {
>             "type": "Room",
>             "isPattern": "true",
>             "id": "R\u0000"
>         }
>     ]
> }
> EOF
terminate called after throwing an instance of 'mongo::UserException'
  what():  regex cannot contain null bytes
curl: (52) Empty reply from server
No JSON object could be decoded

I think loop is must for the solution of the issue ;)

I will add few more such test cases in test file today. Just need some time. :)

saurabhjangir commented 6 years ago

@fgalan The same test has been added in test file and PR #3134

kzangeli commented 6 years ago

"R\u0000" == "R" inside the broker. "id" =="R" is valid and should not give an error. Instead of looking inside service routines or mongo functions (where the "\u0000" has disappeared, we should trt to find where/why it is replaced by a 0)

saurabhjangir commented 6 years ago

"R\u0000" --> In this \u0000 has some meaning in JSON and is a valid NULL character and should not be ignored by the broker. It should not be treated as "R".

kzangeli commented 6 years ago

Some layer somewhere changes the "\u0000" to '\0'. The broker sees "R", simply ... As I said before, we need to find where this conversion is made. Assuming it's not curl, perhaps one of the libraries the broker uses is not correctly configured ?