telefonicaid / fiware-orion

Context Broker and CEF building block for context data management, providing NGSI interfaces.
https://github.com/telefonicaid/fiware-orion/blob/master/doc/manuals/orion-api.md
GNU Affero General Public License v3.0

Service name (tenant) with different case kills the process #431

Open XavierVal opened 10 years ago

XavierVal commented 10 years ago

Settings

CB is running with the extra params:

BROKER_EXTRA_OPS="-multiservice header -t 0-255"

[root@integrationtest fi-ware-pep-proxy]# ps -efa | grep context

root 12587 12573 0 16:48 pts/4 00:00:00 tail -f /var/log/contextBroker/contextBrokerLog
orion 12601 1 0 16:48 ? 00:00:00 /usr/bin/contextBroker -port 1026 -logDir /var/log/contextBroker -pidpath /var/log/contextBroker/contextBroker.pid -dbhost localhost -db orion -multiservice header -t 0-255

[root@integrationtest fi-ware-pep-proxy]# curl localhost:1026/version

<orion>
  <version>0.14.0_20140623105111</version>
  <uptime>0 d, 0 h, 0 m, 19 s</uptime>
  <git_hash>cd972a57702aacccff85e53d63b3b4bebce6a306</git_hash>
  <compile_time>nodate</compile_time>
  <compiled_by>develenv</compiled_by>
  <compiled_in>ci-fiware-01</compiled_in>
</orion>

MONGO DBs

show dbs

orion   0.203125GB
orion-IoT_permanentService  0.203125GB
orion-IoT_serviceX  0.203125GB

use orion-IoT_serviceX
switched to db orion-IoT_serviceX
db.entities.findOne()

{
    "_id" : {
        "id" : "IoTModelTest.IoTAsset",
        "type" : "device"
    },
    "attrs" : [
        {
            "name" : "TimeInstant",
            "type" : "urn:x-ogc:def:trs:IDAS:1.0:ISO8601",
            "creDate" : 1402996609,
            "value" : "2014-07-01T14:24:06Z",
            "modDate" : 1404224646
        },
        {
            "name" : "temperature",
            "type" : "Quantity",
            "creDate" : 1402996609,
            "value" : "-11.1",
            "md" : [
                {
                    "name" : "uom",
                    "type" : "string",
                    "value" : "celsius"
                },
                {
                    "name" : "phenomenon",
                    "type" : "string",
                    "value" : "urn:x-ogc:def:phenomenon:IDAS:1.0:temperature"
                },
                {
                    "name" : "TimeInstant",
                    "type" : "urn:x-ogc:def:trs:IDAS:1.0:ISO8601",
                    "value" : "2014-07-01T14:24:06Z"
                }
            ],
            "modDate" : 1404224646
        },
        {
            "name" : "device",
            "type" : "string",
            "creDate" : 1402996609,
            "value" : "IoTAsset",
            "modDate" : 1402996609
        },
        {
            "name" : "lastIP",
            "type" : "string",
            "creDate" : 1402996609,
            "value" : "10.95.233.4",
            "modDate" : 1404224640
        },
        {
            "name" : "status",
            "type" : "string",
            "creDate" : 1402996609,
            "value" : "Active",
            "modDate" : 1402996609
        }
    ],
    "creDate" : 1402996609,
    "modDate" : 1404224646
}

Time to ask for the entities

If I ask with the proper name of the service:

[root@integrationtest fi-ware-pep-proxy]# curl -H 'Accept: application/json' -H 'Fiware-Service: IoT_serviceX' localhost:1026/NGSI10/contextEntityTypes/

{
  "contextResponses" : [
    {
      "contextElement" : {
        "type" : "device",
        "isPattern" : "false",
        "id" : "IoTModelTest.IoTAsset",
        "attributes" : [
          {
            "name" : "TimeInstant",
            "type" : "urn:x-ogc:def:trs:IDAS:1.0:ISO8601",
            "value" : "2014-07-01T14:24:06Z"
          },
          {
            "name" : "temperature",
            "type" : "Quantity",
            "value" : "-11.1",
            "metadatas" : [
              {
                "name" : "uom",
                "type" : "string",
                "value" : "celsius"
              },
              {
                "name" : "phenomenon",
                "type" : "string",
                "value" : "urn:x-ogc:def:phenomenon:IDAS:1.0:temperature"
              },
              {
                "name" : "TimeInstant",
                "type" : "urn:x-ogc:def:trs:IDAS:1.0:ISO8601",
                "value" : "2014-07-01T14:24:06Z"
              }
            ]
          },
          {
            "name" : "device",
            "type" : "string",
            "value" : "IoTAsset"
          },
          {
            "name" : "lastIP",
            "type" : "string",
            "value" : "10.95.233.4"
          },
          {
            "name" : "status",
            "type" : "string",
            "value" : "Active"
          }
        ]
      },
      "statusCode" : {
        "code" : "200",
        "reasonPhrase" : "OK"
      }
    }
  ]
}

But if I "accidentally" get the case wrong...

[root@integrationtest fi-ware-pep-proxy]# curl -H 'Accept: application/json' -H 'Fiware-Service: IoT_ServiceX' localhost:1026/NGSI10/contextEntityTypes/

curl: (52) Empty reply from server

[root@integrationtest fi-ware-pep-proxy]# ps -efa | grep context

root 12805 9183 0 17:08 pts/1 00:00:00 grep context

[root@integrationtest fi-ware-pep-proxy]# service contextBroker status

contextBroker dead but pid file exists

Last Logged info

T:Tuesday 01 Jul 15:30:58 2014(899):contextBroker-/MongoGlobal.cpp[877] entitiesQuery: retrieved document: '{ $err: "db already exists with different case other: [orion-IoT_serviceX] me [orion-IoT_ServiceX]", code: 13297 }'

Effort: 5 man-days (to allow for case-sensitive database names)

kzangeli commented 10 years ago

So, I tried this and yes, the broker really dies when the case of the tenant name is wrong. Pretty serious issue, so why just a 6?

Now, I looked into the traces just before the broker dies and I got some pretty interesting results.

MongoGlobal.cpp, function registrationsQuery, line 1222 (will change). Code snippet:

/* Process query result */
while (cursor->more())
{
    BSONObj r = cursor->next();
    LM_T(LmtMongo, ("retrieved document: '%s'", r.toString().c_str()));

    /* Line 1222: the broker dies here when r is a '$err' document, which has no REG_CONTEXT_REGISTRATION field */
    std::vector<BSONElement> queryContextRegistrationV = r.getField(REG_CONTEXT_REGISTRATION).Array();
    for (unsigned int ix = 0 ; ix < queryContextRegistrationV.size(); ++ix)
    {
        processContextRegistrationElement(queryContextRegistrationV[ix].embeddedObject(), enV, attrL, crrV);
    }
}

The process dies on the line 1222, which is:

std::vector<BSONElement> queryContextRegistrationV = r.getField(REG_CONTEXT_REGISTRATION).Array();

The log line preceding this line says this (this is the interesting part):

msg=retrieved document: '{ $err: "db already exists with different case other: [orion-T1] me [orion-t1]", code: 13297 }'

So, we have the info to avoid the crash and to return a 'not found'. I just don't know how to extract it (Approach No 1) ...

Another way to solve this problem would be to configure mongo to be "more" case sensitive so that it simply can't find the database when another case is used. This approach may give us more headaches than the first approach.

Or, make mongo completely insensitive to case, like Windows ...
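
On the "I just don't know how to extract it" point, here is a minimal sketch, assuming the `$err` text keeps exactly the shape shown in the log line above. `CaseClash` and `parseCaseClash` are hypothetical names, not Orion or MongoDB driver code. Matching on message text is brittle, so checking the `code: 13297` field would probably be more robust wherever it is accessible.

```cpp
#include <cassert>
#include <string>

// Hypothetical holder for the two database names reported in the $err text.
struct CaseClash
{
    std::string existingDb;   // name inside "other: [...]"
    std::string requestedDb;  // name inside "me [...]"
};

// Parses a message of the form
//   db already exists with different case other: [X] me [Y]
// Returns false (leaving *clash untouched) if the text has another shape.
bool parseCaseClash(const std::string& errMsg, CaseClash* clash)
{
    const std::string otherTag = "other: [";
    const std::string meTag    = "] me [";

    std::string::size_type otherPos = errMsg.find(otherTag);
    if (otherPos == std::string::npos) return false;

    std::string::size_type existingStart = otherPos + otherTag.size();
    std::string::size_type mePos = errMsg.find(meTag, existingStart);
    if (mePos == std::string::npos) return false;

    std::string::size_type requestedStart = mePos + meTag.size();
    std::string::size_type end = errMsg.find(']', requestedStart);
    if (end == std::string::npos) return false;

    clash->existingDb  = errMsg.substr(existingStart, mePos - existingStart);
    clash->requestedDb = errMsg.substr(requestedStart, end - requestedStart);
    return true;
}
```

A caller could run this on the `$err` string before touching any other field of the retrieved document and answer 'not found' when it returns true.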

kzangeli commented 10 years ago

I looked a little into case sensitivity in mongo database names and found this: https://jira.mongodb.org/browse/DOCS-1986

So, it doesn't seem configurable.

As I see it, we only have two options left to fix this issue:

  1. In a previous layer (preferably rest layer) we 'downcase' the names of tenants, thus making the tenants case insensitive, like Windows.
  2. We catch these errors in mongoBackend and return Not Found when this 'db already exists with different case' occurs.

Approach 1 is very easy to implement, although it is not ideal. Approach 2 is harder to implement but it is the ideal solution.

In approach 2 we need to consider the case of update/APPEND: a new tenant/database is to be created when this happens. Will this be possible? Mongo states the database name is case insensitive, yet it behaves like this ... Pretty strange in my opinion.
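
Approach 1 could look roughly like this. It is only a sketch: `normalizeTenant` and `dbNameForTenant` are hypothetical helpers, the lowercasing is ASCII-only, and the prefix + "-" + tenant layout is inferred from the database names listed in the report (e.g. orion-IoT_serviceX).

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical helper: returns an ASCII-lowercased copy of the tenant name.
std::string normalizeTenant(std::string tenant)
{
    std::transform(tenant.begin(), tenant.end(), tenant.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return tenant;
}

// Hypothetical helper: composes the database name from the -db prefix and the
// normalized tenant, so every spelling of a tenant hits the same database.
std::string dbNameForTenant(const std::string& dbPrefix, const std::string& tenant)
{
    return dbPrefix + "-" + normalizeTenant(tenant);
}
```

With this, 'IoT_serviceX' and 'IoT_ServiceX' both resolve to orion-iot_servicex, which is exactly the "case insensitive, like Windows" behaviour. Databases that already exist with mixed-case names would not match the new names, so they would need some migration step.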

kzangeli commented 10 years ago

OK, some more info.

MongoDB does not permit database names that differ only by the case of the characters. Found in http://docs.mongodb.org/manual/reference/limits/

So, there is nothing we can do. Approach 2 must be changed to simply return an error. I vote for approach 1.

kzangeli commented 10 years ago

Fermin had an idea about how to solve this problem. Mongo forces us not to mix upper/lower case in the database name, so we'll just use all lowercase in the database name. We can still store the "metadata information" about the upper/lower case of the tenant in the database name by transforming it into something else, for example a checksum of the original (real) tenant name.

Something like this: for an incoming update/APPEND request for tenant Valencia, the name of the database would be "orion_" + lowercase("Valencia") + "_" + checksum("Valencia"):

"orion_valencia_[0-9]*"

The checksum method would have to be smart enough to not confuse "Valencia" with "VALencia", etc.

This solves the problem. The only problem (not a real problem) with this idea is that we might end up with two tenants/databases with almost identical names and we don't know which is which ... (e.g. orion_valencia_12345 and orion_valencia_34567).
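
A minimal sketch of that naming scheme. The thread does not name a checksum, so 32-bit FNV-1a is used here purely as an illustration, and `fnv1a32`, `lowercaseAscii` and `checksumDbName` are hypothetical names. Any hash can collide in principle, so a real implementation would still need to handle two tenants ending up with the same database name.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstdint>
#include <string>

// 32-bit FNV-1a over the bytes of the original (case-sensitive) tenant name.
std::uint32_t fnv1a32(const std::string& s)
{
    std::uint32_t h = 2166136261u;   // FNV offset basis
    for (unsigned char c : s)
    {
        h ^= c;
        h *= 16777619u;              // FNV prime
    }
    return h;
}

// ASCII-only lowercasing.
std::string lowercaseAscii(std::string s)
{
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return s;
}

// "orion_" + lowercase(tenant) + "_" + checksum(tenant), checksum in decimal.
std::string checksumDbName(const std::string& tenant)
{
    return "orion_" + lowercaseAscii(tenant) + "_" + std::to_string(fnv1a32(tenant));
}
```

Because the checksum is computed over the original spelling, "Valencia" and "VALencia" share the orion_valencia_ prefix but get different numeric suffixes.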

fgalan commented 10 years ago

Being precise, it should be:

"orion_" + lowercase("Valencia") + "_" + lowercase(checksum("Valencia")).

However, if checksum() always returns a number, as you suggest, then this comment is irrelevant :)

kzangeli commented 10 years ago

A problem we'd have to tackle is the function that recovers the ONTIMEINTERVAL threads when the Orion context broker starts.

We need a mapping back from 'orion_valencia_12345' to 'Valencia', its real tenant name.
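
One way to get that mapping back without any extra storage is to replace the checksum with a reversible suffix. The sketch below is not a checksum at all: it appends a "case mask" with one digit per character ('1' for uppercase, '0' otherwise), which still matches the [0-9]* pattern. `tenantToDbName` and `dbNameToTenant` are hypothetical names, only ASCII is handled, and the name roughly doubles in length, which matters given MongoDB's limit on database name length.

```cpp
#include <cassert>
#include <cctype>
#include <string>

const std::string kDbPrefix = "orion_";  // assumed prefix, as in the examples above

// "Valencia" -> "orion_valencia_10000000"
std::string tenantToDbName(const std::string& tenant)
{
    std::string lower;
    std::string mask;
    for (unsigned char c : tenant)
    {
        mask  += std::isupper(c) ? '1' : '0';
        lower += static_cast<char>(std::tolower(c));
    }
    return kDbPrefix + lower + "_" + mask;
}

// Inverse of tenantToDbName. Returns false if dbName does not follow the scheme.
bool dbNameToTenant(const std::string& dbName, std::string* tenant)
{
    if (dbName.compare(0, kDbPrefix.size(), kDbPrefix) != 0) return false;

    // The mask holds only digits, so the last '_' separates name and mask.
    std::string::size_type sep = dbName.rfind('_');
    if (sep == std::string::npos || sep < kDbPrefix.size()) return false;

    std::string lower = dbName.substr(kDbPrefix.size(), sep - kDbPrefix.size());
    std::string mask  = dbName.substr(sep + 1);
    if (lower.size() != mask.size()) return false;

    std::string result = lower;
    for (std::string::size_type i = 0; i < mask.size(); ++i)
    {
        if (mask[i] == '1')
            result[i] = static_cast<char>(std::toupper(static_cast<unsigned char>(lower[i])));
        else if (mask[i] != '0')
            return false;
    }
    *tenant = result;
    return true;
}
```

At startup the broker could then list the databases, keep those for which `dbNameToTenant` succeeds, and recover the real tenant names from them.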

fgalan commented 10 years ago

The rationale (you can agree or not but it is a fact :) regarding why MongoDB doesn't support this is the following (http://stackoverflow.com/questions/20126378/db-already-exists-with-different-case-other):

The database name is used in naming the data extent files, so clashes in name could cause Bad Things to happen on case-insensitive file systems.

fgalan commented 10 years ago

Rethinking this, maybe the best idea is to have some "tenant" collection in the database associated with the default tenant ("orion" by default). This collection could contain one document per tenant, with the mapping information (case-sensitive tenant name to actual database name in MongoDB).

It could be useful in the future if we need to expand the properties of a tenant. E.g. a quota associated to each tenant, etc.
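
To make the idea concrete, here is an in-memory stand-in for such a "tenant" collection. It is only a sketch: `TenantRegistry` is a hypothetical class, the numeric suffix is a simple counter chosen for illustration, and a real version would persist one document per tenant in MongoDB instead of using `std::map`.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <map>
#include <string>

// Hypothetical in-memory model of the "tenant" collection:
// case-sensitive tenant name <-> actual (all-lowercase) database name.
class TenantRegistry
{
public:
    TenantRegistry() : nextId_(1) {}

    // Returns the database name for 'tenant', creating the mapping on first use.
    std::string dbNameFor(const std::string& tenant)
    {
        std::map<std::string, std::string>::const_iterator it = tenantToDb_.find(tenant);
        if (it != tenantToDb_.end()) return it->second;

        std::string lower = tenant;
        std::transform(lower.begin(), lower.end(), lower.begin(),
                       [](unsigned char c) { return static_cast<char>(std::tolower(c)); });

        // The unique counter keeps names apart that differ only in case.
        std::string db = "orion_" + lower + "_" + std::to_string(nextId_++);
        tenantToDb_[tenant] = db;
        dbToTenant_[db] = tenant;
        return db;
    }

    // Reverse lookup, e.g. for recovering tenants at startup.
    bool tenantFor(const std::string& db, std::string* tenant) const
    {
        std::map<std::string, std::string>::const_iterator it = dbToTenant_.find(db);
        if (it == dbToTenant_.end()) return false;
        *tenant = it->second;
        return true;
    }

private:
    std::map<std::string, std::string> tenantToDb_;
    std::map<std::string, std::string> dbToTenant_;
    unsigned int nextId_;
};
```

Extra per-tenant properties such as a quota would then simply be more fields next to the database name in each entry.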

fgalan commented 10 years ago

A workaround for this issue is provided in PR #457. With that workaround, CB no longer crashes because of this problem.

fgalan commented 10 years ago

Sorry, closed by mistake. Reopened.