FIWARE / context.Orion-LD

Context Broker and CEF building block for context data management which supports both the NGSI-LD and the NGSI-v2 APIs
https://www.etsi.org/deliver/etsi_gs/CIM/001_099/009/01.06.01_60/gs_CIM009v010601p.pdf
GNU Affero General Public License v3.0

High resource usage while starting Orion-LD #1418

Open ttutuncu opened 1 year ago

ttutuncu commented 1 year ago

Hi,

We have a problem with our Orion-LD instance: when starting up, it uses a high amount of CPU and RAM. What can cause this issue, and is it normal? It usually exhausts the RAM and the process gets OOM-killed.

We run it on a 2-CPU, 10 GB RAM server; it exceeds these resources and takes around 26 minutes to start up, sometimes failing with an OOM kill. Once it is up, memory drops back to around 4-5 GB. Something is happening between the "Connecting to mongo using the C driver (mongoc)" state and the "Initialization ready - accepting REST requests on port 1026" state. I have shared the console log below. Is there any more detailed log I can check and share with you?

time=2023-08-29T11:59:57.778Z | lvl=TMP | corr=N/A | trans=N/A | from=N/A | srv=N/A | subsrv=N/A | comp=Orion | op=orionld.cpp[737]:versionInfo | msg=Version Info:
time=2023-08-29T11:59:57.778Z | lvl=TMP | corr=N/A | trans=N/A | from=N/A | srv=N/A | subsrv=N/A | comp=Orion | op=orionld.cpp[738]:versionInfo | msg=-----------------------------------------
time=2023-08-29T11:59:57.778Z | lvl=TMP | corr=N/A | trans=N/A | from=N/A | srv=N/A | subsrv=N/A | comp=Orion | op=orionld.cpp[739]:versionInfo | msg=orionld version: 1.1.2
time=2023-08-29T11:59:57.778Z | lvl=TMP | corr=N/A | trans=N/A | from=N/A | srv=N/A | subsrv=N/A | comp=Orion | op=orionld.cpp[740]:versionInfo | msg=based on orion: 1.15.0-next
time=2023-08-29T11:59:57.778Z | lvl=TMP | corr=N/A | trans=N/A | from=N/A | srv=N/A | subsrv=N/A | comp=Orion | op=orionld.cpp[741]:versionInfo | msg=git hash: nogitversion
time=2023-08-29T11:59:57.778Z | lvl=TMP | corr=N/A | trans=N/A | from=N/A | srv=N/A | subsrv=N/A | comp=Orion | op=orionld.cpp[742]:versionInfo | msg=build branch: 
time=2023-08-29T11:59:57.778Z | lvl=TMP | corr=N/A | trans=N/A | from=N/A | srv=N/A | subsrv=N/A | comp=Orion | op=orionld.cpp[743]:versionInfo | msg=compiled by: root
time=2023-08-29T11:59:57.778Z | lvl=TMP | corr=N/A | trans=N/A | from=N/A | srv=N/A | subsrv=N/A | comp=Orion | op=orionld.cpp[744]:versionInfo | msg=compiled in: 
time=2023-08-29T11:59:57.778Z | lvl=TMP | corr=N/A | trans=N/A | from=N/A | srv=N/A | subsrv=N/A | comp=Orion | op=orionld.cpp[745]:versionInfo | msg=-----------------------------------------
NOTICE: extension "postgis" already exists, skipping
time=2023-08-29T11:59:57.838Z | lvl=TMP | corr=N/A | trans=N/A | from=N/A | srv=N/A | subsrv=N/A | comp=Orion | op=mongocInit.cpp[277]:mongocInit | msg=Connecting to mongo using the C driver (mongoc)
NOTICE: extension "postgis" already exists, skipping
time=2023-08-29T12:08:05.287Z | lvl=TMP | corr=N/A | trans=N/A | from=N/A | srv=N/A | subsrv=N/A | comp=Orion | op=mongoConnectionPool.cpp[320]:mongoConnectionPoolInit | msg=Connecting to mongo for the C++ legacy driver
time=2023-08-29T12:25:54.344Z | lvl=TMP | corr=N/A | trans=N/A | from=N/A | srv=N/A | subsrv=N/A | comp=Orion | op=orionld.cpp[1169]:main | msg=Initialization ready - accepting REST requests on port 1026 (experimental API endpoints are disabled)

Orion-LD: v1.1.2 (also tried the latest 1.4 release against the same DB; same situation)
MongoDB: v3.6.23-13.0

Really appreciate your help, Thanks.

kzangeli commented 1 year ago

That is very weird. We start Orion-LD on one core and 4-5 GB of RAM, and it serves 5000 GETs per second without a problem. It starts in a few milliseconds.

So, something is definitely off here ... Please start your broker with ALL traces on and see if that helps us to understand what's happening:

orionld -logLevel DEBUG -t 0-255 (plus any parameters you're using)

Post the traces right here and I'll have a look

ttutuncu commented 1 year ago

Hi Ken, thank you for your quick response. We enabled the debug trace but we are not receiving any debug logs in the console. Our deployment file is as follows:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: orion-ld-cb
  namespace: fiware-new
  labels:
    app: orion-ld-cb
spec:
  replicas: 0
  selector:
    matchLabels:
      app: orion-ld-cb
  template:
    metadata:
      labels:
        app: orion-ld-cb
    spec:
      containers:
        - name: orion-ld-cb
          image: fiware/orion-ld:1.1.2
          ports:
            - containerPort: 1026
              protocol: TCP
          env:
            - name: ORIONLD_TROE
              value: 'TRUE'
            - name: ORIONLD_TROE_USER
              value: postgres
            - name: ORIONLD_TROE_PWD
              value: CM***************Zh6C
            - name: ORIONLD_TROE_HOST
              value: 192.168.40.115
            - name: ORIONLD_MULTI_SERVICE
              value: 'TRUE'
            - name: ORIONLD_DISABLE_FILE_LOG
              value: 'TRUE'
            - name: ORIONLD_TROE_POOL_SIZE
              value: '100'
            - name: ORIONLD_MONGO_HOST
              value: 192.168.40.110:27017,192.168.40.111:27017,192.168.40.113:27017
            - name: ORIONLD_MONGO_DB
              value: orion
            - name: ORIONLD_MONGO_REPLICA_SET
              value: rs0
            - name: ORIONLD_MONGO_USER
              value: default-admin
            - name: ORIONLD_MONGO_PASSWORD
              value: Sp****************jsY
            - name: ORIONLD_MONGO_AUTH_SOURCE
              value: admin
            - name: ORIONLD_LOG_LEVEL
              value: DEBUG
            - name: ORIONLD_TRACE
              value: 0-255

Why are we not seeing any debug logs? Is there anything else we can try?

Thanks.

kzangeli commented 1 year ago

Strange. You should see traces on screen (docker logs ...). Perhaps start little by little: you have a lot of env vars, so remove all of them and just keep the essentials (I was going to say "-fg").

I know very little about Docker, not my thing, but I believe most people run the broker in the foreground inside its container. So, first of all, try adding the -fg option. The env var is called ORIONLD_FOREGROUND; set it to TRUE.

If that doesn't work, then remove all env vars except ORIONLD_FOREGROUND, and if that "works", add your other env vars back little by little.
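
As a sketch, that first step in the Kubernetes deployment above would look like this (env var name as given in this thread; the quoting style follows the existing manifest):

```yaml
          env:
            - name: ORIONLD_FOREGROUND   # run the broker in the foreground (-fg)
              value: 'TRUE'
```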

ttutuncu commented 1 year ago

We stripped all the environment variables, leaving only the DB ones, and we still have the same problem. We also enabled the foreground option.

Somehow we are not getting any traces, whether we set the env variable or pass it as a command parameter as you suggested in your previous message.

We used strace and saw an output as follows:

recvfrom(9, "\0H\0\2BUHAR_KAZANI_EH_ADI\0\7\0\0\0Hay\304"..., 21202044, 0, NULL, NULL) = 257686
recvfrom(9, "\0\2ILAN_REKLAM_EH\0\7\0\0\0Hay\304\261r\0\2ISY"..., 20944358, 0, NULL, NULL) = 274222
recvfrom(9, "ate\0\357\227\300\216\233/\331A\2value\0\1\0\0\0\0\4mdNames"..., 20670136, 0, NULL, NULL) = 283868
recvfrom(9, "ld:Building:31911\0\4mdNames\0\5\0\0\0\0"..., 20386268, 0, NULL, NULL) = 129532
recvfrom(9, "KLI\0\0\0\0\0\0\0006@\2KULLANIM_SEKLI_ADI\0"..., 20256736, 0, NULL, NULL) = 316940
recvfrom(9, 0x7fd4e2cfae3c, 19939796, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
poll([{fd=9, events=POLLIN|POLLERR|POLLHUP}], 1, 299983) = 1 ([{fd=9, revents=POLLIN}])
recvfrom(9, "_KAYIT_BELGE_TARIHI\0\1YIKILDI_CT\0"..., 19939796, 0, NULL, NULL) = 129532
recvfrom(9, "PU_ALANI\0\nYANGIN_RISKI\0\nMKS_YAPI"..., 19810264, 0, NULL, NULL) = 194298
recvfrom(9, "DI\0\7\0\0\0Hay\304\261r\0\nELEKTRIK_NO\0\nGAZ_"..., 19615966, 0, NULL, NULL) = 259064
recvfrom(9, "DI\0\7\0\0\0Hay\304\261r\0\2GAZ_ABONELIGI_EH\0"..., 19356902, 0, NULL, NULL) = 323830
recvfrom(9, "onArea\0\31\1\0\0\2type\0\f\0\0\0GeoProperty"..., 19033072, 0, NULL, NULL) = 115752
recvfrom(9, "Date\0\351`V\225\233/\331A\1modDate\0\351`V\225\233/\331A\2v"..., 18917320, 0, NULL, NULL) = 323830
recvfrom(9, "y\304\261r\0\nBOLUM_ADI\0\nKIRA_DEGERI\0\2IN"..., 18593490, 0, NULL, NULL) = 208078
recvfrom(9, "M_TIPI\0\0\0\0\0\0\0\0@\2MKS_BOLUM_TIPI_A"..., 18385412, 0, NULL, NULL) = 194298
recvfrom(9, "ttps://uri=etsi=org/ngsi-ld/defa"..., 18191114, 0, NULL, NULL) = 194298
recvfrom(9, "ARIHI\0\31\0\0\0002022-09-19T09:16:09.00"..., 17996816, 0, NULL, NULL) = 194298
recvfrom(9, "LIK_NO\0\nCADDE_SOKAK_YOH_KODU\0\nDU"..., 17802518, 0, NULL, NULL) = 259064
recvfrom(9, "\0\0\0M@\nALT_KAPI_NO\0\nBLOK_NO\0\nSITE"..., 17543454, 0, NULL, NULL) = 259064
recvfrom(9, "_NO\0\0\0\0\0\340\21\353@\1BINA_NO\0\0\0\0\0\200\217\342@\nPA"..., 17284390, 0, NULL, NULL) = 259064
recvfrom(9, "\0\nOLUM_NEDENLERI\0\2OLEN_VARMI_EH\0"..., 17025326, 0, NULL, NULL) = 246367
brk(NULL)                               = 0x14b0c1000
brk(0x14b0f1000)                        = 0x14b0f1000
brk(NULL)                               = 0x14b0f1000
brk(0x14b121000)                        = 0x14b121000
brk(NULL)                               = 0x14b121000
brk(0x14b151000)                        = 0x14b151000

It looks like it reads all the data at startup and somehow fills the memory. Is this how it works?

We are really stuck here especially without the debug logs.

This is our latest log with orion-ld v1.4.0 running. It took around 15 minutes to start the REST services.

INFO@12:20:09  orionld.cpp[1110]: Orion Context Broker is running
TMP@12:20:09  orionld.cpp[787]: Version Info:
TMP@12:20:09  orionld.cpp[788]: -----------------------------------------
TMP@12:20:09  orionld.cpp[789]: orionld version:    1.4.0
TMP@12:20:09  orionld.cpp[790]: based on orion:     1.15.0-next
TMP@12:20:09  orionld.cpp[791]: core @context:      https://uri.etsi.org/ngsi-ld/v1/ngsi-ld-core-context-v1.6.jsonld
TMP@12:20:09  orionld.cpp[792]: git hash:           nogitversion
TMP@12:20:09  orionld.cpp[793]: build branch:       
TMP@12:20:09  orionld.cpp[794]: compiled by:        root
TMP@12:20:09  orionld.cpp[795]: compiled in:        
TMP@12:20:09  orionld.cpp[796]: -----------------------------------------
NOTICE:  extension "postgis" already exists, skipping
TMP@12:20:09  mongocInit.cpp[291]: Connecting to mongo for the C driver
NOTICE:  extension "postgis" already exists, skipping
ERROR@12:35:20  orionldRequestSend.cpp[327]: Internal Error (curl_easy_perform returned error code 28)
ERROR@12:35:20  orionldRequestSend.cpp[335]: curl_easy_perform error 28
ERROR@12:35:20  orionldContextDownload.cpp[115]: orionldRequestSend failed (try number 1 out of 3. Timeout is: 5000ms): Internal CURL Error
INFO@12:35:21  orionld.cpp[1302]: Startup completed
TMP@12:35:21  orionld.cpp[1317]: Initialization is Done
TMP@12:35:21  orionld.cpp[1318]:   Accepting REST requests on port 1026 (experimental API endpoints are enabled)
TMP@12:35:21  orionld.cpp[1319]:   TRoE:                    Enabled
TMP@12:35:21  orionld.cpp[1320]:   Distributed Operation:   Disabled
TMP@12:35:21  orionld.cpp[1321]:   Health Check:            Disabled
TMP@12:35:21  orionld.cpp[1324]:   Postgres Server Version: 14.0.7
TMP@12:35:21  orionld.cpp[1326]:   Mongo Server Version:    3.6.23-13.0
TMP@12:35:21  orionld.cpp[1330]:   Mongo Driver:            mongoc driver- ONLY (MongoDB C++ Legacy Driver is DISABLED)
TMP@12:35:21  orionld.cpp[1331]:   MongoC Driver Version:   1.22.0
TMP@12:35:33  orionldMhdConnectionInit.cpp[982]: ------------------------- Servicing NGSI-LD request 001: GET /ngsi-ld/v1/entities --------------------------
TMP@12:35:33  orionldMhdConnectionInit.cpp[982]: ------------------------- Servicing NGSI-LD request 002: GET /ngsi-ld/v1/entities --------------------------
TMP@12:35:33  orionldMhdConnectionInit.cpp[982]: ------------------------- Servicing NGSI-LD request 003: GET /ngsi-ld/v1/entities --------------------------
TMP@12:35:33  orionldMhdConnectionInit.cpp[982]: ------------------------- Servicing NGSI-LD request 004: GET /ngsi-ld/v1/entities --------------------------
TMP@12:35:33  orionldMhdConnectionInit.cpp[982]: ------------------------- Servicing NGSI-LD request 005: GET /ngsi-ld/v1/entities --------------------------
kzangeli commented 1 year ago

This is getting more and more weird by the moment!!! :))) The "latest log" looks good, normal. Only, it should take no more than a second to get there, including spinning up the container. In a native environment (no container, just running on the OS) it takes a few milliseconds.

I believe you need to consult with someone who knows docker/containers. I'll try to get ahold of a guy from the Foundation, a docker/container expert, and see if he's free for a comment.

ttutuncu commented 1 year ago

It would be great to seek some advice. Thanks.

In the initialization code I noticed these:

  if (mongocTenantsGet() == false)
    LM_X(1, ("Unable to extract tenants from the database - fatal error"));

  if (mongocGeoIndexInit() == false)
    LM_X(1, ("Unable to initialize geo indices in database - fatal error"));

  if (mongocIdIndexCreate(&tenant0) == false)
    LM_W(("Unable to create the index on Entity ID on the default database"));

Is it creating indexes on each startup? If so, we have a load of polygon data; could that be the cause, via mongocGeoIndexInit?

In mongocGeoIndexInit it looks like it goes through all the attributes and creates indexes on the ones that are of type GeoProperty:

bool mongocGeoIndexInit(void)
{
  bson_t            mongoFilter;
  mongoc_cursor_t*  mongoCursorP;
  const bson_t*     mongoDocP;
  bson_t*           options = BCON_NEW("projection", "{",
                                         "attrs", BCON_BOOL(true),
                                         "_id",   BCON_BOOL(false),
                                       "}");
  //
  // Create the filter for the query - no restriction, we want all entities!
  //
  bson_init(&mongoFilter);

  //
  // DB Connection
  //
  mongocConnectionGet();

  //
  // Loop over all tenants
  //
  OrionldTenant* tenantP = &tenant0;  // tenant0->next == tenantList :)
  tenant0.next = tenantList;          // Better safe than sorry!

  tenantP = &tenant0;
  while (tenantP != NULL)
  {
    mongoc_collection_t* mCollectionP;

    //
    // Get handle to collection
    //
    mCollectionP = mongoc_client_get_collection(orionldState.mongoc.client, tenantP->mongoDbName, "entities");

    //
    // Run the query
    //
    if ((mongoCursorP = mongoc_collection_find_with_opts(mCollectionP, &mongoFilter, options, NULL)) == NULL)
    {
      LM_E(("Internal Error (mongoc_collection_find_with_opts ERROR)"));
      bson_destroy(options);
      bson_destroy(&mongoFilter);
      mongoc_collection_destroy(mCollectionP);
      mongoc_cursor_destroy(mongoCursorP);
      return false;
    }

    while (mongoc_cursor_next(mongoCursorP, &mongoDocP))
    {
      char*   title;
      char*   detail;
      KjNode* entityNodeP;
      KjNode* attrsP;

      entityNodeP = mongocKjTreeFromBson(mongoDocP, &title, &detail);
      if (entityNodeP == NULL)
      {
        LM_W(("mongocKjTreeFromBson failed"));
        continue;
      }

      attrsP = entityNodeP->value.firstChildP;
      if (attrsP == NULL)  //  Entity without attributes ?
      {
        LM_W(("Entity without attributes?"));
        continue;
      }

      //
      // Foreach Attribute, check if GeoProperty and if so create its geo index
      //
      for (KjNode* attrP = attrsP->value.firstChildP; attrP != NULL; attrP = attrP->next)
      {
        KjNode* typeP = kjLookup(attrP, "type");

        if (typeP == NULL)
        {
          LM_E(("Database Error (attribute '%s' has no 'type' field)", attrP->name));
          continue;
        }

        if (typeP->type != KjString)
        {
          LM_E(("Database Error (attribute with a 'type' field that is not a string)"));
          continue;
        }

        if (strcmp(typeP->value.s, "GeoProperty") == 0)
        {
          if (dbGeoIndexLookup(tenantP->tenant, attrP->name) == NULL)
            mongocGeoIndexCreate(tenantP, attrP->name);
        }
      }
    }

    mongoc_cursor_destroy(mongoCursorP);
    mongoc_collection_destroy(mCollectionP);
    tenantP = tenantP->next;
  }

  bson_destroy(options);
  bson_destroy(&mongoFilter);

  return true;
}
kzangeli commented 1 year ago

Still trying to locate an expert who is not on vacation ... Yes, the geo properties need to be indexed; queries on geolocation don't work otherwise. BUT, it takes 25 minutes for your broker to start ... 25 minutes. Unless you have thousands of terabytes of entities, I can't see how this could cause what you're seeing.

kzangeli commented 1 year ago

Ah, an idea. Try starting the broker against an empty database. You can do that by modifying the "prefix" of the database name, which is "orion" by default. The CLI option is -db <name>. Env var: ORIONLD_MONGO_DB. Set it to "xxx", for example :)

If the broker starts right away that way, then I think it's safe to assume your (I assume huge) database is why the broker takes so long to start, and if that is the case, we need to break down exactly why.
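
In the deployment manifest above, that test would be a one-line change (the "xxx" prefix is just an example name):

```yaml
            - name: ORIONLD_MONGO_DB   # temporary empty DB prefix for the test
              value: xxx
```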

ttutuncu commented 1 year ago

Hi Ken,

We tried your idea with an empty database and it started up in a second. So it is definitely the data collection from MongoDB that is taking ages.

We currently have 370k entities with a 1.7 GB collection size. I don't think this is a large data set for a city, and we are still in development. We will eventually have much more data.

kzangeli commented 1 year ago

ok, very good input!

I will have to do some experimentation, starting with a few thousand entities with a location attribute, and see how it affects things.

As input for these tests, it would be good to have an understanding of your system. I don't need your entire DB, just an idea of how your entities look, so, a few questions/answers:

ttutuncu commented 1 year ago

We typically use 2 GeoProperties for each entity: "location" & "locationArea". I also have a question here: does the geospatial search function work for all GeoProperty attributes? In the specification I think I read somewhere that we have to use "location", "observationSpace" & "operationalSpace". Is this true?

Our building entity is the one with the most properties, around 100 attributes. But since it is not final, that number is going to be reduced. We typically have around 15 different entity types, but all of them include the same GeoProperties at the moment.

Multiservice is enabled but currently we will be using 1 tenant.

Our platform is not production level. We have a 3-node MongoDB cluster with normal disks, not SSDs (to be upgraded at the end of the month). We have 2 CPUs with 20 GB RAM on a Kubernetes worker node with several applications running.

Looking at the code, the check for whether an attribute is a GeoProperty involves fetching all of the attributes of every document, which is not very performant. What I suggest is to do the filtering at the DB level and only receive the attributes that are of type GeoProperty. I have written a MongoDB aggregation that filters attributes of type GeoProperty and groups them so they are distinct. This aggregation takes about 22 seconds (on a test system without SSD disks) to find the distinct key names of the attributes that are GeoProperties. If you are doing other checks, it is best to do those at the DB level as well. This is just a thought we can work on. Our data is going to reach millions of documents, and the way indexes are created definitely needs some improvement: we should not be pulling all the attributes in the database over to the code side.

I'd like to hear your thoughts.

db.entities.aggregate([
    {
        $project: {
            attributeKeys: {
                $filter: {
                    input: { $objectToArray: "$attrs" },
                    as: "item",
                    cond: { $eq: ["$$item.v.type", "GeoProperty"] }
                }
            }
        }
    },
    {
        $unwind: "$attributeKeys"
    },
    {
        $group: {
            _id: "$attributeKeys.k"
        }
    }
]);

This gives the following result which I think is enough:

[{
  "_id": "https://uri=etsi=org/ngsi-ld/default-context/locationArea"
},{
  "_id": "location"
}]
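
To illustrate what the pipeline computes, the same "distinct GeoProperty keys" logic can be sketched in plain JavaScript over an in-memory sample (the documents below are hypothetical, only mimicking the attrs structure shown in this thread):

```javascript
// Hypothetical sample documents mimicking the "entities" collection.
// Note: MongoDB keys cannot contain dots, hence the '=' in expanded URIs.
const entities = [
  {
    _id: "urn:ngsi-ld:Building:1",
    attrs: {
      "location": { type: "GeoProperty", value: {} },
      "https://uri=etsi=org/ngsi-ld/default-context/locationArea": { type: "GeoProperty", value: {} },
      "name": { type: "Property", value: "Town Hall" }
    }
  },
  {
    _id: "urn:ngsi-ld:Building:2",
    attrs: { "location": { type: "GeoProperty", value: {} } }
  }
];

// $project + $filter: keep only GeoProperty keys per document;
// $unwind + $group: flatten and deduplicate across all documents.
const geoPropertyKeys = [...new Set(
  entities.flatMap((doc) =>
    Object.entries(doc.attrs)
      .filter(([, attr]) => attr.type === "GeoProperty")
      .map(([key]) => key)
  )
)];

console.log(geoPropertyKeys);  // the distinct GeoProperty attribute names
```

The point is that the broker would only receive this short list of names, not every attribute of every entity.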
ttutuncu commented 1 year ago

By the way, we can also write queries on the DB side for the "type field does not exist" and "type is not a string" checks. They would return a list that you can write to the logs. I can help with the queries. We can also have a session to discuss it.

kzangeli commented 1 year ago

As it seems like you know what you're talking about :), I'd be more than happy to receive a Pull Request for your proposed changes ... :)

ttutuncu commented 1 year ago

Unfortunately I am not proficient in the C language or the mongoc library :) But I can help with the queries :)

kzangeli commented 1 year ago

ok!

kzangeli commented 1 year ago

As we are meeting today, I read through all the comments. There's one question that I haven't answered:

I also have a question here: Does the Geospatial search function work for all GeoProperty attributes. In the specification I think I read somewhere that we have to use "location","observationSpace" & "operationalSpace". Is this true?

The short answer is "No, you can use any name you like". The long answer is "but you'd need to indicate the name of the geoproperty to the broker, using a URL parameter called geoproperty". Just be aware: every "new" name for a geoproperty will provoke an additional index in mongo, so you will want to minimize the number of different names for geoproperties. About "location", "observationSpace" & "operationalSpace": those three are just predefined names for geoproperties that we "recommend" from ETSI ISG CIM (where the NGSI-LD API is defined).
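
For illustration only, a geo query on a custom geoproperty might look like this (a sketch: the host, entity type, and coordinates are assumptions, not from this thread; "locationArea" is the attribute name mentioned above):

```shell
curl -G 'http://localhost:1026/ngsi-ld/v1/entities' \
  --data-urlencode 'type=Building' \
  --data-urlencode 'georel=near;maxDistance==2000' \
  --data-urlencode 'geometry=Point' \
  --data-urlencode 'coordinates=[32.85,39.93]' \
  --data-urlencode 'geoproperty=locationArea'
```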

I'd also like to take the opportunity to say I'm sorry for still not having solved this issue (from August!!!) ... Very sorry about that. I'm just swamped with work ...

We just hired a new developer for the broker in the FIWARE Foundation and I'm putting him on this TODAY.

See you very soon!

ttutuncu commented 1 year ago

Thank you for your response. As we discussed, I am providing a summary of the situation and a brief analysis of what needs to be done. We can work on the wildcard index part, since the latest version can now run on MongoDB 4.4, but the startup part has higher priority, so let's fix that first. If you require anything further, please don't hesitate to ask.

Current Challenge

Our Orion-LD instance, hosted on a 2CPU 20GB RAM Kubernetes worker node, is experiencing substantial resource consumption during startup, often exceeding the allocated CPU and RAM resources. This issue not only extends the startup time to approximately 25 minutes but also sometimes results in failures due to 'oomkilled' errors when the RAM is insufficient. Moreover, we have been unable to retrieve debug traces from the application, despite various attempts using different configurations.

Tested Versions: Orion-LD v1.1.2 and v1.4.0

Current Data Volume: ~370k entities, 1.7 GB collection size

Anticipated Data Volume: millions of documents

Technical Analysis

Upon a detailed analysis of the initialization part of the code, I pinpointed the mongocGeoIndexInit function as the primary contributor to the delay. This function retrieves all attribute objects from the database, causing a spike in RAM usage, and iterates through each attribute to create indexes for those identified as GeoProperty. This process is not only time-consuming but also redundant, as it re-examines attributes with identical GeoProperty keys over and over.

Here is a snippet of the current initialization code where the delay occurs:

bool mongocGeoIndexInit(void)
{
  // ... (existing code)

  while (mongoc_cursor_next(mongoCursorP, &mongoDocP))
  {
    // ... (existing code)

    //
    // Foreach Attribute, check if GeoProperty and if so create its geo index
    //
    for (KjNode* attrP = attrsP->value.firstChildP; attrP != NULL; attrP = attrP->next)
    {
      KjNode* typeP = kjLookup(attrP, "type");

      if (typeP == NULL)
      {
        LM_E(("Database Error (attribute '%s' has no 'type' field)", attrP->name));
        continue;
      }

      if (typeP->type != KjString)
      {
        LM_E(("Database Error (attribute with a 'type' field that is not a string)"));
        continue;
      }

      if (strcmp(typeP->value.s, "GeoProperty") == 0)
      {
        if (dbGeoIndexLookup(tenantP->tenant, attrP->name) == NULL)
          mongocGeoIndexCreate(tenantP, attrP->name);
      }
    }
  }

  // ... (existing code)
}

Proposed Solution

To enhance the efficiency and performance of the initialization process, I propose modifying how GeoProperties are queried and indexed. By implementing a MongoDB aggregation that filters and groups distinct GeoProperty attributes at the database level, we can significantly reduce the startup time and resource consumption. This approach returns only a minimal number of records, thereby streamlining the process.

Here are the MongoDB aggregations that can be utilized to identify distinct GeoProperty attributes, attributes without a type field, and attributes with a non-string type field:

1. Finding Attributes of Type GeoProperty:

 db.entities.aggregate([
       {
           $project: {
               attributeKeys: {
                   $filter: {
                       input: { $objectToArray: "$attrs" },
                       as: "item",
                       cond: { $eq: ["$$item.v.type", "GeoProperty"] }
                   }
               }
           }
       },
       {
           $unwind: "$attributeKeys"
       },
       {
           $group: {
               _id: "$attributeKeys.k"
           }
       }
   ]);

Example Output:

[{
  "_id": "https://uri=etsi=org/ngsi-ld/default-context/locationArea"
},{
  "_id": "location"
}]

2. Finding Attributes Without a Type Field:

db.entities.aggregate( [
    {
        $project: {
            attributeKeys: {
                $filter: {
                    input: { $objectToArray: "$attrs" },
                    as: "item",
                    cond: { $eq: [ { $ifNull: [ "$$item.v.type", null ] }, null ] }
                }
            }
        }
    },
    {
        $unwind: "$attributeKeys"
    },
    {
        $group: {
            _id: "$_id",
            attrs: { $push: "$attributeKeys.k" }
        }
    },
    {
        $project: {
            id: "$_id",
            attrs: 1,
            _id: 0
        }
    }
]);

Example Output:

[{
  "attrs": [
    "https://uri=etsi=org/ngsi-ld/default-context/cameraType"
  ],
  "id": {
    "id": "urn:ngsi-ld:Camera:10.10.0.135",
    "type": "https://uri.etsi.org/ngsi-ld/default-context/Camera",
    "servicePath": "/"
  }
},
{
  "attrs": [
    "https://uri=etsi=org/ngsi-ld/default-context/cameraName",
    "https://uri=etsi=org/ngsi-ld/default-context/brandName"
  ],
  "id": {
    "id": "urn:ngsi-ld:Camera:10.10.0.136",
    "type": "https://uri.etsi.org/ngsi-ld/default-context/Camera",
    "servicePath": "/"
  }
}
]

3. Finding Attributes with a Non-String Type Field:

db.entities.aggregate( [
    {
        $project: {
            attributeKeys: {
                $filter: {
                    input: { $objectToArray: "$attrs" },
                    as: "item",
                    cond: { 
                        $and: [ 
                            { $ne: [ { $ifNull: [ "$$item.v.type", null ] }, null ] },
                            { $ne: [ { $type: "$$item.v.type" }, "string" ] }
                        ]
                    }
                }
            }
        }
    },
    {
        $unwind: "$attributeKeys"
    },
    {
        $group: {
            _id: "$_id",
            attrs: { $push: "$attributeKeys.k" }
        }
    },
    {
        $project: {
            id: "$_id",
            attrs: 1,
            _id: 0
        }
    }
]);

Example Output:

{
  "attrs": [
    "https://uri=etsi=org/ngsi-ld/default-context/cameraName"
  ],
  "id": {
    "id": "urn:ngsi-ld:Camera:10.10.0.136",
    "type": "https://uri.etsi.org/ngsi-ld/default-context/Camera",
    "servicePath": "/"
  }
}

Furthermore, I suggest considering wildcard indexes on the attrs object to speed up the GeoProperty type lookup, a feature supported in MongoDB 4.2 and above. This would let us read the type value straight from the index.
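
A sketch of such an index in mongo shell syntax (whether Orion-LD's actual query shapes would use it needs verification):

```javascript
// Wildcard index covering every field under "attrs" (MongoDB 4.2+)
db.entities.createIndex({ "attrs.$**": 1 });
```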

Conclusion and Next Steps

I firmly believe that these proposed modifications will substantially improve the initialization process, preparing us for the scalability required as our data grows to millions of documents.

I am eager to collaborate closely with your team to refine and implement this solution, ensuring a robust and efficient system as we move forward. Your expertise and feedback will be invaluable in this endeavor.