FIWARE / context.Orion-LD

Context Broker and CEF building block for context data management which supports both the NGSI-LD and the NGSI-v2 APIs
https://www.etsi.org/deliver/etsi_gs/CIM/001_099/009/01.06.01_60/gs_CIM009v010601p.pdf
GNU Affero General Public License v3.0

Subscription Warning #1391

Open FR-ADDIX opened 1 year ago

FR-ADDIX commented 1 year ago

What is Orion-LD v1.2.0 trying to tell me with the following message?

time=Wednesday 21 Jun 10:06:50 2023.306Z | lvl=WARN | corr=N/A | trans=N/A | from=N/A | srv=N/A | subsrv=N/A | comp=Orion | op=subCacheAlterationMatch.cpp[148]:matchLookup | msg=Different entity (urn:ngsi-ld:AirQualityObserved:Umweltbundesamt:DESH056 vs urn:ngsi-ld:AirQualityObserved:Umweltbundesamt:DESH027) - need to add it to the notification for sub urn:ngsi-ld:subscription:c4ac8dc8-0ed5-11ee-9265-be65a175e8ce

There are currently 4 entities:

[
  {
    "id": "urn:ngsi-ld:AirQualityObserved:Umweltbundesamt:DESH027",
    "type": "AirQualityObserved",
    "name": {
      "type": "Property",
      "value": "Kiel-Bahnhofstr. Verk. Umweltbundesamt DESH027"
    }
  },
  {
    "id": "urn:ngsi-ld:AirQualityObserved:Umweltbundesamt:DESH052",
    "type": "AirQualityObserved",
    "name": {
      "type": "Property",
      "value": "Kiel Theodor-Heuss-Ring Verk. Umweltbundesamt DESH052"
    }
  },
  {
    "id": "urn:ngsi-ld:AirQualityObserved:Umweltbundesamt:DESH056",
    "type": "AirQualityObserved",
    "name": {
      "type": "Property",
      "value": "Eggebek Umweltbundesamt DESH056"
    }
  },
  {
    "id": "urn:ngsi-ld:AirQualityObserved:Umweltbundesamt:DESH057",
    "type": "AirQualityObserved",
    "name": {
      "type": "Property",
      "value": "Kiel Bremerskamp Verk. Umweltbundesamt DESH057"
    }
  }
]

This is the subscription that is addressed in the warning:

{
  "id": "urn:ngsi-ld:subscription:c4ac8dc8-0ed5-11ee-9265-be65a175e8ce",
  "type": "Subscription",
  "subscriptionName": "QL:AirQualityObserved",
  "description": "Historisierung der AirQualityObserved Daten",
  "entities": [
    {
      "idPattern": "AirQualityObserved",
      "type": "AirQualityObserved"
    }
  ],
  "watchedAttributes": [
    "dateObserved"
  ],
  "status": "active",
  "isActive": true,
  "notification": {
    "format": "normalized",
    "endpoint": {
      "uri": "http://quantumleap8.fiware-staging.svc:8668/v2/notify",
      "accept": "application/json",
      "receiverInfo": [
        {
          "key": "Fiware-Service",
          "value": "infoportal"
        }
      ]
    },
    "status": "ok",
    "timesSent": 19363,
    "lastNotification": "2023-06-21T10:26:57.486Z",
    "lastFailure": "2023-06-21T09:56:50.284Z",
    "lastSuccess": "2023-06-21T10:26:57.486Z"
  },
  "origin": "cache"
}

Is there something wrong?
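The warning names two different entities under one subscription, so it can help to see which entity IDs the subscription's idPattern covers. The following is a minimal sketch only: it assumes the pattern is applied as an unanchored regular expression (the broker's exact matching semantics are not confirmed in this thread), and the helper name `matching_ids` is hypothetical.

```python
import re

# idPattern taken from the subscription shown above.
ID_PATTERN = "AirQualityObserved"

# The four entity IDs listed above.
ENTITY_IDS = [
    "urn:ngsi-ld:AirQualityObserved:Umweltbundesamt:DESH027",
    "urn:ngsi-ld:AirQualityObserved:Umweltbundesamt:DESH052",
    "urn:ngsi-ld:AirQualityObserved:Umweltbundesamt:DESH056",
    "urn:ngsi-ld:AirQualityObserved:Umweltbundesamt:DESH057",
]


def matching_ids(pattern, entity_ids):
    """Return the IDs in which the pattern is found anywhere.

    Assumption: unanchored (search) semantics, which may differ
    from what the broker actually does.
    """
    regex = re.compile(pattern)
    return [entity_id for entity_id in entity_ids if regex.search(entity_id)]


if __name__ == "__main__":
    for entity_id in matching_ids(ID_PATTERN, ENTITY_IDS):
        print(entity_id)
```

Under that assumption, all four entities fall under the same subscription, so updates to several of them could plausibly be handled for the same subscription.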

kzangeli commented 1 year ago

Probably just an old trace line I overlooked before PR. Is anything failing?

FR-ADDIX commented 1 year ago

Actually, everything is running quite smoothly. But every now and then we have a crash with error 139. We noticed the warning while analyzing how this occurs.

Last State:  Terminated
  Reason:    Error
  Exit Code: 139
  Started:   Tue, 20 Jun 2023 04:54:29 +0200
  Finished:  Tue, 20 Jun 2023 07:27:46 +0200

kzangeli commented 1 year ago

Exit code 139 ... Any idea where that error code comes from? I don't think the broker itself ever exits with 139. Perhaps it's a Docker thing (the broker might be allocating too much RAM; I've heard something about that)?

FR-ADDIX commented 1 year ago

No, it is actually very frugal with RAM and CPU at the moment. We run in a K8s environment and allow 2 CPUs and 16 GB of RAM. The pod can claim this in blocks of 50m CPU and 128 MB. Of course, it is possible that the cluster does not have sufficient resources available at that moment because a completely independent process has claimed them. We will keep an eye on this; the tip may have been helpful.

kzangeli commented 1 year ago

OK, I know from performance tests that Kubernetes kills the broker if it allocates too much memory. Normally this problem would be taken care of by OS swapping, but unfortunately Kubernetes doesn't support swapping (someone from Red Hat told me that this was going to be solved, i.e. that Kubernetes would support swapping).

kzangeli commented 1 year ago

Quick search on "kubernetes error 139" gave me this:

Exit Code 139 means that the container received a SIGSEGV signal from the operating system. This indicates a segmentation error – a memory violation, caused by a container trying to access a memory location to which it does not have access.

It might be that the broker is crashing for you. If that is the case, I'd be really interested in getting more info on that. For example, could you start the broker inside valgrind? [ Valgrind would tell us exactly where (well, more or less) the problem lies - running the broker inside gdb would also work ]
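The exit code quoted above follows the common shell convention that a process terminated by a signal is reported as 128 plus the signal number. A small sketch of that arithmetic, using only Python's standard `signal` module (the helper name `signal_from_exit_code` is hypothetical, and the signal numbers are those of the platform the script runs on, assumed here to be Linux):

```python
import signal


def signal_from_exit_code(exit_code):
    """Return the signal name implied by a shell-style exit code.

    Exit codes above 128 conventionally mean "killed by signal
    (exit_code - 128)". Returns None for anything else, or if the
    number does not correspond to a known signal on this platform.
    """
    if exit_code <= 128:
        return None
    try:
        return signal.Signals(exit_code - 128).name
    except ValueError:
        return None


if __name__ == "__main__":
    for code in (139, 137):
        print(code, signal_from_exit_code(code))
```

On Linux, 139 maps to SIGSEGV (signal 11), matching the quoted explanation. By the same convention, 137 maps to SIGKILL (signal 9), which is the code usually seen when a container is killed for exceeding its memory limit, so 139 points more towards a crash in the process than towards an out-of-memory kill.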

FR-ADDIX commented 1 year ago

I have now lowered the RAM block requirement from 128Mi to 64Mi and have had no failures for several hours. We will continue to monitor this over the weekend and report back on Monday.