FIWARE / context.Orion-LD

Context Broker and CEF building block for context data management which supports both the NGSI-LD and the NGSI-v2 APIs
https://www.etsi.org/deliver/etsi_gs/CIM/001_099/009/01.06.01_60/gs_CIM009v010601p.pdf
GNU Affero General Public License v3.0

Memory issue #900

Open Neeraj-Nec opened 3 years ago

Neeraj-Nec commented 3 years ago

Hi,

I am getting the following memory issue while testing the upsert, subscription and query-by-ID APIs.

Error log:

/ngsi-ld/v1/subscriptions/ --------------------------
time=Wednesday 30 Jun 08:49:07 2021.496Z | lvl=TMP | corr=N/A | trans=N/A | from=N/A | srv=N/A | subsrv=N/A | comp=Orion | op=orionldMhdConnectionInit.cpp[527]:orionldMhdConnectionInit | msg=------------------------- Servicing NGSI-LD request 57601: POST /ngsi-ld/v1/subscriptions/ --------------------------
time=Wednesday 30 Jun 08:49:07 2021.497Z | lvl=TMP | corr=N/A | trans=N/A | from=N/A | srv=N/A | subsrv=N/A | comp=Orion | op=orionldMhdConnectionInit.cpp[527]:orionldMhdConnectionInit | msg=uri.etsi.org/ngsi-ld/v1/ngsi-ld-core-context.jsonld
free(): invalid size

df output of VM

Filesystem     1K-blocks    Used Available Use% Mounted on
udev             2013860       0   2013860   0% /dev
tmpfs             404632   41404    363228  11% /run
/dev/vda1       40593708 9144408  31432916  23% /
tmpfs            2023140       0   2023140   0% /dev/shm
tmpfs               5120       0      5120   0% /run/lock
tmpfs            2023140       0   2023140   0% /sys/fs/cgroup
tmpfs             404632       0    404632   0% /run/user/1000
overlay         40593708 9144408  31432916  23% /var/lib/docker/overlay2/c8b755ca4b407f3d5cd4cea6f67644a8dd2dbe9f9cfb9ebe0b61f1616856cfeb/merged
shm                65536       0     65536   0% /var/lib/docker/containers/aeb6d84ae9565da15168e4c4b6ab2af11aa605bb120f3cc62d568952ffccf23b/mounts/shm
overlay         40593708 9144408  31432916  23% /var/lib/docker/overlay2/5850ca9567bcd934e717575a739a03db40068aa9863070216814283ce6d6ebc1/merged
shm                65536       8     65528   1% /var/lib/docker/containers/0a013b99db69ede9d88166308f5e0932b9892ceaefdb48d0b41dedb1afba1cf6/mounts/shm

free -m output

              total        used        free      shared  buff/cache   available
Mem:           3951         382        2414          40        1154        3227
Swap:             0           0           0

JMeter script log :

Created the tree successfully using /root/JMeter/NGSILdOrionSub.jmx
Starting standalone test @ Wed Jun 30 08:34:58 UTC 2021 (1625042098772)
Waiting for possible Shutdown/StopTestNow/HeapDump/ThreadDump message on port 4445
summary +    217 in 00:00:01 =  347.8/s Avg:     2 Min:     1 Max:   151 Err:     0 (0.00%) Active: 3 Started: 4 Finished: 1
summary +  67738 in 00:00:30 = 2257.9/s Avg:    58 Min:     0 Max:  1084 Err:  1943 (2.87%) Active: 170 Started: 361 Finished: 191
summary =  67955 in 00:00:31 = 2219.0/s Avg:    58 Min:     0 Max:  1084 Err:  1943 (2.86%)
summary + 153010 in 00:00:30 = 5100.3/s Avg:    10 Min:     0 Max:  7011 Err: 153010 (100.00%) Active: 62 Started: 1132 Finished: 1070
summary = 220965 in 00:01:01 = 3644.8/s Avg:    25 Min:     0 Max:  7011 Err: 154953 (70.13%)
summary +  31036 in 00:00:06 = 5568.0/s Avg:     7 Min:     0 Max:  1746 Err: 31036 (100.00%) Active: 0 Started: 1261 Finished: 1261
summary = 252001 in 00:01:06 = 3806.7/s Avg:    23 Min:     0 Max:  7011 Err: 185989 (73.80%)
Tidying up ...    @ Wed Jun 30 08:36:05 UTC 2021 (1625042165575)
... end of run

kzangeli commented 3 years ago

OK, that's interesting ... might be a memory corruption - the worst kind of bug! However, with the info you're giving me here there is really not much to go on.

If you could make your run as small as possible while still provoking the error, and post the exact requests here (and I mean exact - I'd need every header, every single byte of payload body, etc., to reproduce the error), then I could try to reproduce it myself. Once I'm able to reproduce an error, it's normally (99%) a case of a few hours of debugging. [If it's really a memory corruption, then it will probably take longer, but normally not more than a few days.]

So, please, narrow it down and give me all the details, and I'll do my best to fix the error. No one is more interested in fixing it than me :)

Thanks for reporting!

Neeraj-Nec commented 3 years ago

Hi, I am getting the following issue again.

nldMhdConnectionInit.cpp[527]:orionldMhdConnectionInit | msg=------------------------- Servicing NGSI-LD request 16529: POST /ngsi-ld/v1/subscriptions/ --------------------------
time=Thursday 08 Jul 01:39:28 2021.802Z | lvl=TMP | corr=N/A | trans=N/A | from=N/A | srv=N/A | subsrv=N/A | comp=Orion | op=orionldMhdConnectionInit.cpp[527]:orionldMhdConnectionInit | msg=------------------------- Servicing NGSI-LD request 16530: POST /ngsi-ld/v1/subscriptions/ --------------------------
time=Thursday 08 Jul 01:39:28 2021.803Z | lvl=TMP | corr=N/A | trans=N/A | from=N/A | srv=N/A | subsrv=N/A | comp=Orion | op=orionldMhdConnectionInit.cpp[527]:orionldMhdConnectionInit | msg=------------------------- Servicing NGSI-LD request 16461: POST /ngsi-ld/v1/subscriptions/ --------------------------
time=Thursday 08 Jul 01:39:28 2021.803Z | lvl=TMP | corr=N/A | trans=N/A | from=N/A | srv=N/A | subsrv=N/A | comp=Orion | op=orionldMhdConnectionInit.cpp[527]:orionldMhdConnectionInit | msg=------------------------- Servicing NGSI-LD request 16531: POST /ngsi-ld/v1/subscriptions/ --------------------------
time=Thursday 08 Jul 01:39:28 2021.803Z | lvl=TMP | corr=N/A | trans=N/A | from=N/A | srv=N/A | subsrv=N/A | comp=Orion | op=orionldMhdConnectionInit.cpp[527]:orionldMhdConnectionInit | msg=------------------------- Servicing NGSI-LD request 16421: POST /ngsi-ld/v1/subscriptions/ --------------------------
corrupted size vs. prev_size

My VM has 16 GB of RAM and 4 CPUs.

My JMeter script is at https://github.com/smartfog/fogflow/blob/orionBug/test/orionbug/NGSILdOrionSub.jmx
The docker-compose file for orion-ld is at https://github.com/smartfog/fogflow/tree/development/test/orion-ld

Please let me know if you find anything incorrect in the JMeter script.

Neeraj-Nec commented 3 years ago

Hi Ken Zangelin,

I ran the same script against the Scorpio broker, changing only the port, and all the requests executed successfully there.

kzangeli commented 3 years ago

I'd need an example, preferably using requests in curl, as simple as possible, that still provokes the error. If I have a way to reproduce the issue, I can fix it. Without that, there is not much I can do.
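
As a starting point for a curl-based reproduction, here is a minimal sketch that prints one curl command per subscription request, so the exact headers and payload bytes can be posted in the issue. It is not derived from the JMeter script: the broker address (localhost:1026), the subscription ids, the entity type "Vehicle" and the notification endpoint are all assumptions chosen for illustration and would need to be replaced with the values that actually trigger the crash.

```python
import json
import shlex

# Assumed values, not taken from the issue: adjust to the broker under test.
BROKER_URL = "http://localhost:1026/ngsi-ld/v1/subscriptions/"
CORE_CONTEXT = "https://uri.etsi.org/ngsi-ld/v1/ngsi-ld-core-context.jsonld"


def build_subscription(index):
    """Return a small NGSI-LD subscription body (hypothetical content)."""
    return {
        "id": f"urn:ngsi-ld:Subscription:repro-{index}",
        "type": "Subscription",
        "entities": [{"type": "Vehicle"}],  # hypothetical entity type
        "notification": {
            "endpoint": {
                "uri": "http://localhost:8888/notify",  # hypothetical receiver
                "accept": "application/json",
            }
        },
        "@context": CORE_CONTEXT,
    }


def curl_command(payload, url=BROKER_URL):
    """Render a shell-safe curl command that POSTs the payload as JSON-LD."""
    body = json.dumps(payload, separators=(",", ":"))
    parts = [
        "curl", "-v", url,
        "-H", "Content-Type: application/ld+json",
        "-d", body,
    ]
    return " ".join(shlex.quote(part) for part in parts)


if __name__ == "__main__":
    # Print a handful of commands; nothing is sent to the broker from here.
    for i in range(3):
        print(curl_command(build_subscription(i)))
```

Because the script only prints commands, they can be run one at a time, or several in parallel from a shell, to check how few requests are needed before the broker aborts.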