project-chip / connectedhomeip

Matter (formerly Project CHIP) creates more connections between more objects, simplifying development for manufacturers and increasing compatibility for consumers, guided by the Connectivity Standards Alliance.
https://buildwithmatter.com
Apache License 2.0

[EFR32] Running out of memory to flatten incoming packets #21265

Open rosahay-silabs opened 2 years ago

rosahay-silabs commented 2 years ago

Problem

bzbarsky-apple commented 2 years ago
[IN] No memory to flatten incoming packet buffer chain of size 1452

@rosahay-silabs This is not "running out of memory". This is "incoming thing is larger than what we can fit in a PacketBuffer". In particular, 1452 > 1280, no?

To be precise, we're in UDPEndPointImplLwIP::LwIPReceiveUDPMessage and we're calling PacketBufferHandle::New with (1452, 0) as the arguments. That's going to fail if kMaxSizeWithoutReserve is 1280, as it sounds like it is in your case.
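The size check described above can be modeled with a minimal sketch. This is not the SDK's code: `ModelNewBuffer` and `kModelMaxSizeWithoutReserve` are hypothetical stand-ins, and the 1280-byte limit is only the value assumed in this thread. The real limit depends on platform configuration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Assumed limit, taken from the discussion above (hypothetical constant name).
constexpr std::size_t kModelMaxSizeWithoutReserve = 1280;

// Hypothetical stand-in for a packet buffer allocation: the request is
// rejected up front when available + reserved size exceeds the limit,
// no matter how much heap or pool memory is actually free.
std::optional<std::vector<std::uint8_t>> ModelNewBuffer(std::size_t availableSize,
                                                        std::uint16_t reservedSize)
{
    const std::size_t allocSize = availableSize + reservedSize;
    if (allocSize > kModelMaxSizeWithoutReserve)
    {
        return std::nullopt; // "too large", not "out of memory"
    }
    return std::vector<std::uint8_t>(allocSize);
}
```

Under these assumptions, `ModelNewBuffer(1452, 0)` fails while `ModelNewBuffer(1280, 0)` succeeds, which is why the log message appears even when memory is not exhausted.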

So the real question is: why are we being handed a 1452-byte thing?

The log shows this happening earlier too, followed by a factory reset. Did we get into this state twice in that log? Or is this just something that happens sometimes as other things on the network send us packets that might not fit into a single IPv6 minimal MTU?

rosahay-silabs commented 2 years ago

@bzbarsky-apple,

The log shows this happening earlier too, followed by a factory reset. Did we get into this state twice in that log? Or is this just something that happens sometimes as other things on the network send us packets that might not fit into a single IPv6 minimal MTU?

The log here covers a full round of idle testing. The test sequence is: commission the DUT, idle for 1 hour, run basic cases such as unicast and multicast, then re-commission.

I'll need to run a Wireshark capture to find out why we are getting a 1452-byte buffer.

In another issue, the logic around LwIPReceiveUDPMessage seemed incorrect, which is being looked at in #20923. Any insights on that?

bzbarsky-apple commented 2 years ago

I don't have any great insights on #20923; I haven't had a chance to dig through those bits yet. It's possible that we're getting incorrect packet sizes for some reason as a result of that, yes...

praveenCY commented 2 years ago

I've seen the same issue on the Infineon platform with the latest master branch:

CHIP:CSL: PacketBuffer: allocation too large.
CHIP:IN: Cannot copy received pbuf of size 1445

A packet capture is in the image below.

andy31415 commented 2 years ago

I would generally expect mDNS to have a packet size limit. Does this cause actual errors in functionality, or are we just refusing to process such packets (and logging an error as a result)? Just refusing does not seem as bad at first glance, as these mDNS packets are not Matter-specific.

praveenCY commented 2 years ago

RAW Packet which is causing this problem:

3333000000fbeca90710a06d86dd60070b0005ad11fffe800000000000000c8b3928c5872d64ff0200000000000000000000000000fb14e914e905adb29e0000840000000016000000012037302d33352d36302d36332e3120456e7465727461696e6d656e7420526f6f6d0c5f736c6565702d70726f7879045f756470056c6f63616c000010800100001194000100c02d000c0001000011940002c00cc00c0021800100000078001b00000000c0bc12456e7465727461696e6d656e742d526f6f6dc03f3565633a61393a30373a31303a61303a366440666538303a3a656561393a3766663a666531303a613036642d737570706f72747352500e5f6170706c652d6d6f6264657632045f746370c03f0010800100001194000100095f7365727669636573075f646e732d7364c03a000c0001000011940002c0bc083935373930383038045f737562c0bc000c0001000011940002c086c0bc000c0001000011940002c086c08600218001000000780008000000007ef2c071104d79486f6d6531343030373937333039085f6d657368636f70c03a0010800100001194005c0472763d31136e6e3d4d79486f6d65313430303739373330390b78703dd62b5f755c0d42020874763d312e322e300d766e3d4170706c6520496e632e0b78613d921b70438a2fa3c70b64643d921b70438a2fa3c70773623d00000031c0dd000c0001000011940002c14cc14c000c0001000011940002c13b12456e7465727461696e6d656e7420526f6f6d095f7372706c2d746c73c0cb0010800100001194003f23646f6d61696e3d6f70656e7468726561642e7468726561642e686f6d652e617270612e1a7365727665722d69643d37326565353366396363396562663335c0dd000c0001000011940002c1ecc1ec000c0001000011940002c1d912456e7465727461696e6d656e7420526f6f6d0c5f6465766963652d696e666fc0cb0010000100001194000d0c6d6f64656c3d4a3330354150c13b0021800100000078000800000000c027c071c1d900218001000000780008000000000355c07112456e7465727461696e6d656e7420526f6f6d0f5f636f6d70616e696f6e2d6c696e6bc0cb001080010000119400be0772704d61633d32117270484e3d6130623836313833643339300c7270466c3d3078423637383211727048413d6564656265333930313339301072704d643d4170706c65545631312c310b727056723d3430302e353111727041443d38636562623363656639656111727048493d39353839313235666164313716727042413d38363a32323a31453a33393a37433a44332c72704d527449443d30394538334241432d353537312d343944312d424146302d4343454545344637443438
37c0dd000c0001000011940002c2d1c2d1000c0001000011940002c2bec2be0021800100000078000800000000c002c07112456e7465727461696e6d656e7420526f6f6d085f616972706c6179c0cb001080010000119401850561636c3d30186274616464723d30303a30303a30303a30303a30303a30301a64657669636569643d45433a41393a30373a30343a46353a4241126665783d3164392f5374352f4662776f6f511e66656174757265733d307834413746444644352c307842433135374644450b666c6167733d3078363434286769643d43344434424245342d464245332d343644392d394337412d3438393338443542413734420569676c3d31066763676c3d31116d6f64656c3d4170706c65545631312c310d70726f746f766572733d312e312770693d37393836303863642d346638622d343639332d626461352d326164366261613032386332287073693d30394538334241432d353537312d343944312d424146302d43434545453446374434383743706b3d3032613239643932376539366431356362383862623430386663663330623436646632316630303463303062623132346531626433306531633238373361346210737263766572733d3633352e38372e330b6f73766572733d31362e300476763d3200002905a00000119400120004000e0000eca90704f5baeca90710a06d
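As a rough check, the first 62 bytes of the raw packet above (Ethernet, IPv6, and UDP headers) can be decoded with a short sketch. The helper names (`HexToBytes`, `ReadBE16`, `Summarize`, `UdpSummary`) are hypothetical, and the code assumes an untagged Ethernet frame carrying IPv6 with no extension headers, which is what these bytes appear to be.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// First 62 bytes of the raw packet above, split by field for readability.
const std::string kHeaderHex =
    "3333000000fb" "eca90710a06d" "86dd"      // Ethernet: dst MAC, src MAC, EtherType
    "60070b00" "05ad" "11" "ff"               // IPv6: ver/class/flow, payload len, next hdr, hop limit
    "fe80000000000000" "0c8b3928c5872d64"     // IPv6 source address
    "ff02000000000000" "00000000000000fb"     // IPv6 destination address (ff02::fb)
    "14e9" "14e9" "05ad" "b29e";              // UDP: src port, dst port, length, checksum

std::vector<std::uint8_t> HexToBytes(const std::string & hex)
{
    assert(hex.size() % 2 == 0);
    std::vector<std::uint8_t> out;
    for (std::size_t i = 0; i < hex.size(); i += 2)
    {
        out.push_back(static_cast<std::uint8_t>(std::stoul(hex.substr(i, 2), nullptr, 16)));
    }
    return out;
}

// Reads a big-endian (network order) 16-bit value at the given offset.
std::uint16_t ReadBE16(const std::vector<std::uint8_t> & b, std::size_t off)
{
    return static_cast<std::uint16_t>((b[off] << 8) | b[off + 1]);
}

struct UdpSummary
{
    std::uint16_t etherType;
    std::uint16_t ipv6PayloadLength;
    std::uint8_t nextHeader;
    std::uint16_t srcPort;
    std::uint16_t dstPort;
    std::uint16_t udpLength;
    std::size_t udpPayloadSize;
};

UdpSummary Summarize(const std::vector<std::uint8_t> & frame)
{
    constexpr std::size_t kEth = 14, kIp6 = 40, kUdp = 8;
    assert(frame.size() >= kEth + kIp6 + kUdp);
    UdpSummary s{};
    s.etherType         = ReadBE16(frame, 12);
    s.ipv6PayloadLength = ReadBE16(frame, kEth + 4);
    s.nextHeader        = frame[kEth + 6];
    s.srcPort           = ReadBE16(frame, kEth + kIp6);
    s.dstPort           = ReadBE16(frame, kEth + kIp6 + 2);
    s.udpLength         = ReadBE16(frame, kEth + kIp6 + 4);
    s.udpPayloadSize    = s.udpLength - kUdp; // UDP length includes its 8-byte header
    return s;
}
```

Calling `Summarize(HexToBytes(kHeaderHex))` gives EtherType 0x86dd (IPv6), next header 17 (UDP), source and destination port 5353, and a UDP length of 1453, so the UDP payload is 1445 bytes. That matches the "Cannot copy received pbuf of size 1445" log line, and port 5353 with destination ff02::fb is consistent with this being an mDNS multicast packet.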

stale[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.

maxim-sloyko-yohana commented 7 months ago

To be precise, we're in UDPEndPointImplLwIP::LwIPReceiveUDPMessage and we're calling PacketBufferHandle::New with (1452, 0) as the arguments. That's going to fail if kMaxSizeWithoutReserve is 1280, as it sounds like it is in your case.

Doesn't this implementation defeat the purpose of PBUF chaining in LwIP? Correct me if I'm wrong, but in a network with a lot of small-packet traffic (AFAICT most mDNS packets are under 250 bytes), every packet still needs to be given a full ~1.5k PBUF, which is quite a lot of overhead. Am I missing something? Is there a way to handle both a lot of small packets and occasional big ones without a ton of overhead?