Open kpschoedel opened 2 years ago
@dhrishi
The payload pointer may be outside the pbuf depending how it's allocated. You'd have to trace the path for incoming packets
The payload pointer may be outside the pbuf depending how it's allocated. You'd have to trace the path for incoming packets
It looks like that is the case; the incoming buffers are PBUF_REF with the payload somewhere else.
The current PacketBuffer
code, and the LwIP receive path, fundamentally assume that the pbuf
and payload are part of one contiguous block, and use the space in betwwn, so this isn't going to work.
Oh, wow. So the fact that it ever worked at all was basically just luck? We were stomping on parts of the pbuf pool, but they happened to be ok to stomp on, by accident???
I think the reason it ‘works’ is that LwIP reads the whole low-level frame into a buffer, and then consumes up through the IP header, and returns a payload pointer to that point. So when the code reuses that space, by accessing it relative to the payload pointer, it works. #19244 tried accessing it relative to the header, and crashed.
EnsureReservedSize()
always claims success, because ReservedSize()
casts the pointer difference to unsigned, so everything looks like a large reserve.
I don't think there's any hope of replacing all the ‘reserve’ code and uses for 1.0, so I think the path forward is:
PBUF_RAM
or PBUF_POOL
(I don't know where to do this);CHIP_SYSTEM_CONFIG_PACKETBUFFER_LWIP_PBUF_TYPE
must be one of these buffer types;LwIPReceiveUDPMessage()
and PacketBuffer::New()
that I know of), do VerifyOrDie(p->flags & PBUF_TYPE_FLAG_STRUCT_DATA_CONTIGUOUS)
.So for incoming pbufs, the only thing we use the reserve stuff for is the packetinfo, right? I guess the problem is how we send through the packetbuffer and the packetinfo to the right Matter thread. Seems like the only options are:
PBUF_TYPE_FLAG_STRUCT_DATA_CONTIGUOUS
, because I think it should hold there....
- We do what you suggest above, send the packetinfo in the reserved space. This probably requires LwIP to do a copy of the data, right?
That may be so, though I don't know LwIP internals. (I speculate that it may be using the separate-payload model because some hardware may make a page-aligned payload advantageous.)
- We keep doing what we do now, with static asserts that a packetinfo is always smaller than the IP header .... Seems dubious.
- We allocate (heap? pool?) a struct that has the packetinfo and the pointer to the pbuf, and then pass that to the Matter thread.
Yes, @tcarmelveilleux suggested similar in #17213. But I naïvely thought that just fixing the alignment would be enough for now…
If the reserve concept doesn't get ripped out entirely (which would definitely simplify PacketBuffer), then ReservedSize()
should also check for the CONTIGUOUS flag and return zero otherwise.
If the reserve concept doesn't get ripped out entirely
I think we use it quite a bit on the save path to pre-reserve space for our headers....
then ReservedSize() should also check for the CONTIGUOUS flag and return zero otherwise.
Yes, that would make a lot of sense.
I checked how IPPacketInfo
class is used in UDPEndPointImplLwIP.cpp
. The packet info is copied in UDPEndPointImplLwIP::HandleDataReceived
.
We can modify the signature to void HandleDataReceived(chip::System::PacketBufferHandle && aBuffer, const PacketInfo &aPacketInfo);
and pass the packet info explicitly in the lambda.
Code will be like:
364 IPPacketInfo pktInfo;
365 pktInfo->SrcAddress = IPAddress(*addr);
366 pktInfo->DestAddress = IPAddress(*ip_current_dest_addr());
367 pktInfo->Interface = InterfaceId(ip_current_netif());
368 pktInfo->SrcPort = port;
369 pktInfo->DestPort = pcb->local_port;
370
371 ep->Retain();
372 CHIP_ERROR err = ep->GetSystemLayer().ScheduleLambda([ep, p = System::LwIPPacketBufferView::UnsafeGetLwIPpbuf(buf), pktInfo] {
373 ep->HandleDataReceived(System::PacketBufferHandle::Adopt(p), pktInfo);
374 ep->Release();
375 });
Will this eliminate the need to access the IP and link layer header of the pbuf? Doing such thing is always dangerous.
That's basically sweeping the heap-allocation of the IPPacketInfo under the rug into the lambda allocation, right? I would rather make any sort of sizeable allocations explicit, so if they fail we gracefully return error instead of throwing uncaught exceptions or whatever happens if we fail to allocate the lambda....
@kpschoedel In the SVE test the issue showed up again in group message tests.
In SessionManager.cpp
we are cloning the packet buffer:
659 while (!decrypted && iter->Next(groupContext))
660 {
661 // Optimization to reduce number of decryption attempts
662 if (groupId != groupContext.group_id)
663 {
664 continue;
665 }
666 msgCopy = msg.CloneData();
667 CryptoContext::NonceStorage nonce;
668 CryptoContext::BuildNonce(nonce, packetHeader.GetSecurityFlags(), packetHeader.GetMessageCounter(),
669 packetHeader.GetSourceNodeId().Value());
670 decrypted = (CHIP_NO_ERROR ==
671 SecureMessageCodec::Decrypt(CryptoContext(groupContext.key), nonce, payloadHeader, packetHeader, msgCopy));
672 }
The msg.CloneData()
fails because the pbuf does not fit in the packet buffer's memory model at all. Raised https://github.com/project-chip/connectedhomeip/pull/20923 as a quick fix.
Problem
Derived from #19289 — added some pbuf logging on master and ran all-clusters-app on M5Stack (patch and sample log attached).
Every pbuf handed to
LwIPReceiveUDPMessage()
seems to have a badpayload
pointer, outside the buffer. The pointers vary from run to run but are somewhat consistent within a run; for the same pbuf, the payload pointer is often (but not always) the same. The payload pointers always seem to be moderately close to the pbuf (within 32K), but seem to never be within the buffer.Examples extracted from the attached log —
0001-LOG.patch.txt log.chip-all-clusters-app.20220608181346.txt