microsoft / msquic

Cross-platform, C implementation of the IETF QUIC protocol, exposed to C, C++, C# and Rust.
MIT License

Crash on Ubuntu w/loopback MTU of 1500 #4618

Open larseggert opened 1 month ago

larseggert commented 1 month ago

Describe the bug

With lo configured to use an MTU of 1500, i.e., sudo ip link set dev lo mtu 1500, I see a crash:

[Current thread is 1 (Thread 0x7af5d68006c0 (LWP 1596334))]
(gdb) bt
#0  __memcpy_avx_unaligned_erms () at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:265
#1  0x00007af5d7c4bbc4 in memcpy (__len=<optimized out>, __src=<optimized out>, __dest=<optimized out>) at /usr/include/x86_64-linux-gnu/bits/string_fortified.h:29
#2  QuicStreamSendBufferRequest (Stream=Stream@entry=0x7af5d0033db0, Req=Req@entry=0x7af5d0034b60) at /opt/actions-runner/_work/neqo/neqo/msquic/src/core/stream_send.c:500
#3  0x00007af5d7c77e9b in QuicSendBufferFill (Connection=Connection@entry=0x7af5d0017480) at /opt/actions-runner/_work/neqo/neqo/msquic/src/core/send_buffer.c:181
#4  0x00007af5d7c4ba68 in QuicStreamCompleteSendRequest (Stream=Stream@entry=0x7af5d0033db0, SendRequest=0x7af5d00345b0, Canceled=Canceled@entry=0 '\000',
    PreviouslyPosted=PreviouslyPosted@entry=1 '\001') at /opt/actions-runner/_work/neqo/neqo/msquic/src/core/stream_send.c:469
#5  0x00007af5d7c4d46f in QuicStreamOnAck (Stream=0x7af5d0033db0, PacketFlags=..., FrameMetadata=FrameMetadata@entry=0x7af5d001a000)
    at /opt/actions-runner/_work/neqo/neqo/msquic/src/core/stream_send.c:1523
#6  0x00007af5d7c6fc38 in QuicLossDetectionOnPacketAcknowledged (LossDetection=LossDetection@entry=0x7af5d0017ef8, EncryptLevel=EncryptLevel@entry=QUIC_ENCRYPT_LEVEL_1_RTT,
    Packet=0x7af5d0019fa0, IsImplicit=IsImplicit@entry=0 '\000', AckTime=AckTime@entry=679287734904, AckDelay=AckDelay@entry=0)
    at /opt/actions-runner/_work/neqo/neqo/msquic/src/core/loss_detection.c:562
#7  0x00007af5d7c717e8 in QuicLossDetectionProcessAckBlocks (LossDetection=0x7af5d0017ef8, Path=0x7af5d00175b0, Packet=0x7af5d00012b8, EncryptLevel=QUIC_ENCRYPT_LEVEL_1_RTT,
    AckDelay=0, AckBlocks=<optimized out>, InvalidAckBlock=0x7af5d67ff9a0 "", Ecn=0x0) at /opt/actions-runner/_work/neqo/neqo/msquic/src/core/loss_detection.c:1484
#8  0x00007af5d7c71d60 in QuicLossDetectionProcessAckFrame (LossDetection=LossDetection@entry=0x7af5d0017ef8, Path=Path@entry=0x7af5d00175b0,
    Packet=Packet@entry=0x7af5d00012b8, EncryptLevel=EncryptLevel@entry=QUIC_ENCRYPT_LEVEL_1_RTT, FrameType=FrameType@entry=QUIC_FRAME_ACK,
    BufferLength=BufferLength@entry=17, Buffer=0x7af5d000134e "\002@�", Offset=0x7af5d67ff98e, InvalidFrame=0x7af5d67ff9a0 "")
    at /opt/actions-runner/_work/neqo/neqo/msquic/src/core/loss_detection.c:1683
#9  0x00007af5d7c5cf47 in QuicConnRecvFrames (Connection=Connection@entry=0x7af5d0017480, Path=<optimized out>, Packet=Packet@entry=0x7af5d00012b8,
    ECN=ECN@entry=CXPLAT_ECN_NON_ECT) at /opt/actions-runner/_work/neqo/neqo/msquic/src/core/connection.c:4507
#10 0x00007af5d7c5e23f in QuicConnRecvDatagramBatch (Connection=Connection@entry=0x7af5d0017480, Path=<optimized out>, Path@entry=0x7af5d00175b0,
    BatchCount=BatchCount@entry=1 '\001', Packets=Packets@entry=0x7af5d67ffbb0, Cipher=Cipher@entry=0x7af5d67ffbf0 " �\230\201\210#�\004��\037U�\203\221S��",
    RecvState=RecvState@entry=0x7af5d67ffba4) at /opt/actions-runner/_work/neqo/neqo/msquic/src/core/connection.c:5527
#11 0x00007af5d7c5e9b5 in QuicConnRecvDatagrams (Connection=Connection@entry=0x7af5d0017480, Packets=0x0, Packets@entry=0x7af5d00012b8,
    PacketChainCount=PacketChainCount@entry=1, PacketChainByteCount=PacketChainByteCount@entry=47, IsDeferred=IsDeferred@entry=0 '\000')
    at /opt/actions-runner/_work/neqo/neqo/msquic/src/core/connection.c:5762
#12 0x00007af5d7c5eeef in QuicConnFlushRecv (Connection=Connection@entry=0x7af5d0017480) at /opt/actions-runner/_work/neqo/neqo/msquic/src/core/connection.c:5872
#13 0x00007af5d7c61535 in QuicConnDrainOperations (Connection=Connection@entry=0x7af5d0017480, StillHasPriorityWork=StillHasPriorityWork@entry=0x7af5d67ffd70 "")
    at /opt/actions-runner/_work/neqo/neqo/msquic/src/core/connection.c:7624
#14 0x00007af5d7c50303 in QuicWorkerProcessConnection (Worker=Worker@entry=0x5fd3ed26aa40, Connection=0x7af5d0017480, ThreadID=<optimized out>,
    TimeNow=TimeNow@entry=0x7af5d67ffe70) at /opt/actions-runner/_work/neqo/neqo/msquic/src/core/worker.c:574
#15 0x00007af5d7c507c5 in QuicWorkerLoop (Context=0x5fd3ed26aa40, State=0x7af5d67ffe70) at /opt/actions-runner/_work/neqo/neqo/msquic/src/core/worker.c:735
#16 0x00007af5d7c82d64 in CxPlatRunExecutionContexts (Worker=Worker@entry=0x5fd3ed266c70, State=State@entry=0x7af5d67ffe70)
    at /opt/actions-runner/_work/neqo/neqo/msquic/src/platform/platform_worker.c:478
#17 0x00007af5d7c82f8f in CxPlatWorkerThread (Context=0x5fd3ed266c70) at /opt/actions-runner/_work/neqo/neqo/msquic/src/platform/platform_worker.c:576
#18 0x00007af5d749ca94 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:447
#19 0x00007af5d7529c3c in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:78

Affected OS

Additional OS information

Linux t-linux64-ms-280 6.8.0-47-generic #47-Ubuntu SMP PREEMPT_DYNAMIC Fri Sep 27 21:40:26 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux

MsQuic version

main

Steps taken to reproduce bug

https://github.com/mozilla/neqo/blob/b40b73c20a591011c53b45a202d48b820e5bd0ff/.github/workflows/bench.yml#L97-L179

Expected behavior

No crash.

Actual outcome

See dump above.

Additional details

No response

larseggert commented 1 month ago

It seems the crash only goes away when the MTU is both divisible by 16 and greater than 1500; e.g., 1504 works but 1488 doesn't.
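
To make the pattern explicit, here is a minimal C sketch of the condition as observed so far. It encodes only the empirical observation from this thread (divisible by 16 and greater than 1500); `mtu_avoids_crash` is a hypothetical helper name, not an MsQuic function, and it says nothing about the root cause.

```c
#include <stdbool.h>

/* Hypothetical helper: captures only the pattern observed in this issue,
 * not any rule taken from the MsQuic sources. */
static bool mtu_avoids_crash(unsigned mtu)
{
    /* Observed so far: the MTU must be a multiple of 16 AND above 1500. */
    return (mtu % 16 == 0) && (mtu > 1500);
}
```

Under this predicate, 1504 passes, 1488 fails (a multiple of 16, but not above 1500), and 1500 fails (not a multiple of 16), which matches the values tried above.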

nibanks commented 1 month ago

Weird. I don't understand why a memcpy of a buffer would crash here.

larseggert commented 1 month ago

I don't get it either. We're running this inside perf and hyperfine in CI, but that shouldn't matter.