dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License

TCP Socket - Connection reset while receiving data #108421

Open Santoshramani opened 1 month ago

Santoshramani commented 1 month ago

Description

I have a TCP socket that receives an IP camera stream (playback data) via an NVR. Reading data from the socket worked fine in the Xamarin native build, but in the .NET 8 build the connection is reset as soon as the app starts receiving data, as seen in the pcap. My finding is that this happens because the socket buffer size is only 2 KB (as returned by mSocket.ReceiveBufferSize). Setting the receive buffer size to the expected value does not help either.

Reproduction Steps

Create a socket that reads bulk data in an Android project built with the .NET 8 SDK.

Expected behavior

The socket should not reset the connection when transferring bulk data.

Actual behavior

The socket connection is reset when transferring bulk data.

Regression?

The socket worked fine for bulk data in the Xamarin native build.

Known Workarounds

Increasing the socket's receive buffer size may help.
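
To check whether a requested receive buffer size actually takes effect, the value can be read back after it is set. The following is a minimal sketch, not code from the app in question: the class name and the 256 KB request are assumptions chosen for illustration, and the operating system may round, double, or cap the value it reports.

```csharp
using System;
using System.Net.Sockets;

// Hypothetical console check; names and sizes are illustrative only.
class ReceiveBufferCheck
{
    static void Main()
    {
        using var socket = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp);

        // Value reported before any change (the issue reports about 2 KB on Android).
        Console.WriteLine($"Default ReceiveBufferSize: {socket.ReceiveBufferSize}");

        // Request a larger buffer; 256 KB is an assumed value, not a recommendation.
        socket.ReceiveBufferSize = 256 * 1024;

        // Read the value back; the OS decides what is actually applied.
        Console.WriteLine($"After set: {socket.ReceiveBufferSize}");

        // The same option read through the generic socket option API.
        object? raw = socket.GetSocketOption(SocketOptionLevel.Socket, SocketOptionName.ReceiveBuffer);
        Console.WriteLine($"SO_RCVBUF via GetSocketOption: {raw}");
    }
}
```

If the value read back does not change, the request was not applied by the platform; if it does change and the failure persists, the buffer size is probably not the cause.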

Configuration

.NET SDK:
 Version:           8.0.401
 Commit:            811edcc344
 Workload version:  8.0.400-manifests.56cd0383
 MSBuild version:   17.11.4+37eb419ad

Runtime Environment:
 OS Name:     Mac OS X
 OS Version:  14.6
 OS Platform: Darwin
 RID:         osx-arm64
 Base Path:   /usr/local/share/dotnet/sdk/8.0.401/

.NET workloads installed:
Configured to use loose manifests when installing new manifests.
 [ios]
   Installation Source: SDK 8.0.400
   Manifest Version:    17.5.8020/8.0.100
   Manifest Path:       /usr/local/share/dotnet/sdk-manifests/8.0.100/microsoft.net.sdk.ios/17.5.8020/WorkloadManifest.json
   Install Type:        FileBased

 [android]
   Installation Source: SDK 8.0.400
   Manifest Version:    34.0.113/8.0.100
   Manifest Path:       /usr/local/share/dotnet/sdk-manifests/8.0.100/microsoft.net.sdk.android/34.0.113/WorkloadManifest.json
   Install Type:        FileBased

Host:
  Version:      8.0.8
  Architecture: arm64
  Commit:       08338fcaa5

.NET SDKs installed:
  8.0.303 [/usr/local/share/dotnet/sdk]
  8.0.401 [/usr/local/share/dotnet/sdk]

.NET runtimes installed:
  Microsoft.AspNetCore.App 6.0.32 [/usr/local/share/dotnet/shared/Microsoft.AspNetCore.App]
  Microsoft.AspNetCore.App 7.0.20 [/usr/local/share/dotnet/shared/Microsoft.AspNetCore.App]
  Microsoft.AspNetCore.App 8.0.0 [/usr/local/share/dotnet/shared/Microsoft.AspNetCore.App]
  Microsoft.AspNetCore.App 8.0.7 [/usr/local/share/dotnet/shared/Microsoft.AspNetCore.App]
  Microsoft.AspNetCore.App 8.0.8 [/usr/local/share/dotnet/shared/Microsoft.AspNetCore.App]
  Microsoft.NETCore.App 6.0.32 [/usr/local/share/dotnet/shared/Microsoft.NETCore.App]
  Microsoft.NETCore.App 7.0.20 [/usr/local/share/dotnet/shared/Microsoft.NETCore.App]
  Microsoft.NETCore.App 8.0.0 [/usr/local/share/dotnet/shared/Microsoft.NETCore.App]
  Microsoft.NETCore.App 8.0.7 [/usr/local/share/dotnet/shared/Microsoft.NETCore.App]
  Microsoft.NETCore.App 8.0.8 [/usr/local/share/dotnet/shared/Microsoft.NETCore.App]

Other architectures found:
  None

Environment variables:
  Not set

global.json file:
  Not found

Other information

No response

Santoshramani commented 1 month ago

@karelz @vitek-karas @akoeplinger please look into this issue.

rzikm commented 1 month ago

@Santoshramani Please don't tag individual people. The bot takes care of notifying the right team.

> My finding is this is because of socket buffer size is 2kb(as per mSocket.ReceiveBufferSize returns) only.

How do you know this? Does increasing the receive buffer size reliably resolve the issue?

> in dotnet 8 build as soon as app starts receiving data from socket connection getting reset as seen in pcap.

Is the socket being reset by the remote peer (the camera?) or locally (by the application)? Can you share the network captures for us to investigate?

Santoshramani commented 1 month ago

@rzikm

> Please don't tag individual people. The bot takes care of notifying the right team.

I need a solution as soon as possible, which is why I tagged them. I have to upload my app to the Play Store as soon as possible because of its policies.

> How do you know this? Does increasing the receive buffer size reliably resolve the issue?

Yes, I've tried increasing the receive buffer size, but that had no effect on performance.

> Is the socket being reset by the remote peer (the camera?) or locally (by the application)? Can you share the network captures for us to investigate?

The socket is being reset by the application. Sorry, I can't share the pcap because I don't have it right now, but when the socket is reset, the packet just before the reset is 'TCP Window Full'.

vitek-karas commented 1 month ago

Could you please try this with .NET 9 RC 1? It's very possible we fixed it there: https://github.com/dotnet/runtime/pull/104726.

Santoshramani commented 1 month ago

@vitek-karas Yes, I've already tried the .NET 9 RC 1 SDK, but it does not work with that either.

vitek-karas commented 1 month ago

/cc @simonrozsival

simonrozsival commented 1 month ago

@Santoshramani Hi! Would you be able to create a repro project? I'm not able to reproduce this issue on my end based on the issue description.

wfurt commented 1 month ago

Note that the socket buffer size should not impact the behavior. When the buffer is getting full, TCP shrinks the advertised window size. Also, is SSL involved, or is this only a plain socket? And if you use async, make sure you handle cases where the read finishes synchronously, something that did not happen (often) with older .NET releases.
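
One common way to handle synchronous completion with the BeginReceive()/EndReceive() pattern is to continue in a loop at the call site instead of re-issuing the receive from inside the callback, which avoids deep recursion when many reads complete immediately. The following is a minimal sketch under stated assumptions: `ApmReceiver`, the 16 KB buffer, and the loopback sender in `Main` are hypothetical and exist only to make the example runnable. It is not the code from the app in this issue.

```csharp
using System;
using System.Net;
using System.Net.Sockets;
using System.Threading;

// Hypothetical helper illustrating the pattern; not taken from the issue.
sealed class ApmReceiver
{
    private readonly Socket _socket;
    private readonly byte[] _buffer = new byte[16 * 1024];
    private long _total;

    public ManualResetEventSlim Done { get; } = new ManualResetEventSlim();
    public long Total => Interlocked.Read(ref _total);

    public ApmReceiver(Socket socket) => _socket = socket;

    public void Start()
    {
        try
        {
            while (true)
            {
                IAsyncResult ar = _socket.BeginReceive(
                    _buffer, 0, _buffer.Length, SocketFlags.None, OnReceive, null);

                // Asynchronous completion: the callback continues the work.
                if (!ar.CompletedSynchronously) return;

                // Synchronous completion: finish here and loop, no recursion.
                if (!Complete(ar)) return;
            }
        }
        catch (Exception ex) when (ex is SocketException || ex is ObjectDisposedException)
        {
            Done.Set();
        }
    }

    private void OnReceive(IAsyncResult ar)
    {
        // Synchronous completions are handled by the loop in Start().
        if (ar.CompletedSynchronously) return;
        try
        {
            if (Complete(ar)) Start();
        }
        catch (Exception ex) when (ex is SocketException || ex is ObjectDisposedException)
        {
            Done.Set();
        }
    }

    // Returns false when the peer has closed the connection.
    private bool Complete(IAsyncResult ar)
    {
        int n = _socket.EndReceive(ar);
        if (n == 0) { Done.Set(); return false; }
        Interlocked.Add(ref _total, n); // real code would consume _buffer[0..n) here
        return true;
    }
}

static class Program
{
    static void Main()
    {
        // Loopback peer that stands in for the remote device.
        var listener = new TcpListener(IPAddress.Loopback, 0);
        listener.Start();
        int port = ((IPEndPoint)listener.LocalEndpoint).Port;

        var sender = new Thread(() =>
        {
            using Socket peer = listener.AcceptSocket();
            byte[] chunk = new byte[64 * 1024];
            for (int i = 0; i < 128; i++) peer.Send(chunk);
        });
        sender.Start();

        using var client = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp);
        client.Connect(IPAddress.Loopback, port);

        var receiver = new ApmReceiver(client);
        receiver.Start();
        receiver.Done.Wait();
        sender.Join();
        listener.Stop();

        Console.WriteLine($"Received {receiver.Total} bytes");
    }
}
```

The key design choice is that only one code path calls EndReceive() for a given operation: the loop when `CompletedSynchronously` is true, the callback otherwise.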

karelz commented 1 month ago

@Santoshramani

> I need solution as soon as possible that's why i've tagged them, because i've to upload my app to playstore as soon as possible due to their policies.

We do not provide "online" / ASAP support on GitHub. If you have (paid) official Microsoft support, please reach out to them. We can try to do our best, but you should not expect magic and 24/7 responses. I would highly recommend looking into workarounds, like keeping the old app version (from before your upgrade to .NET 8) in the store.

Santoshramani commented 1 month ago

@simonrozsival Sorry, I can't provide a repro project. I can share my Android application build (APK) with you, but checking the issue requires an NVR connection, and I currently don't have an NVR reachable over the internet, so sharing the APK won't help you either.

Santoshramani commented 1 month ago

@wfurt My socket is non-SSL, and I'm using the BeginReceive() and EndReceive() methods to start and end receiving data from the socket. This worked fine with Xamarin.

Santoshramani commented 1 month ago

@karelz I will definitely contact the Microsoft support team if I can't find a solution here within my buffer time.

wfurt commented 1 month ago

So this is definitely not related to https://github.com/dotnet/runtime/pull/104726. The original description refers to a pcap, but I don't see anything attached. Getting captures of both the working and the non-working case could be useful.

Santoshramani commented 1 month ago

Currently, when the issue occurs, this line appears in the app logs: [libc] Fatal signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x777746401f in tid 30722 (.NET TP Worker)

The socket pcaps are also clean now; there is nothing like the 'socket reset' I mentioned earlier.

Can anyone please explain what exactly the problem is?

dotnet-policy-service[bot] commented 1 month ago

Tagging subscribers to 'arch-android': @vitek-karas, @simonrozsival, @steveisok, @akoeplinger See info in area-owners.md if you want to be subscribed.

wfurt commented 1 month ago

That may be a sign of corrupted memory. It is going to be very difficult to investigate IMHO without a reasonable repro. You can try to craft a simple "bulk data reading socket using dotnet 8 SDK for android project" app that demonstrates this. I would expect that the actual data does not matter, e.g. you could stub out the camera feed.
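
Stubbing out the camera feed could look roughly like the sketch below: a small console program that accepts one TCP connection and pushes dummy bytes at it, so the Android app can be pointed at a development machine instead of an NVR. Everything here is an assumption made for illustration (the `StubCameraFeed` name, port 5000, the 256 MB payload, the random content); the real NVR protocol is not modeled.

```csharp
using System;
using System.Net;
using System.Net.Sockets;

// Hypothetical stand-in for the NVR: sends bulk dummy data to one client.
class StubCameraFeed
{
    static void Main(string[] args)
    {
        int port = args.Length > 0 ? int.Parse(args[0]) : 5000; // assumed default port
        long totalBytes = 256L * 1024 * 1024;                   // assumed payload size

        var listener = new TcpListener(IPAddress.Any, port);
        listener.Start();
        Console.WriteLine($"Stub feed listening on port {port}");

        // Block until the app under test connects.
        using Socket peer = listener.AcceptSocket();
        Console.WriteLine($"Client connected: {peer.RemoteEndPoint}");

        // Dummy payload; the content is arbitrary on the assumption that it does not matter.
        byte[] chunk = new byte[64 * 1024];
        new Random(42).NextBytes(chunk);

        long sent = 0;
        try
        {
            while (sent < totalBytes)
            {
                sent += peer.Send(chunk);
            }
        }
        catch (SocketException ex)
        {
            // A reset from the receiving side would surface here.
            Console.WriteLine($"Send failed after {sent} bytes: {ex.SocketErrorCode}");
        }

        Console.WriteLine($"Sent {sent} bytes");
        listener.Stop();
    }
}
```

Running this on a machine reachable from the device, and connecting the app's existing receive code to it, would show whether the crash depends on the NVR at all.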