Closed xaozai closed 5 years ago
Curious, I thought this was fixed. This is a problem in Water rather than Yggdrasil—I'll try and take another look later.
Unfortunately, the error is still there. And in the log yggdrasil-fixwindows-0.3.5-0192-windows-amd64.exe displays: Build version: 0.3.5-0191
panic: More data is available.
goroutine 15 [running]:
github.com/yggdrasil-network/yggdrasil-go/src/tuntap.(*TunAdapter).reader(0xc000
0adc50, 0x1, 0xc0001b2030)
/home/circleci/project/src/tuntap/iface.go:122 +0xe03
created by github.com/yggdrasil-network/yggdrasil-go/src/tuntap.(*TunAdapter).St
art
/home/circleci/project/src/tuntap/tun.go:160 +0x5ee
Is it possible that we call Read with a buffer that's too small, relative to the tap MTU? If wonder if the issue would go away if we, for example, double the size of the read buffer.
I can reproduce the problem with tap-windows-9.9.2_3.exe
.
And can not reproduce with tap-windows-9.21.2.exe
.
Digged further into this problem.
And it looks like ERROR_MORE_DATA
processing is still incorrect.
From what I can see, ReadFile
retries may be requested infinite times.
My tests shows that while using of 9.9.2_3
driver, packets size equals to 16398 bytes.
Buffer size of 65549 can hold 3 of them.
But if I resize buffer to 65592, it still not helps.
4 packets fits into it, but 5-th fails.
So if your buffer is not infinite, another approach should be used.
By the way, getOverlappedResult
also can return ERROR_MORE_DATA
.
Most likely, resetEvent
can't produce it and checking of its return code for ERROR_MORE_DATA
is not needed.
Reverting to water master fixes this particular problem. But as I remember, reverting will mean crash in different conditions.
This is how crash looks like from packet level:
I will upload this dump if no one will be able to make minimal test program.
(Note lots of retransmissions and other glitches. It's a different topic, but they can be reproduced with https://ygg.thingylabs.io/ and iPerf3)
Reproduction for this bugreport turned out to be very simple. Just send bunch of UDP packets from problematic Windows Yggdrasil instance:
using System.Net;
using System.Net.Sockets;
namespace ConsoleApplication81
{
class Program
{
static void Main(string[] args)
{
Socket sock = new Socket(AddressFamily.InterNetworkV6,
SocketType.Dgram, ProtocolType.Udp);
IPAddress serverAddr = IPAddress.Parse(
"201:6779:c497:82f:68fd:188d:4338:1234");
IPEndPoint endPoint = new IPEndPoint(serverAddr, 10101);
var data = new byte[16336];
for (int i = 0; i < data.Length; i++)
data[i] = (byte)(i ^ (i >> 8));
for (int i = 0; i < 16; i++)
sock.SendTo(data, endPoint);
}
}
}
Test binary: ConsoleApplication81.zip
I'm afraid I really don't know Windows or the Win32 API well enough to understand what the correct API behaviour here is and I haven't been able to find any Microsoft documentation that describes what to do.
Is ERROR_MORE_DATA
a condition that can be returned infinite times across multiple frames? I would have expected that it should only happen as many times as needed to read a single frame. In the case of Yggdrasil, a single frame shouldn't ever exceed 65549 bytes so I don't know why we would ever want/need an infinite buffer.
I don't know it too. But that information can be obtained by debugging and constructing of test programs.
Look at last screenshot.
Firefox sent five 16398-byte packets in a row.
Three of them was stuffed in your 65549-bytes buffer.
On fourth ReadFile
or getOverlappedResult
call (does not matter) error popped out.
By the way, error is not always "more data", "device not functioning" also shows.
But it is the same condition: 16398 bytes can not be fit into 16355 free buffer space.
It may be 9.9.2_3
driver problem.
But 9.21.2
have even more problems!
It fails to transmit large packets.
You can try ping -l 2000
for example.
I have reverted syscalls_windows.go
and can't see any crashes now.
Sadly, I can't reproduce #456 condition anymore.
Most likely, because DC++ hub was permanently shut down and I don't have continuous connection anymore.
But reason may be that #456 bug was fixed not from water side, but from Yggdrasil side.
I'll try to look at the buffer sizes later, but it might be that we are better moving to Wintun anyway in the long run. The OpenVPN TAP driver is full of bugs and I am never sure how many of them we are affected by.
I don't know if anyone else is already looking at that for Water - we haven't had the time to investigate it so far. There's only two developers on the project and neither of us use Windows so we pretty much rely on other people to feed back to us.
I'll try to look at the buffer sizes later
Changing of buffer sizes can only hide the problem. I suggest you to debug iperf3 problem, it may help to understand what is going on. It may be that driver is not related to problem. As I see packet drops and reordering. How Yggdrasil TCP connection can have drops at all?
There are a couple of places where Yggdrasil can drop packets if absolutely necessary—there is actually work on-going in develop
now to reduce that to only the switch since that is where congestion is handled so it may be worth testing those latest develop
builds too.
Yggdrasil never reorders packets within a flow but the switch does try to balance traffic across flows through the switch queues when congestion happens, but this doesn't happen on a non-congested link.
Most likely, this problem was fixed with 0.3.6-0043
version.
Upload to http://[301:fd1:86e3:521c::13]/videos/upload
works fine for me with tap-windows-9.9.2_3.exe
driver.
yggdrasil-develop-0.3.6-0043-windows-amd64.exe yggdrasil-develop-0.3.6-0043-windows-i386.exe
Glad to hear it. If you're happy after a few hours of use then please feel free to close the issue.
The error is reproduced when trying to upload a video file to http://[301:fd1:86e3:521c::13]/