lidgren / lidgren-network-gen3

Lidgren Network Library
https://groups.google.com/forum/#!forum/lidgren-network-gen3
MIT License
1.19k stars 331 forks source link

Less GC when reading floats and doubles thanks to a pool of byte[] + Removed a unnecessary array creation and destruction #70

Closed forestrf closed 6 years ago

forestrf commented 8 years ago

I was having a little issue with some kilobytes of gc because I was reading floats that where displaced (not byte aligned) in the buffer.

To solve it, I wrote a small Pool that only is used by the netbuffer (SimpleArrayPoolThreadSafe), created an object of it on a static variable and modified the single and double read methods. Now there is not garbage from them. I did not used the pool in the string read method because the pool is too simple and may generate a lot of arrays of different size for nothing. See @Inverness comment

The method to read strings still generates garbage Readstring will generate garbage only if it doesn't fit on the saved buffer

The removed unnecessary array creation+destruction is due to the array being only readed inside an internal method, so no copies of it are needed

fversnel commented 8 years ago

Forgive me for this uninformed question but what is the reason we need a pooled buffer at all to read from the BitConverter? Can't we just do:

BitConverter.ReadDouble(buffer, m_readPosition)

As long as the buffer's position is in whole bytes then this will just work I think. For the cases that you're working with bits you can use the existing code.

fversnel commented 8 years ago

Ah, I see from the source that this is already the case and you're trying to solve the problem in which the buffer is not byte aligned.

forestrf commented 8 years ago

I edited the first post to say that it is not byte aligned. I wrote displaced because I didn't know what word was to say that

fversnel commented 8 years ago

Perhaps you could use a ConcurrentDictionary instead of a lock around the regular dictionary. It's just a suggestion, no requirement or anything. I'm wondering if you've thought about it.

Inverness commented 8 years ago

@forestrf Why are all these asserts being replaced with if statements and exception throws?

The assert method has the Condition("DEBUG") attribute on it. This means the expression invoking that method and any expressions for the arguments are only evaluated on debug builds. On release builds neither the conditions nor any string concatenations in the message are included in the compiled code.

The way to reduce garbage collection for debug builds in cases like that is to offer an overload of Assert() that does string formatting so that the actual format only happens if the assertion does not pass.

As for the initial reason for this merge request. Rather than using a pool with locking (which impacts single-threaded performance) I suggest you simply use a static variable holding a small eight byte array that is acquired using Interlocked.Exchange(), used, and then release by just setting the field directly. Since this operation can't be nested there isn't any reason to use a stack.

This strategy means that in the majority of cases you'll have your buffer array available without locking especially if these operations only happen on a single thread. Worst case scenario is that you end up having to allocate an extra array if the static field is null, which will only happen on first use or if two threads are both at the same part of reading an unaligned number which is quite unlikely.

Edit: Example of what I mean:

private static byte[] s_buffer;

public float ReadSingle()
{
    NetException.Assert(m_bitLength - m_readPosition >= 32, c_readOverflowError);

    if ((m_readPosition & 7) == 0) // read directly
    {
        float retval = BitConverter.ToSingle(m_data, m_readPosition >> 3);
        m_readPosition += 32;
        return retval;
    }

    byte[] buffer = Interlocked.Exchange(ref s_buffer, null) ?? new byte[8];
    ReadBytes(buffer, 0, 4);
    float result = BitConverter.ToSingle(buffer, 0);
    s_buffer = buffer;
    return result;
}

Notice that the allocated array is 8 bytes. The buffer should be large enough for any of the methods that would want to use the buffer. Optionally you can add additional fields for buffers of other sizes, but that seems unnecessary as this is already a rare case.

forestrf commented 8 years ago

My bad, I only wanted to add two commits, not all of them. I changed this to use the implementation you gave as it is better, thanks. Also, now is only one commit

Inverness commented 8 years ago

@forestrf You could use the same buffer for strings too if you needed. Just increase the buffer size to a size that would accommodate most use cases. The string method can just check if byteLen exceeds that and either use the buffer or allocate a new buffer.

Finally, I forgot to do this in my previous example but I recommend making the size for the buffer allocation a constant.

Updated the end of the ReadString() method:

if (byteLen <= c_bufferSize)
{
    byte[] buffer = Interlocked.Exchange(ref s_buffer, null) ?? new byte[c_bufferSize];
    ReadBytes(buffer, 0, byteLen);
    string retval = Encoding.UTF8.GetString(buffer, 0, byteLen);
    s_buffer = buffer;
    return retval;
}
else
{
    byte[] bytes = ReadBytes(byteLen);
    return Encoding.UTF8.GetString(bytes, 0, bytes.Length);
}

Don't forget to replace the array length in the other parts of the code with c_bufferSize if you decide to do it like this.

forestrf commented 8 years ago

Done!

Inverness commented 8 years ago

@forestrf For reference, I learned about lock-free optimizations by reading some .NET Framework and Roslyn code like the ObjectPool class. That one might help you in the future.

HellfireDrew commented 8 years ago

This change doesn't seem to work with AOT compilation. We experience this exception running with Unity3D on PS4:

ExecutionEngineException: Attempting to JIT compile method '(wrapper managed-to-native) System.Threading.Interlocked:Exchange (byte[]&,byte[])' while running with --aot-only. at Lidgren.Network.NetBuffer.ReadString () [0x00000] in NetBuffer.Read.cs:630

It also happens everywhere else that the Interlocked.Exchange is used.

I found some posts on the net about a workaround for this:

var dummyTask = new Task (() => {});
var dummy = System.Threading.Interlocked.Exchange (ref dummyTask, System.Threading.Tasks.Task.Factory.StartNew (() =>{}));

However, the System.Threading.Tasks namespace doesn't exist in Unity3D's mono using .NET 2.0.

forestrf commented 8 years ago

I know nothing about AOT and I don't know how can I test without a console or an iPhone. A simple #IF AOT and some code would fix, but how can I know? :/

HellfireDrew commented 8 years ago

Mostly just commenting to bring attention to it, because I know you have a Unity branch, and Unity supports a lot of AOT platforms. We already made a quick fix by just reverting this commit on our build. I'll have to look into a proper AOT alternative to Interlocked.Exchange later.

Inverness commented 8 years ago

I believe this issue is because it is using the generic Exchange() method with the byte[] instead of a non-generic method.

I don't know the details about getting generics to work with AOT so I can't help you there.

An alternate solution though is to change the type of the s_buffer field to object so that the non-generic object overload is used instead. The code that uses it would then have to cast it to byte[] first before using it.

Edit: An example

private static object s_buffer;

...

// Avoid use of generic Exchange<T>(ref T, T) overload to ensure AOT compatibility.
object bufferObject = Interlocked.Exchange(ref s_buffer, null) ?? new byte[8];
byte[] buffer = (byte[]) bufferObject;
ReadBytes(buffer, 0, 4);
float result = BitConverter.ToSingle(buffer, 0);
s_buffer = buffer;
return result;
HellfireDrew commented 8 years ago

After doing some more research, I found a few places where people were saying that is exactly what fixed it for them: using the object overload. I'll give it a try when I get a chance.

forestrf commented 8 years ago

@Inverness I modified it as you pointed to use object instead of byte[] and it works, so I suppose now it should not have problems with AOT compilation

RevoluPowered commented 6 years ago

@forestrf can you possibly send this pull request to my repo?

I've been patching some bugs I have with lidgren and would like the fixes. :)