jpdillingham / Soulseek.NET

A .NET Standard client library for the Soulseek network.
GNU General Public License v3.0
175 stars 24 forks source link

Search for optimization opportunities in the distributed broadcast logic #773

Open jpdillingham opened 1 year ago

jpdillingham commented 1 year ago

I've been watching slskd metrics after upgrading to .NET 7, and one thing that is really jumping out at me is the correlation between garbage collection frequency, CPU utilization, and the number of distributed children:

image

The broadcast logic accounts for 100% of this activity, since it is the only part of the library that scales with the number of distributed children. There isn't much to the code, either:

        /// <summary>
        ///     Asynchronously writes the specified bytes to each of the connected child connections.
        /// </summary>
        /// <param name="bytes">The bytes to write.</param>
        /// <param name="cancellationToken">The token to monitor for cancellation requests.</param>
        /// <returns>The operation context.</returns>
        public async Task BroadcastMessageAsync(byte[] bytes, CancellationToken? cancellationToken = null)
        {
            cancellationToken ??= CancellationToken.None;

            async Task Write(KeyValuePair<string, Lazy<Task<IMessageConnection>>> child, byte[] bytes, CancellationToken? cancellationToken)
            {
                IMessageConnection connection = default;

                try
                {
                    connection = await child.Value.Value.ConfigureAwait(false);

                    if (connection.State == ConnectionState.Connected)
                    {
                        await connection.WriteAsync(bytes, cancellationToken).ConfigureAwait(false);
                    }
                }
                catch (Exception ex)
                {
                    connection?.Disconnect($"Broadcast failure: {ex.Message}");
                }
            }

            var sw = new Stopwatch();
            sw.Start();

            var tasks = new List<Task>();

            foreach (var child in ChildConnectionDictionary)
            {
                tasks.Add(Write(child, bytes, cancellationToken));
            }

            await Task.WhenAll(tasks).ConfigureAwait(false);

            sw.Stop();

            if (!AverageBroadcastLatency.HasValue)
            {
                AverageBroadcastLatency = sw.ElapsedMilliseconds;
            }
            else
            {
                // EMA
                AverageBroadcastLatency = ((sw.ElapsedMilliseconds - AverageBroadcastLatency) * LatencyAlpha) + AverageBroadcastLatency;
            }
        }

Take a look at this code, as well as the underlying WriteAsync method, to see if there are any opportunities to reduce CPU or memory overhead (without losing functionality)

jpdillingham commented 1 year ago

I should note that performance is actually very good considering the amount of work being done here, and it's pretty rare for me to get this many children, so this is by no means a pressing issue for me or anyone else, but it's still worth a look.