quantumtunneling / T-Digest.NET

.NET Implementation of the T-Digest quantile estimation algorithm. Useful for calculating Quantiles or Percentiles from streaming data, or data-sets that are too large to store in memory and sort, which is required to calculate the true quantile.
25 stars 8 forks source link

Performance issue #6

Open bobbymcr opened 4 years ago

bobbymcr commented 4 years ago

I was interested in percentile estimation and learned about t-digest while researching the topic. I was happy to find this project since I work primarily in .NET platforms. Thanks for building this!

According to the original Java t-digest project, "the particularly interesting characteristics of the t-digest are that it [...] is very fast (~ 140 ns per add)". I wanted to see if the .NET version lived up to this measurement, so I built a simple set of benchmarks and got the following results:

Method N Asc Mean Error StdDev Gen 0 Gen 1 Gen 2 Allocated
BuildSorted 1000 False 2.076 ms 0.0346 ms 0.0307 ms 136.7188 66.4063 19.5313 847.66 KB
BuildSorted 1000 True 17.267 ms 0.0808 ms 0.0675 ms 125.0000 - - 847.71 KB
BuildSorted 10000 False 134.297 ms 0.9801 ms 0.8185 ms 3500.0000 1750.0000 500.0000 23075.98 KB
BuildSorted 10000 True 391.343 ms 2.2634 ms 1.8901 ms 3000.0000 1000.0000 - 23062.74 KB
BuildSorted 100000 False 1,504.426 ms 12.3338 ms 10.2992 ms 40000.0000 20000.0000 6000.0000 248582.73 KB
BuildSorted 100000 True 4,663.984 ms 36.5801 ms 34.2170 ms 40000.0000 20000.0000 5000.0000 250178.06 KB

These numbers show anywhere from 2000 to over 46000 ns per add which is not very competitive with the Java version. :-( Running a profiling session, I see the hottest function is ComputeCentroidQuantile which does a linear search over the centroid list to calculate the cumulative sum up to the given mean. This function is called up to two times for every add operation.

I haven't studied the Java code extensively, but it seems that version avoids this expense by precomputing aggregated counts within an AVL tree structure, implying more of an O(log(N)) amortized cost as opposed to O(N). Would it be worth doing a more direct port of the Java code to see if the performance improves?

quantumtunneling commented 4 years ago

Hey thanks for the analysis. So I am aware that the performance of this particular implementation isn't very good, and this is mostly due to the data structure that holds the centroids. What we need is a custom AVL tree implementation with a particular function that I cant remember off the top of my head (dont have access to my computer at the moment). I was working on an AVL tree implementation for this purpose, but I never finished it unfortunately. I might just take another look though as there has been quite a bit of interest in this project recently. Thanks,

Graham


From: Brian Rogers notifications@github.com Sent: Sunday, December 22, 2019 4:08:55 PM To: quantumtunneling/T-Digest.NET T-Digest.NET@noreply.github.com Cc: Subscribed subscribed@noreply.github.com Subject: [quantumtunneling/T-Digest.NET] Performance issue (#6)

I was interested in percentile estimation and learned about t-digest while researching the topic. I was happy to find this project since I work primarily in .NET platforms. Thanks for building this!

According to the original Java t-digest projecthttps://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Ftdunning%2Ft-digest&data=02%7C01%7Cghenry%40adobe.com%7Ce6c52ef3905d4b76e1bc08d7873c4dcc%7Cfa7b1b5a7b34438794aed2c178decee1%7C0%7C0%7C637126565392554635&sdata=3NgZj%2Bt9UT%2FTm%2FjCigDFlOKmdGOKsOvivs%2Fj%2F0nCrCk%3D&reserved=0, "the particularly interesting characteristics of the t-digest are that it [...] is very fast (~ 140 ns per add)". I wanted to see if the .NET version lived up to this measurement, so I built a simple set of benchmarkshttps://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fbobbymcr%2FT-Digest.NET%2Fcommit%2F581e3e9e6a8253491873da6997c4d16134abdafa&data=02%7C01%7Cghenry%40adobe.com%7Ce6c52ef3905d4b76e1bc08d7873c4dcc%7Cfa7b1b5a7b34438794aed2c178decee1%7C0%7C0%7C637126565392564631&sdata=zpRgY%2FzlIzYKSCa80L8pZ9uuWGSHfG7sjtM1aL2JbsE%3D&reserved=0 and got the following results:

Method N Asc Mean Error StdDev Gen 0 Gen 1 Gen 2 Allocated BuildSorted 1000 False 2.076 ms 0.0346 ms 0.0307 ms 136.7188 66.4063 19.5313 847.66 KB BuildSorted 1000 True 17.267 ms 0.0808 ms 0.0675 ms 125.0000 - - 847.71 KB BuildSorted 10000 False 134.297 ms 0.9801 ms 0.8185 ms 3500.0000 1750.0000 500.0000 23075.98 KB BuildSorted 10000 True 391.343 ms 2.2634 ms 1.8901 ms 3000.0000 1000.0000 - 23062.74 KB BuildSorted 100000 False 1,504.426 ms 12.3338 ms 10.2992 ms 40000.0000 20000.0000 6000.0000 248582.73 KB BuildSorted 100000 True 4,663.984 ms 36.5801 ms 34.2170 ms 40000.0000 20000.0000 5000.0000 250178.06 KB

These numbers show anywhere from 2000 to over 46000 ns per add which is not very competitive with the Java version. :-( Running a profiling session, I see the hottest function is ComputeCentroidQuantile which does a linear search over the centroid list to calculate the cumulative sum up to the given mean. This function is called up to two times for every add operation.

I haven't studied the Java code extensively, but it seems that version avoids this expense by precomputing aggregated counts within an AVL tree structure, implying more of an O(log(N)) amortized cost as opposed to O(N). Would it be worth doing a more direct port of the Java code to see if the performance improves?

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHubhttps://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fquantumtunneling%2FT-Digest.NET%2Fissues%2F6%3Femail_source%3Dnotifications%26email_token%3DAARU2P6OY5NHOWKWW2GMYHLQZ76RPA5CNFSM4J6OFBXKYY3PNVWWK3TUL52HS4DFUVEXG43VMWVGG33NNVSW45C7NFSM4ICG5ICQ&data=02%7C01%7Cghenry%40adobe.com%7Ce6c52ef3905d4b76e1bc08d7873c4dcc%7Cfa7b1b5a7b34438794aed2c178decee1%7C0%7C0%7C637126565392564631&sdata=LYFZez4pB7vOhS34WzmWaEa%2BJ5sAqsc%2FMB7GN0Jufhs%3D&reserved=0, or unsubscribehttps://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fnotifications%2Funsubscribe-auth%2FAARU2PY4PGCJOE6HU5TLBQLQZ76RPANCNFSM4J6OFBXA&data=02%7C01%7Cghenry%40adobe.com%7Ce6c52ef3905d4b76e1bc08d7873c4dcc%7Cfa7b1b5a7b34438794aed2c178decee1%7C0%7C0%7C637126565392574626&sdata=H0PjLvgBPO4l78lHD7MrNAf7iymWNXvf6XuCaCV1No4%3D&reserved=0.