losvedir / transit-lang-cmp

Programming language comparison by reimplementing the same transit data app
MIT License

C# improvements #6

Open tedd opened 2 years ago

tedd commented 2 years ago

Huge PR, so here is some info:

I had a look at the C# version and found some obvious issues, so I forked the repo, set up a benchmark project, and added progressive improvements to it, since it's fun to benchmark things. See TrannetVersions.

Benchmark results: https://github.com/tedd/transit-lang-cmp/tree/main/Trannet.Benchmark

I also added the changes and a simplified Web API implementation here: https://github.com/tedd/transit-lang-cmp/tree/main/Trannet

jeremylcarter commented 2 years ago

Great job!

MarkPflug commented 2 years ago

Good choice of CSV library. I approve.

MarkPflug commented 2 years ago

Loading the stop_times.txt file also benefits from enabling string pooling. The data in that file is quite repetitive, and with pooling each unique string is allocated only once instead of once per occurrence. A quick test on my machine shows a ~40% reduction in the time to load the file. The StringPool type comes from the Sylvan.Common NuGet package.

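using Sylvan.Data.Csv;

// Hand out one shared string instance per unique field value.
var pool = new Sylvan.StringPool();
var opts = new CsvDataReaderOptions {
    StringFactory = pool.GetString
};
// The file path here is illustrative; point it at the real stop_times.txt.
using var reader = CsvDataReader.Create("stop_times.txt", opts);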
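For readers unfamiliar with the idea, here is a rough sketch of what pooling does. It is not Sylvan's implementation: `GetPooled`, the dictionary, and the sample rows are all made up for illustration, and this naive version still allocates a temporary string on every lookup, which a real pool would avoid by hashing the characters in the buffer directly.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a string pool: one shared instance per distinct value.
var pool = new Dictionary<string, string>(StringComparer.Ordinal);

// Same shape as a (char[] buffer, int offset, int length) string factory.
string GetPooled(char[] buffer, int offset, int length)
{
    // Simplification: allocates a temporary string just to perform the lookup.
    var candidate = new string(buffer, offset, length);
    if (pool.TryGetValue(candidate, out var existing))
        return existing;          // duplicate value: return the shared instance
    pool.Add(candidate, candidate);
    return candidate;             // first occurrence becomes the shared instance
}

// The same time value arriving from two different (made-up) rows.
char[] row1 = "08:15:00,stop-70061".ToCharArray();
char[] row2 = "08:15:00,stop-70063".ToCharArray();
string first = GetPooled(row1, 0, 8);
string second = GetPooled(row2, 0, 8);

Console.WriteLine(object.ReferenceEquals(first, second)); // both rows share one instance
Console.WriteLine(pool.Count);                            // number of distinct values stored
```

With highly repetitive columns, the long-lived object graph ends up holding far fewer strings, which is where any savings would come from.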

UPDATE: after some more experimentation I found this to be wrong. In ASP.NET, under the server GC, the difference, if any, is negligible. My original measurements were in a console app (non-server GC), where the performance difference was significant.
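Since the result depends on the GC mode, it may be worth confirming which mode a given test actually runs under before comparing numbers. A small sketch using the built-in `System.Runtime.GCSettings` type (this only reports the mode and does not reproduce the measurements above):

```csharp
using System;
using System.Runtime;

// Report which garbage collector configuration this process is running with.
bool isServerGc = GCSettings.IsServerGC;
Console.WriteLine($"Server GC: {isServerGc}");
Console.WriteLine($"Latency mode: {GCSettings.LatencyMode}");
Console.WriteLine($"Logical processors: {Environment.ProcessorCount}");
```

A console app can opt in to server GC by setting the `ServerGarbageCollection` MSBuild property to `true` in its project file, or by setting the `DOTNET_gcServer=1` environment variable, which should make console benchmarks more comparable to the ASP.NET case.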