Just a short remark on how I arrived at the performance improvements:
Some of them are obvious, but I never pull them out of thin air. If I'm uncertain about a change, I first test the generated code in sharplab.
One such test was a comparison between the initialization code that gets generated for several comparable methods that just return a list of numbers; it showed that for the readonly case the readonly span is vastly superior to all other variants.
Thanks!
To be investigated: ArrayPool seems to allocate too much memory; memory usage increases up to a point while re-parsing a single file.
OK, this is taking way longer than expected. I'm also somewhat busy with other things right now, so I'll just open this PR for scrutiny.
This PR also touches `GW2EIParser.tst\ParsingErrors.cs` and `GW2EIParser.tst\ParsesWithoutError.cs`. I have not fully tested everything after the final rebase, so more commits might follow to fix fallout from that. The PR touches basically every file, but don't be afraid: just tell git to ignore whitespace, which already cuts out 85% of the changes.
Regarding the numbers:
I should also mention that some changes, like the explicit naming of variables, can easily be reverted in case you prefer it that way, but for now I left everything the way I was working with it.
I will also paste some notes I wrote down during the whole process. While you didn't ask for a code review, I necessarily read a lot of the codebase, so the notes also contain some recommendations at the end.
Some notes on general changes and performance optimizations:
Liberal use of the `var` keyword: This allows easier iteration, because of the following scenario: if `calculate` needs to be changed to return a `float`, every usage also needs to be changed, because even just storing the temporary result now fails (or, in this case even worse, truncates the result). If the code is written with `var` instead, then `calculate`'s return type can be changed freely without causing type errors on every usage. This of course requires that `do_something_else` also works with the new type, but for things like numerics and enumerable types this is usually the case. The only conceivable downside comes up when the new return type does convert into the next input type, but does so unexpectedly. In practice this happens so rarely that it is vastly outweighed by the flexibility gained for refactoring. A sketch of the scenario follows below.
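A minimal sketch of that scenario, assuming hypothetical `calculate` / `do_something_else` methods (the names come from the example above, not from the actual codebase):

```csharp
static class VarExample
{
    // Hypothetical methods, standing in for the real code.
    static float calculate() => 3.7f;              // used to be: static int calculate() => 3;
    static void do_something_else(float value) { } // consumer that also handles the new type

    static void Caller()
    {
        // int result = calculate();   // fails (or truncates, if cast) once calculate() returns float
        var result = calculate();      // adapts to the new return type automatically
        do_something_else(result);     // still compiles as long as the consumer accepts the new type
    }
}
```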
Replacing `List<T> myList = new List<T>();` (or similar) with just `List<T> myList = new();`, or preferably `var myList = new List<T>();` in local contexts: This just makes the code more readable. If the type is already stated on the same line, there is no need to repeat it.
Again, this also makes refactoring easier, because only one change is needed instead of two.
File level namespaces: A namespace can now apply to a whole file without needing to wrap the classes etc. in braces and add an indentation level. An example follows below.
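For illustration, the two forms side by side; the namespace and class names here are made up:

```csharp
// Block-scoped namespace: one extra level of braces and indentation.
namespace GW2EIParser.Example
{
    public class Foo { }
}
```

```csharp
// File-scoped namespace (C# 10, one per file): same effect, no extra indentation.
namespace GW2EIParser.Example;

public class Foo { }
```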
Replacing `new List<T>()`, `new T[]{ }` and similar with just `[]` (and to a lesser extent `[..other]`): This is a new C# 12 feature that allows collection initialization regardless of the collection type. It has two main upsides, but unfortunately also one downside:
Returning `[]` generates access to `Array.Empty<T>()`, which is very efficient in this scenario, as the empty array only gets allocated once for the whole program. Using `=> new List<T>()` instead allocates every time the function is called.
Explicitly reserving and copying is still more efficient for anything that does not have a plain list as the input type. There is an optimization specifically for `List<T> newList = [..oldList];` when `oldList` is the plain `List<T>` type, but it does not yet work even for `IReadOnlyList`. For all other types this spread initialization will enumerate the collection without reserving space in the target first, so if you know the `oldList` length it is always better to explicitly reserve first before adding the elements (see the sketch below).
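A small sketch of both points, under the assumption that the input arrives as an `IReadOnlyList<int>` (the method names are illustrative):

```csharp
using System.Collections.Generic;

static class CollectionExpressionExample
{
    // Returning an empty collection via [] lowers to Array.Empty<int>(), which is shared
    // program-wide, while => new List<int>() would allocate on every call.
    static int[] NoValues() => [];

    // Spread from a source that is not a plain List<T>: the target cannot be pre-sized,
    // so its backing array may be reallocated several times while it grows.
    static List<int> CopyViaSpread(IReadOnlyList<int> oldList) => [.. oldList];

    // Reserving the known length first avoids those reallocations.
    static List<int> CopyWithReserve(IReadOnlyList<int> oldList)
    {
        var result = new List<int>(oldList.Count);
        for (int i = 0; i < oldList.Count; i++)
        {
            result.Add(oldList[i]);
        }
        return result;
    }
}
```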
Replaced `Newtonsoft.Json` with `System.Text.Json`: .NET now has its own JSON library, and it's really good. It causes almost no allocations and is therefore very fast and memory efficient. It is also a bit smarter about how numeric output is produced, and I would have expected that to reduce the output size as well, but it somehow increased the final output by ~0.5% (after compression) for some reason.
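A minimal `System.Text.Json` usage sketch; the `Sample` type and the serializer options are illustrative, not the ones used in this PR:

```csharp
using System.Text.Json;
using System.Text.Json.Serialization;

// Illustrative model, not a type from this PR.
record Sample(string Name, double Value, string? Comment);

static class JsonExample
{
    static readonly JsonSerializerOptions Options = new()
    {
        DefaultIgnoreCondition = JsonIgnoreCondition.WhenWritingNull,
    };

    static string ToJson(Sample sample) => JsonSerializer.Serialize(sample, Options);
}
```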
Changed several interfaces from `IReadOnlyList` to `IEnumerable`: This is relevant for collections that get filtered and transformed without ever being used as a list themselves. In almost all cases it is better not to create an intermediate list, since it has to be allocated and reallocated a bunch of times just to be consumed by another LINQ query again. For all cases where the result is used multiple times, however, the enumerable should be collected first.
This is also quite dependent on the code that generates such a result. If that code returns a reference to an existing list, we of course just return that list. But if it generates a sequence, we return the generator and don't create superfluous copies where they are not necessary. A sketch follows below.
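A small sketch of the difference, with a made-up `Actor` type:

```csharp
using System.Collections.Generic;
using System.Linq;

record Actor(int Id, bool IsActive); // illustrative type, not from the PR

static class EnumerableExample
{
    // Builds an intermediate list even when the caller only runs one more LINQ query over it.
    static IReadOnlyList<int> ActiveIdsAsList(IEnumerable<Actor> actors)
        => actors.Where(a => a.IsActive).Select(a => a.Id).ToList();

    // Returns the generator instead: no intermediate allocation, the filtering runs lazily.
    // Callers that need the result more than once should materialize it themselves (ToList()).
    static IEnumerable<int> ActiveIds(IEnumerable<Actor> actors)
        => actors.Where(a => a.IsActive).Select(a => a.Id);
}
```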
Changed several abstract output types to concrete ones, specifically changed `IReadOnlyList<T>` to just `List<T>`: This is relevant because in many cases functions were returning a readonly list that was created from a list, just to then be consumed by another function that would turn the readonly list back into a normal list. This "turning back" requires a complete copy of the whole list and plainly wastes CPU cycles. I only did this in cases where the list is constructed as part of the first function call; I chose different approaches in cases where it is part of an internally stored cache and therefore "shared" between all consumers. A sketch of the round trip follows below.
Used structs instead of classes: If lots of instances of a specific type (e.g. Segment / Point) get created and they are mostly readonly, you want that type to be a struct. Structs don't need to be individually allocated but can live on the stack or inside other data structures, so making them structs saves a lot of garbage collection and memory.
Unfortunately there are some cases where a struct cannot be used, or actually degrades performance. The biggest example is modifying list elements, as `list[3].value = 3` does not work if the generic type argument of the list is a struct type; the whole struct needs to be replaced. For such use cases there unfortunately is no good solution in .NET Standard, short of implementing a list type with ref indexing (which I'm not going to do). See the sketch below.
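A sketch of the caveat, using a made-up `Segment` struct:

```csharp
using System.Collections.Generic;

// Illustrative struct; the real Segment/Point types have more members.
struct Segment
{
    public double Start;
    public double End;
}

static class StructListExample
{
    static void MoveStart(List<Segment> segments)
    {
        // segments[3].Start = 0;  // CS1612: the indexer returns a copy of the struct,
        //                         // so it cannot be modified in place.

        // The whole element has to be read, modified and written back.
        Segment s = segments[3];
        s.Start = 0;
        segments[3] = s;
    }
}
```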
Implemented own sort algorithm (kinda): I added a port of fluxsort to the repo, a stable sorting algorithm with restrictable and predictable memory usage. The implementation uses ArrayPool where possible, but I went for speed over memory. The pooling is not ideal, but it is the best solution I could come up with for the port.
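The pooling pattern in question, roughly (a sketch of the general rent/return shape only, not the actual port):

```csharp
using System;
using System.Buffers;

static class PooledScratchExample
{
    static void SortWithScratch(int[] data)
    {
        // Rent may return a larger array than requested; it must be returned afterwards.
        int[] scratch = ArrayPool<int>.Shared.Rent(data.Length);
        try
        {
            // ...the merge steps of the stable sort would use `scratch` as auxiliary storage...
            Array.Sort(data); // placeholder; Array.Sort is not stable and not the ported algorithm
        }
        finally
        {
            ArrayPool<int>.Shared.Return(scratch);
        }
    }
}
```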
Initialized lists to known or estimated lengths to prevent reallocations.
Changed properties to fields wherever possible: This is very easy. Properties bad. There is no point in them existing: they add an additional layer of function calls to everything, and refactoring a field into a property later is trivially easy. If you don't really need them, don't use them. The only reason to have them is if you want to use the binding systems of WinForms or WPF, but those cases are limited and should not be the default.
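For reference, the difference in question (illustrative type):

```csharp
// Illustrative type; the real types in the PR differ.
public class SegmentInfo
{
    public double Start;            // plain field: direct access
    public double End { get; set; } // auto-property: compiles to hidden get_End()/set_End()
                                    // accessor methods (usually inlined by the JIT, but still
                                    // an extra layer)
}
```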
Recommendations and future work:
Code style / architecture:
Using `//` instead of empty lines breaks git diffing and merging. I had to do multiple 300+ line manual diffs during the final rebase where only a few lines had changed, just because everything was glued together. I also don't quite understand where this convention even comes from, as empty lines are used for visual and mental separation of code blocks. Gluing the blocks together with empty comments not only introduces more noise, it also doesn't serve the original purpose of spacing out conceptually different segments, since your eyes glaze over the "spaces" because they are not empty.
Rename `Abstract...` types to just the name. So instead of `AbstractDecoration`, just call it `Decoration`. There is no value in this prefix. Same goes for `Generic...`.
Tests:
Performance:
Replace other small structures like Point3d with structs.