Closed: in0finite closed this issue 2 months ago
The problem with parsing with multiple threads is that demos are incremental by nature. You need to have read from the most recent string table snapshot/entity snapshot for subsequent entity packets to make sense. They all build on top of one another.
In Source 2 demos, we do have stringtable and entity snapshots every 60 seconds of game time. However, you'd still need to seek through the file byte by byte to find exactly where these snapshots are, and that takes time in itself. Dividing the demo into parts that you could then decode on other threads would have to happen as a preprocessing step, before you could do any "actual" work. You could then do the bitreading (the CPU-heavy part) per thread.
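To make the two phases concrete, here is a rough Python sketch of the structure only. It assumes the preprocessing scan has already produced a list of snapshot byte offsets; `split_at_snapshots`, `decode_chunk` and `parse_parallel` are hypothetical names, not part of any real library, and `decode_chunk` is a placeholder for the actual bit reading.

```python
from concurrent.futures import ThreadPoolExecutor


def split_at_snapshots(file_size, snapshot_offsets):
    """Turn snapshot byte offsets into (start, end) ranges covering the file.

    The first range always starts at byte 0, because the beginning of the
    file is a valid starting state even without a snapshot.
    """
    inner = sorted({o for o in snapshot_offsets if 0 < o < file_size})
    starts = [0] + inner
    ends = inner + [file_size]
    return [(s, e) for s, e in zip(starts, ends) if s < e]


def decode_chunk(data, byte_range):
    """Hypothetical stand-in for the CPU-heavy bit reading of one chunk."""
    start, end = byte_range
    return len(data[start:end])  # placeholder result: bytes covered


def parse_parallel(data, snapshot_offsets, workers=4):
    """Phase 1: compute chunk ranges. Phase 2: decode each range on a worker."""
    ranges = split_at_snapshots(len(data), snapshot_offsets)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in chunk order, whatever order workers finish in.
        return list(pool.map(lambda r: decode_chunk(data, r), ranges))
```

Note that CPython threads will not speed up CPU-bound decoding because of the GIL; the sketch only shows how the serial scan and the per-chunk work would be separated.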
Another issue is that you wouldn't be able to emit events (game events, entity updates, etc) until you'd finished parsing every chunk. So all of the events would come at once. This is unavoidable as, due to the parallel nature of multi-threaded decoding, you don't know when each chunk will finish. It's inherently racy. You now have another problem - you have to buffer all of these events to emit later - consuming memory.
To summarise - it's definitely possible on a technical level. My gut tells me it wouldn't bring big benefits and wouldn't be worth the dev effort (for me personally). Godspeed to anyone willing to give it a shot though!
Ok, so I didn't know that we don't have the actual position of each snapshot in the file. That will make it slower, but it's still possible.
But then, if we want seeking, we have to do the pre-processing anyway to find the positions of the snapshots.
> Another issue is that you wouldn't be able to emit events
You can emit events immediately; they just need to carry the tick at which they happened (which you can get from the current Thread/Worker, so there's no need to extend the GameEvent class). Users can then add them to their own collection (possibly a concurrent one) and sort it at the end of parsing, if they even need sorted data. Users usually need only a few events, so it won't consume too much memory.
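A minimal Python sketch of that idea, assuming each chunk has already been reduced to a list of `(tick, event_name)` pairs. `collect_events`, `on_event` and the event names are made up for illustration and do not reflect the library's actual API.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def collect_events(chunks, workers=4):
    """Emit events from worker threads as they occur, then sort by tick."""
    events = []
    lock = threading.Lock()

    def on_event(tick, name):
        # User callback: called immediately from whichever worker hit the event.
        with lock:
            events.append((tick, name))

    def parse_chunk(chunk):
        # Stand-in for real decoding: each chunk is a list of (tick, name) pairs.
        for tick, name in chunk:
            on_event(tick, name)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Consuming the iterator waits for all chunks and surfaces exceptions.
        list(pool.map(parse_chunk, chunks))

    # Arrival order is racy, so restore chronological order at the end.
    events.sort(key=lambda e: e[0])
    return events
```

Only the events the user actually subscribes to are stored, so the memory cost is proportional to what they collect rather than to everything the parser sees.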
I might actually try to do it, once the seeking support is there.
The library should be able to parse a single demo using multiple threads at the same time.
This would bring a huge performance improvement, one that scales linearly with the number of CPUs.
The way I imagine it, the demo is split into multiple parts (e.g. based on round-start ticks), and each part is sent to a different thread for processing.
The prerequisite for this is seeking support (#47), so that each thread gets a snapshot of the demo at the tick where it starts processing.
I think it would reduce the time to process one demo file to as low as 0.3 seconds.
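As a rough sanity check on the scaling expectation, Amdahl's law gives the best-case speedup when some part of the work (for example, a scan to locate snapshot positions) stays serial. The parallel fractions used here are illustrative assumptions, not measurements of any real demo.

```python
def amdahl_speedup(parallel_fraction, threads):
    """Upper bound on speedup when only `parallel_fraction` of the work scales."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / threads)


# Fully parallel work scales linearly with thread count...
ideal = amdahl_speedup(1.0, 8)
# ...but an assumed 10% serial portion already caps 8 threads well below 8x.
with_serial_scan = amdahl_speedup(0.9, 8)
```

With these assumed numbers, 8 threads give 8x in the ideal case but only about 4.7x once 10% of the work is serial, so the achievable gain depends heavily on how cheap the preprocessing step can be made.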