Tromador closed this issue 5 years ago
If you're pointing at what I think you're pointing at, there's nothing to be done about it. The listings exporter takes very little time to grab all the listings (2.75 seconds in that snippet above), but then it has to process them all and write the file. Since it doesn't need the DB at that point, it releases the busy lock, but it's still doing work, which means it's eating CPU. In that snippet, Robigo Mines and Stein Terminal were being updated while the exporting was happening, and the exporting finished while Lindstrand Gateway was being updated, which is why those three took longer than average: CPU was being divided between exporting the listings and updating those stations.
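Roughly the shape of it, as a sketch rather than the listener's actual code (`export_busy`, the query, and the file path are all illustrative):

```python
import csv
import threading

export_busy = threading.Lock()  # illustrative stand-in for the busy signal

def export_listings(db, path="listings-live.csv"):
    # db: a sqlite3 connection. Hold the busy lock only while
    # the DB is actually being read...
    with export_busy:
        rows = db.execute("SELECT * FROM StationItem").fetchall()

    # ...then release it. The updater can resume, but this thread is
    # still burning CPU formatting and writing 340k+ rows to disk.
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        for row in rows:
            writer.writerow(row)
```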
CPU wasn't divided. Each thread is running on a separate core of a largely unloaded server.
But sure, ok. I sat and watched what happened. It was slow whilst writing the file; as soon as the file was written, it ran a few updates mega fast (catching up) and then went back to normal.
Is there anything we can do to speed up the file write then? It's not the hardware (fibre channel RAID), I assure you.
Maybe?
Right now it writes each entry directly to the file in a for loop. We could instead write to a string, and then write the string to the file. That means just one very large write, rather than a bunch of single-line writes.
I don't know how much time that would save, but I do know it'd be a potentially very large increase in memory usage, since that string would be stored in memory and would be just as large as the listings file.
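In rough terms, the difference looks like this (a sketch with toy data; `format_row` is a hypothetical stand-in for the real formatting):

```python
rows = [("Robigo Mines", "Gold", 9401, 9800)] * 3  # toy listing data

def format_row(row):  # hypothetical formatter
    return ",".join(map(str, row)) + "\n"

# Current approach: one write call per listing.
with open("listings-live.csv", "w") as f:
    for row in rows:
        f.write(format_row(row))

# Proposed approach: build one big string, then write it once.
# The whole file's worth of text sits in RAM until the write.
output = ""
for row in rows:
    output += format_row(row)
with open("listings-live.csv", "w") as f:
    f.write(output)
```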
I've updated the debug branch to do this; feel free to test it.
As I write, listings-live.csv is 15MB. Even if, in my wildest nightmares, it gets to 50MB, RAM is never going to be an issue (unless something badly breaks).
It is, however, 342,000 (and change) lines long, so that's 342,000 (and change) I/O operations.
On balance, I would guess that doing it in RAM would be quicker; certainly, if you are willing, I'd like to try it.
EDIT: I'll test the debug branch then :)
The change is on the debug branch. I've not tested it yet; the test is still doing a plugin import.
Astonishingly, it's slower to do it all in memory.
When it does write the file, it's instantaneous, but putting it together in memory is slower. I'm genuinely surprised.
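If I had to guess at why (an assumption; I haven't profiled it), repeated `+=` on a string can make CPython re-copy the growing buffer each time around the loop, which would swamp the savings from the single write. Collecting the pieces and joining once avoids those copies:

```python
rows = [("Robigo Mines", "Gold", 9401, 9800)] * 3  # toy data again

def format_row(row):  # hypothetical formatter
    return ",".join(map(str, row)) + "\n"

# Collect the pieces, then join once: one final allocation instead
# of potentially re-copying a growing string on every iteration.
output = "".join(format_row(row) for row in rows)
with open("listings-live.csv", "w") as f:
    f.write(output)
```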
Well, never mind then. If I should come across some super-efficient way of doing it which we hadn't considered, I'll let you know. Thanks for looking at it.
No worries.
Bernd notes in https://github.com/eyeonus/EDDBlink-listener/issues/7:
"If you're running in WAL journal there is no need to stop the updater while exporting. WAL allows multiple readers and one writer. Only the EDDB update can't run at the same time."
So I'm reopening this. I am running the server in WAL. Once I'm happy with the database tunings I'm running on the server, we can push them to the main TD and potentially remove the exporter's busy signal.
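For reference, enabling WAL is just a pragma on the connection; a minimal sketch (the DB path and table are illustrative):

```python
import sqlite3

conn = sqlite3.connect("TradeDangerous.db")  # path illustrative
conn.execute("PRAGMA journal_mode=WAL")      # persists in the DB file

# Under WAL, a reader can hold a long-running query open on one
# connection while a single writer commits on another.
reader = sqlite3.connect("TradeDangerous.db")
for row in reader.execute("SELECT * FROM StationItem"):
    pass  # the exporter's read; the updater's writes needn't block on it
```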
Sounds good to me.
See the below example. I suspect the sleep needs moving (or another one adding somewhere) to stop the loop (or maybe a different loop) from going back into runaway mode when exporting.
It's not causing us to fall behind the queue like before, but it would be nice if we could clear up this last bit of the server slowdown.
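What I have in mind is something like this (purely a guess at the loop's shape, with hypothetical names; not the listener's actual code):

```python
import threading
import time

exporting = threading.Event()  # hypothetical flag, set while exporting
queue = []                     # hypothetical message queue

def update_station(msg):       # stand-in for the real DB update
    pass

def process_messages():
    while True:
        if exporting.is_set():
            # Without a sleep on this path, the loop spins flat out
            # ("runaway mode") for the whole duration of the export.
            time.sleep(1)
            continue
        if queue:
            update_station(queue.pop(0))
        else:
            time.sleep(1)  # idle: don't busy-wait on an empty queue
```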