misson20000 / twili

Homebrew debug monitor for the Nintendo Switch.
GNU General Public License v3.0

Streaming coredumps #67

Open misson20000 opened 5 years ago

misson20000 commented 5 years ago

Coredumps stream out of twili, but twib tries to buffer them, which takes way too much memory for games. We need to stream them through the whole pipeline.

baconwaifu commented 4 years ago

The buffering alone isn't the primary problem; it's the fact that the data gets buffered several times over. I've counted at least 5 points where the entire buffer is copied, with the old copy persisting in memory until it's destroyed on return: the vector receiving the response from the device is copied to the Response object being returned, then copied(!) into the output Buffer, before being copied again into the vector passed in out<>. There is a point where all three copies exist at once, because the Response object isn't disposed of until SendSmartSyncRequestWithoutAssert() returns, it isn't cleared when the Buffer reads it, and the WrappingHelper creates a 3rd copy of the data when it copies it to the output vector. Then CoreDump() creates yet another copy to return. Even assuming that whatever calls that outputs it immediately without creating another copy, that's still 4 copies, at an ideal 16GiB total; likely more, since vector makes no guarantees about compactness.

A quick and easy optimization that kills one of the copies is to keep the Buffer from making a copy of the Response, since they have the same scope. Just have a Buffer constructor that takes a vector* and uses it as-is without a copy. That cuts it down to 3 or so copies, depending on the compiler, which is still too much for a game. Only one copy is strictly needed, and that's an implementation detail of Unpack(). Allowing Unpack() to accept an ostream&, and SendSyncRequestWithoutAssert() to take its buffer as a parameter, would allow for a single buffer with no copying.
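
Rough sketch of what I mean, not the actual twib code. `AdoptingBuffer`, `UnpackTo()` and `UnpackDemo()` are hypothetical names I made up for illustration; the real util::Buffer and Unpack() have their own interfaces. The sketch takes the vector by rvalue reference instead of a raw vector*, which is an assumption on my part about what would be cleanest:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <ostream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for util::Buffer (assumption: the real class differs).
class AdoptingBuffer {
 public:
  // Adopt the caller's storage by move, so no second allocation is made.
  explicit AdoptingBuffer(std::vector<uint8_t> &&storage)
      : data_(std::move(storage)) {}

  const uint8_t *Read() const { return data_.data() + read_head_; }
  size_t ReadAvailable() const { return data_.size() - read_head_; }

  void MarkRead(size_t n) {
    assert(n <= ReadAvailable());
    read_head_ += n;
  }

 private:
  std::vector<uint8_t> data_;
  size_t read_head_ = 0;
};

// Hypothetical Unpack() variant: instead of filling an output vector,
// drain the buffer into any std::ostream in bounded chunks.
inline size_t UnpackTo(AdoptingBuffer &in, std::ostream &out,
                       size_t chunk_size = 64 * 1024) {
  assert(chunk_size > 0);
  size_t total = 0;
  while (in.ReadAvailable() > 0) {
    const size_t n = std::min(chunk_size, in.ReadAvailable());
    out.write(reinterpret_cast<const char *>(in.Read()),
              static_cast<std::streamsize>(n));
    in.MarkRead(n);
    total += n;
  }
  return total;
}

// Small usage example: the payload is moved into the buffer, then written
// straight to a stream (a std::ofstream would work the same way).
inline std::string UnpackDemo() {
  const std::string text = "coredump bytes";
  std::vector<uint8_t> payload(text.begin(), text.end());
  AdoptingBuffer buffer(std::move(payload));
  std::ostringstream sink;
  UnpackTo(buffer, sink, 4);
  return sink.str();
}
```

The point is that the only full-size allocation is the one the response was received into; everything after that either moves it or reads from it in place.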

A way to do streaming without too much of an overhaul beyond the above might be to make util::Buffer thread-safe, and then pass a Buffer object directly to SendSyncRequestWithoutAssert() to be included in the Response as a reference, but only partially populated. The code immediately following can wait for N bytes, and then pass it back to the main code, which streams it out to disk as it comes in.
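
A minimal sketch of the wait-for-N-bytes idea, assuming a mutex plus condition variable is acceptable here. `StreamingBuffer`, `DrainTo()` and `StreamDemo()` are hypothetical names, not anything that exists in twib, and the sketch has no backpressure: if the producer outruns the consumer, the pending data still grows.

```cpp
#include <algorithm>
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <ostream>
#include <sstream>
#include <string>
#include <thread>
#include <vector>

// Hypothetical thread-safe buffer (assumption: not the real util::Buffer).
class StreamingBuffer {
 public:
  // Producer side: append bytes as they arrive from the device.
  void Write(const uint8_t *data, size_t size) {
    assert(data != nullptr || size == 0);
    {
      std::lock_guard<std::mutex> lock(mutex_);
      pending_.insert(pending_.end(), data, data + size);
    }
    cond_.notify_all();
  }

  // Producer side: signal that no more data will arrive.
  void Close() {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      closed_ = true;
    }
    cond_.notify_all();
  }

  // Consumer side: block until at least min_bytes are pending or the stream
  // is closed, then take everything pending. An empty result means the
  // stream is closed and fully drained.
  std::vector<uint8_t> WaitAndTake(size_t min_bytes) {
    const size_t want = std::max<size_t>(min_bytes, 1);
    std::unique_lock<std::mutex> lock(mutex_);
    cond_.wait(lock, [&] { return closed_ || pending_.size() >= want; });
    std::vector<uint8_t> chunk;
    chunk.swap(pending_);
    return chunk;
  }

 private:
  std::mutex mutex_;
  std::condition_variable cond_;
  std::vector<uint8_t> pending_;
  bool closed_ = false;
};

// Stream everything from the buffer to `out` as it comes in.
inline size_t DrainTo(StreamingBuffer &buffer, std::ostream &out,
                      size_t min_bytes) {
  size_t total = 0;
  for (;;) {
    std::vector<uint8_t> chunk = buffer.WaitAndTake(min_bytes);
    if (chunk.empty()) break;  // closed and nothing left
    out.write(reinterpret_cast<const char *>(chunk.data()),
              static_cast<std::streamsize>(chunk.size()));
    total += chunk.size();
  }
  return total;
}

// Usage example: one thread plays the device, the caller plays the disk writer.
inline std::string StreamDemo() {
  StreamingBuffer buffer;
  std::ostringstream sink;
  std::thread producer([&buffer] {
    const std::string parts[] = {"core", "dump", "!"};
    for (const std::string &part : parts) {
      buffer.Write(reinterpret_cast<const uint8_t *>(part.data()), part.size());
    }
    buffer.Close();
  });
  DrainTo(buffer, sink, 4);
  producer.join();
  return sink.str();
}
```

Swapping the pending vector out under the lock keeps the critical section short, so the writer to disk never holds the lock while doing I/O.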