SpectralSequences / sseq

The root repository for the SpectralSequences project.
Apache License 2.0

Support writing compressed save data #83

Open JoeyBF opened 2 years ago

JoeyBF commented 2 years ago

It works, but it is much slower for low stems (about 7x). I would expect the overhead to be asymptotically negligible: already, going from stem 100 to stem 101 takes only 61% longer.

One idea would be to have a static ext::save::COMPRESSION_LEVEL: AtomicI32 that would control the compression level dynamically, and we could adjust it as the stems get larger.
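A minimal sketch of that idea, using only the standard library. The static's name follows the proposal above. The helper functions, the stem threshold, and the two levels are hypothetical placeholders, not values taken from the PR.

```rust
use std::sync::atomic::{AtomicI32, Ordering};

/// Global compression level, readable and writable from any thread.
/// The initial value 1 is an assumption (a cheap level for low stems).
pub static COMPRESSION_LEVEL: AtomicI32 = AtomicI32::new(1);

/// Hypothetical policy: cheap compression while the stems are small,
/// a higher level once files are large enough for the cost to pay off.
/// The threshold 50 and the levels 1 and 3 are illustrative only.
pub fn level_for_stem(stem: i32) -> i32 {
    if stem < 50 {
        1
    } else {
        3
    }
}

/// Update the global level as the computation reaches a new stem.
pub fn set_level_for_stem(stem: i32) {
    COMPRESSION_LEVEL.store(level_for_stem(stem), Ordering::Relaxed);
}

/// Read the current level, e.g. right before constructing an encoder.
pub fn current_level() -> i32 {
    COMPRESSION_LEVEL.load(Ordering::Relaxed)
}
```

Relaxed ordering should be enough here, since the level is an independent tuning knob and no other memory is synchronized through it.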

Resolves #82

dalcde commented 2 years ago

We would need to update the delete-empty-file code to deal with this as well.

(Also, can you put the #[allow(unused_mut)] on the let mut p line instead of the whole function? Not sure if this is a thing.)
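For reference, Rust does accept lint attributes on individual statements, including let bindings. The sketch below shows the pattern. The function build_path and the cfg(windows) condition are hypothetical stand-ins for whatever conditional compilation leaves the binding unmutated in some builds.

```rust
/// Hypothetical helper that builds a path string.
pub fn build_path(dir: &str, name: &str) -> String {
    // The lint attribute covers this one binding, not the whole function.
    #[allow(unused_mut)]
    let mut p = format!("{dir}/{name}");

    // Stand-in condition: when this statement is compiled out, `p` is never
    // mutated, which would otherwise trigger the `unused_mut` warning.
    #[cfg(windows)]
    p.push_str(".win");

    p
}
```

Scoping the attribute to the statement keeps the lint active for every other binding in the function.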

JoeyBF commented 2 years ago

Did you have something like this in mind? It looks clean to me, but I might have overengineered it.

dalcde commented 2 years ago

Do we know that if we interrupt a program writing with zstd compression, we end up with an empty file? e.g. it seems plausible that we have a zstd header with no contents.

JoeyBF commented 2 years ago

Yeah, you're right. Even if the program crashes right at the end of create_file, it still creates a 25-byte compressed file containing the 16-byte header. Compressing an empty file still gives 13 bytes, which I suppose is the zstd header.

JoeyBF commented 2 years ago

Actually, why is this not a problem for uncompressed files? Shouldn't they always at least contain the header?

dalcde commented 2 years ago

I would guess it has to do with whether/when the Encoder flushes.

JoeyBF commented 2 years ago

I think we should then flush immediately at the end of create_file, so that the file is never empty. Then open_file can panic when it encounters an empty file, because that would never happen unless something went very wrong. We can move the emptiness check into SaveFile::open_file, alongside the header verification.
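A rough sketch of this proposal with a plain buffered writer, assuming a fixed 16-byte header. The HEADER constant, the free-function signatures, and the error handling are all hypothetical. The real SaveFile code and the zstd Encoder are not shown in this thread, and an encoder's flush may behave differently from BufWriter's.

```rust
use std::fs::File;
use std::io::{self, BufWriter, Read, Write};
use std::path::Path;

/// Placeholder 16-byte header; the real header layout is an assumption.
const HEADER: [u8; 16] = *b"SSEQ-SAVE-HEADER";

/// Create a save file and flush the header before returning, so that once
/// this function has returned the file on disk is never zero bytes long.
pub fn create_file(path: &Path) -> io::Result<BufWriter<File>> {
    let mut writer = BufWriter::new(File::create(path)?);
    writer.write_all(&HEADER)?;
    // Without this flush the header may sit in the in-memory buffer,
    // and an interrupted run could leave an empty file behind.
    writer.flush()?;
    Ok(writer)
}

/// Open a save file, treating an empty file as a bug rather than as
/// leftover state to clean up, and verify the header in the same place.
pub fn open_file(path: &Path) -> io::Result<File> {
    let mut file = File::open(path)?;
    let len = file.metadata()?.len();
    assert!(len > 0, "empty save file at {}", path.display());

    let mut header = [0u8; 16];
    file.read_exact(&mut header)?;
    if header != HEADER {
        return Err(io::Error::new(
            io::ErrorKind::InvalidData,
            "bad save file header",
        ));
    }
    Ok(file)
}
```

A file that is nonempty but shorter than the header still surfaces as an ordinary I/O error from read_exact, so only the truly empty case panics.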

dalcde commented 1 year ago

We already tend to write almost immediately after open_file. It would be useful to figure out what exactly is going on.