ulikunitz / xz

Pure golang package for reading and writing xz-compressed files
Other
477 stars 45 forks source link

Missing common APIs like Reader:Close() Writer:Flush() #45

Open evelikov opened 3 years ago

evelikov commented 3 years ago

Greetings everyone,

I'm a little fresh with golang, so hope I am looking at the right place.

Going through the API - it seems like Reader:Close and Writer:Flush are missing. These are fairly common - as you can see in the the golang standard library zlib and zgip.

Is it possible to have these in future release? Thanks in advance

ulikunitz commented 3 years ago

I didn't add those methods by design and it is unlikely that I will add those methods in the future.

The Close method would require clients to add boiler plate code and it wouldn't do anything that the Garbage Collector can do itself. Note that the API doesn't open files directly and handles only io.Reader objects, which don't have a Close function. So a Close function is not required for the API to work properly.

The Flush method might make sense by pushing all buffered data to the underlying writer. I didn't add it because it will not produce a complete xz file, which requires a footer. It also would have a negative effect on the compression ratio because at least an LZMA2 segment would need to be closed. Theoretically the Flush method could close the xz file and start a new one, but that would be quire surprising for the user. I'd rather would like the cient to express that explicitly by using Close and create a new Writer. There is no use case that really requires a Flush method for the xz Writer.

Antoine de Saint-Exupery described the design principle, I follow, best: "Perfection is not achieved, if you cannot add anything anymore, but if you cannot take anything away."

evelikov commented 3 years ago

Is the GC really enough, or the right answer even to handle the lack of Close()?

I'm wondering, if it were that simple the core golang developers would have omitted Close() in a lot of places. Wouldn't they? Are you sure we are no cases where the explicit Close() is the right thing to do? Say, reading thousands of XZ archives or something?

On the Flush() front - doesn't [zstd])https://github.com/DataDog/zstd) and other formats also require a footer? If so, having similar behaviour would be great.

The mentioned design principle is admirable goal, yet omitting the APIs seems to be going against the golang ecosystem - standard library and third party libs alike.

Thanks for for prompt reply and input.

ulikunitz commented 3 years ago

We should discuss these two methods separately.

Close(): Right now there is no need for it, because there is no resource that the garbage collector cannot release itself.

Flush(): The only use case that makes sense is a streaming application, where the user want to make sure that all buffered data is written to the underlying writer (e.g. a socket). While I don't think xz would be a good fit for the use case compared to zstd, I will look into it in the next version. (I plan to release such a version this year.)