brendan-duncan / archive

Dart library to encode and decode various archive and compression formats, such as Zip, Tar, GZip, ZLib, and BZip2.
MIT License
400 stars 140 forks source link

Reusing `AchiveFile` can result in corrupted archives #334

Open brianquinlan opened 5 months ago

brianquinlan commented 5 months ago

Here are some patterns that previously worked with package:archive that no longer with 3.5.1

final destinationArchive = Archive();
final sourceArchive = ZipDecoder().decodeBytes(bytes);
for (final file in sourceArchive.files) {
  destinationArchive.addFile(file);  // content not set, destinationArchive is corrupted
}

I think that this is the problematic code

final archive = ...
ZipEncoder.encode(archive)
final destinationArchive = Archive();
destinationArchive.addFile(archive.findFile('foo')). // encode closed the file, destinationArchive is corrupted

It would be nice if, in these cases:

  1. it would be allowed to use an ArchiveFile in multiple contexts OR
  2. using it in a disallowed context resulted in a StateError rather than a corrupted archive

I discovered these issues in the context of the Google build system, where package:archive is extensively used (thanks!). I can extract minimum reproductions and/or tests, if that would be helpful.

brendan-duncan commented 5 months ago

On it.

brendan-duncan commented 5 months ago

Also, I am currently working on a 4.0 branch, https://github.com/brendan-duncan/archive/tree/4.0. I'll make sure the fix for this gets into that branch. If there are any breaking changes for the build system in the 4.0 branch, if that's even possible to check, I can make sure they are looked into before I ship it.

brendan-duncan commented 5 months ago

The issue in 3.5.1 isn't adding an ArchiveFile to another Archive, they can exist in multiple. There is a bug with encoding a new zip before the the data of the file was loaded. I'll get that fixed up asap.

This issue also revealed a significant design flaw in my 4.0 branch, so I got some more work to do there.

brendan-duncan commented 5 months ago

I found the issue. autoClose named arg was added to encode, and set to true. It was intended to free up memory when encoding an archive. Maybe not such a good idea after all, since it breaks this behavior that I didn't anticipate.

I can switch it to false to maintain the previous behavior. ZipEncoder().encode(archive, autoClose: false); is manually using the previous behavior.

brianquinlan commented 5 months ago

Could changing the behavior back not break people relying on the new behavior?

brendan-duncan commented 5 months ago

Changing it to false wouldn't break anything, unless they are relying on the fact that the memory was freed on encode, but it wouldn't break any logic.

The whole logic of ZipFile content is a mess. One of the reasons why I wanted to do a major version update, try and wrangle that beast.

brendan-duncan commented 5 months ago

I can't believe how many people want to unzip a 4GB+ zip file on an old Android. People want to do weird stuff.

brianquinlan commented 4 months ago

I think that this PR at least partially fixes this issue: https://github.com/brendan-duncan/archive/pull/337