Closed CaptainDario closed 8 months ago
Those versions are quite old. Likely the slowdown is from reducing the memory used for decoding files, it used to store everything in memory. If you have an example zip, or at least tell me the characteristics of the zip files you're testing, I can profile where the performance hit is coming from. How big is the zip? Does it contain a lot of small files, or few big files? Anything that can help me see what you're seeing.
Thank you for the response! All of the zips are available:
I think they are small to medium sized, 1-500MB. Let me know if I can help somehow else.
The strange thing is that when I profiled extracting those files, almost all of the performance cost is in Uint8List.setRange
, which is used for decompression. It's strange because it should have been also using that function in those older versions you were testing.
Decompressing dictionary.zip took 37 seconds on my computer.
I tried an optimization of setting array values directly instead of calling setRange, it does a for loop to set the bytes. Decompressing dictionary.zip now took 10 seconds, more than a 3x performance improvement.
Sometimes Dart is really frustrating.
I'll push out a release with this "optimization" soon since it seems pretty significant.
The change is in git if you want to test it out before I get to publishing a new version.
I went ahead and published the change as 3.4.7, in case I get really busy again.
Thank you for your effort! Timings are now down to this
flutter: assets/dict/dictionary.zip
flutter: Elapsed: 0:00:04.458394
flutter: assets/dict/examples.zip
flutter: Elapsed: 0:00:08.988620
flutter: assets/dict/krad.zip
flutter: Elapsed: 0:00:00.081222
flutter: assets/dict/radk.zip
flutter: Elapsed: 0:00:00.091467
flutter: assets/ipadic.zip
flutter: Elapsed: 0:00:00.422946
Still the old timings are better, maybe offer two functions, one that stores everything in memory and one that does not? Do you have a rough estimate how much memory is saved?
On a different note, it would be nice if extractArchiveToDisk
also had a callback that tracks the current progress.
I'll have to look back at what the old version was doing. When I did profiling after I added that last optimization, it seemed like most of the time was spent in disk IO writing the output files. I would expect the old version would have been the same. Maybe there was a change to writing the output files to disk that is causing that extra performance loss. Dart is so finicky it only takes one wrong function call or something small to completely kill performance.
I think in your case of using extractArchiveToDisk instead of extractFileToDisk, it wouldn't be a memory vs disk speed issue because the entire archive is already in memory. If you had used extractFileToDisk, then disk/memory would have been more of an issue because in that case it doesn't read the entire archive into memory. So I'm not sure what I was doing in that old version that's causing the additional performance hit. I'll keep poking at it as I can find the time.
I found the missing ingredient. In 3.3.2 it always uses the native dart:io ZLib decompression when running in a dart:io environment like Flutter.
After 3.3.5 or whenever, I spent a lot of time working on memory usage on Flutter environments because people were decompressing 4GB+ zip files on mobile devices...which I thought was crazy, but who am I to judge. In that process the dart:io ZLib decompressor got lost from the code, because that requires the entire buffer in memory to be decompressed.
I added the dart:io ZLib decompressor back into the code and the performance seems to be on par with 3.3.2 now.
I'll get that pushed and published in another update.
Published that update as 3.4.8.
I hope this will be better for you, and thank you for pointing out the performance regression.
Thank you again for working so fast on this! The times are way faster compared to 3.4.6 and this version is actually usable for me. However, still 1.5-3x compared to 3.3.2. This seems to be especially the case with the smaller zips, maybe some initialization overhead as the difference becomes smaller with the larger assets.
v3.4.8 | v3.3.2
flutter: assets/dict/dictionary.zip
flutter: 0:00:01.263228 | 0:00:00.831474
flutter: assets/dict/examples.zip
flutter: 0:00:01.724894 | 0:00:01.312779
flutter: assets/dict/krad.zip
flutter: 0:00:00.349453 | 0:00:00.010236
flutter: assets/dict/radk.zip
flutter: 0:00:00.335912 | 0:00:00.005493
flutter: assets/ipadic.zip
flutter: 0:00:00.475656 | 0:00:00.139063
Is it possible to add a progress callback similar to the one that was introduced in 3.3.8
Add progress callback for decoding zip files.
to the extractArchiveToDisk
function? This would be really useful because especially on older devices the unpacking is quite slow.
It will probably be a while before I can get back to working on this some more.
It's strange you're still seeing performance difference since both versions had the same timing with your zips for me.
A progress callback could be added, but it would be based on the number of files extracted from the zip, and in the zip files you provided it wouldn't be all that useful since they only had one file in the zip. It would be much more difficult to show progress with the decoding within a file.
@brendan-duncan you are right, this was my mistake. I'm so sorry about that! In the initial measurements, I measured only the unpacking time, but with the latest measurements, I measured the whole unpacking code.
These are the values for 3.3.2
vs 3.4.6
flutter: assets/dict/dictionary.zip
flutter: 0:00:00.848170 | 0:00:00.831474
flutter: assets/dict/examples.zip
flutter: 0:00:01.199115 | 0:00:01.312779
flutter: assets/dict/krad.zip
flutter: 0:00:00.011537 | 0:00:00.010236
flutter: assets/dict/radk.zip
flutter: 0:00:00.005639 | 0:00:00.005493
flutter: assets/ipadic.zip
flutter: 0:00:00.128790 | 0:00:00.139063
Thank you for fixing this so fast!
I was using version
3.3.2
of this package and the unpacking of rather large files from assets was quite fast. 2-10s depending on the device (measurements below are from a MacBook with m1 pro). More specifically I unzipped 5 zips and got the following stats for just the unpackingAfter updating to
3.4.6
those times have drastically increased, it takes not between 1-2 minutes.By trying all versions, I think this behavior was introduced in
3.3.5
(3.3.4
and3.3.3
result in broken unpacked files for me, therefore, take the timings of them with a grain of salt but they seem to be still fast)The timings
The timing for `3.3.5` ``` bash flutter: assets/dict/dictionary.zip flutter: Elapsed: 0:00:13.573168 flutter: assets/dict/examples.zip flutter: Elapsed: 0:00:31.238170 flutter: assets/dict/krad.zip flutter: Elapsed: 0:00:00.257237 flutter: assets/dict/radk.zip flutter: Elapsed: 0:00:00.294290 flutter: assets/ipadic.zip flutter: Elapsed: 0:00:00.709858 ``` The timing for `3.3.4` ``` bash flutter: assets/dict/dictionary.zip flutter: Elapsed: 0:00:00.004266 flutter: assets/dict/examples.zip flutter: Elapsed: 0:00:00.000361 flutter: assets/dict/krad.zip flutter: Elapsed: 0:00:00.000329 flutter: assets/dict/radk.zip flutter: Elapsed: 0:00:00.000373 flutter: assets/ipadic.zip flutter: Elapsed: 0:00:00.006861 ``` The timing for `3.3.3` ``` bash flutter: assets/dict/dictionary.zip flutter: Elapsed: 0:00:00.004581 flutter: assets/dict/examples.zip flutter: Elapsed: 0:00:00.000327 flutter: assets/dict/krad.zip flutter: Elapsed: 0:00:00.000290 flutter: assets/dict/radk.zip flutter: Elapsed: 0:00:00.000286 flutter: assets/ipadic.zip flutter: Elapsed: 0:00:00.008189 ```The code to achieve this is