balena-io / etcher

Flash OS images to SD cards & USB drives, safely and easily.
https://etcher.io/
Apache License 2.0
29.71k stars, 2.11k forks

7-zip support #711

Closed ThomasKaiser closed 6 years ago

ThomasKaiser commented 8 years ago

I looked through documentation, issues and finally code and could not find support for .7z archives. Is there a specific reason why it's missing (7-zip being LGPL)?

jviotti commented 8 years ago

We looked into it a while back, but we couldn't find any decent 7zip NodeJS modules we could use (all of them either depend on the 7zip tool being available on the path, or embed the binary inside the package).

Given that 7zip is not a common compression method for IoT images (the main focus of Etcher), writing our own decompressor to support it has not been a priority.

I'm happy to give it another go if you think there's a library worth checking (please re-open if so).

On Wed, Sep 21, 2016 at 01:19:25AM -0700, Thomas Kaiser wrote:

  • Etcher version: 1.0.0-beta14
  • Operating system and architecture: OS X 10.10

ThomasKaiser commented 8 years ago

Well, we (Armbian) switched away from .zip to .7z a while ago and I already encouraged other distro bakers to do so as well. I am exploring other options first (.bz2 not working on OS X with a first attempt). I'll investigate myself first, get some experience, and then we'll discuss this internally. Anyway thanks for this great piece of software even if the package size seems to scare some users :)

jviotti commented 8 years ago

Hi @ThomasKaiser ,

That's interesting. What were the reasons for switching to .7z?

> I'll investigate myself first, get some experience, and then we'll discuss this internally. Anyway thanks for this great piece of software even if the package size seems to scare some users :)

Haha, sounds good! Hopefully Electron finds a way to make applications smaller!

ThomasKaiser commented 8 years ago

> That's interesting. What were the reasons for switching to .7z?

More or less only image size. LZMA with maximum compression applied to a clean image (made from scratch) is way more efficient than .zip: https://github.com/igorpecovnik/lib/issues/209

In the meantime we only use -mx=3 instead of -mx=9 since we build 100+ OS images with every update; the size differences are small but the time savings are huge. Only drawback (at least for me using OS X): decompression happens on the CPU cores and not on the GPU as with .zip (OpenCL/GCD). But I'm not the average user :)
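
The level trade-off described above can be measured with a small experiment. The following is only a sketch: it uses Python's standard `lzma` module, whose presets are analogous to, but not the same as, 7-Zip's `-mx` levels, and `make_sample_image` is a hypothetical stand-in for a real OS image (zeros with a few incompressible blocks). Real images will give different numbers.

```python
import lzma
import os
import time
from typing import Tuple


def make_sample_image(size: int = 2 * 1024 * 1024) -> bytes:
    """Build a synthetic stand-in for a disk image: zeros with random blocks."""
    image = bytearray(size)  # zero-filled, like unused filesystem space
    block = 64 * 1024
    # Overwrite one 64 KiB block in every 512 KiB with incompressible data.
    for offset in range(0, size - block + 1, 512 * 1024):
        image[offset:offset + block] = os.urandom(block)
    return bytes(image)


def compress_with_preset(data: bytes, preset: int) -> Tuple[int, float]:
    """Return (compressed size in bytes, elapsed seconds) for one preset."""
    start = time.perf_counter()
    compressed = lzma.compress(data, format=lzma.FORMAT_XZ, preset=preset)
    return len(compressed), time.perf_counter() - start


if __name__ == "__main__":
    image = make_sample_image()
    for preset in (3, 9):
        size, seconds = compress_with_preset(image, preset)
        print(f"preset {preset}: {size} bytes in {seconds:.2f} s")
```

Running it against an actual image file instead of the synthetic data would show whether the size gain of the higher level is worth the extra build time for a given project.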

BTW: Regarding application size: Did you try to apply HFS+ transparent file compression to your application bundle already using afsctool?

jviotti commented 8 years ago

> More or less only image size. LZMA with maximum compression applied to a clean image (made from scratch) is way more efficient than .zip: https://github.com/igorpecovnik/lib/issues/209

What about .xz? It also uses the LZMA compression algorithm and it's better known to image consumers, plus we have support for it in Etcher already.

> BTW: Regarding application size: Did you try to apply HFS+ transparent file compression to your application bundle already using afsctool?

Very nice, thank you very much for the suggestion. The savings are small (~10MB), but it's definitely an improvement. I'll add it to the build scripts very soon.

jviotti commented 8 years ago

Hi @ThomasKaiser ,

I'm not sure if you're aware of it, but Etcher has the concept of "Extended Archives", which is basically a ZIP file containing your image plus some extra metadata that allows the image to be more tightly integrated into Etcher and support extra functionality (like bmaps, logos, etc), while still allowing users to flash it as a normal image outside Etcher.

If you're interested, I'm happy to provide more details on how to build such an extended archive via email!

dlech commented 8 years ago

Well, I'm not Thomas Kaiser, but I am interested in building extended archives. You can find my email in my github profile.

ThomasKaiser commented 8 years ago

Hi @jviotti,

I've read through this document already with interest. Will think a bit about it and then maybe get back to you with some feedback (the context was different: creating device backups that contain metadata. And I really hate re-inventing the wheel so I read through your spec with interest)

I will do some investigation of .xz later (if the compressor is multi-threaded and compression results are on par with 7-zip, then it's really worth a look).

And it's strange that you get only 10MB savings since the compressed disk image containing Etcher is just ~75MB in size, isn't it? Will check that also later.

alexandrosm commented 8 years ago

Hey @ThomasKaiser, from some basic research, xz looks like a slightly improved version of 7z, with multi-threading support as well. It uses the same compression algorithm, so the file size results should be the same. At the same time, we've discovered a promising 7zip library that we'll try as well.

We'd love to hear more about the "device backups with metadata" use case too, as backups are on the etcher roadmap, if a bit down the line.

By the way, we discuss Etcher and new features a lot on Gitter, feel free to drop in (cc @dlech)

pfeerick commented 8 years ago

DietPi also uses 7zip for their images, so that project would also benefit from 7zip support, although I suspect they'd also be just as happy to move to xz if it was found to be more efficient ;) I'd also like to thank you guys for the great work you've done on Etcher... it certainly makes supporting users on multiple OSes a lot easier! 👍

jviotti commented 8 years ago

Thanks a lot for the kind words @pfeerick !

@alexandrosm found the following 7z NodeJS module: https://github.com/jalcaldea/7z-stream

Happy to re-open this issue if we find the module above does what we need.

ThomasKaiser commented 8 years ago

> We'd love to hear more about the "device backups with metadata" use case too, as backups are on the etcher roadmap, if a bit down the line.

From time to time I do some research on this as it's something Armbian users are constantly asking for. Last occasion was this. Cloning SD cards is easy (and safe; a running OS cannot be cloned 100 percent consistently without taking special precautions), but people struggle with eMMC-equipped boards.

On Fast Ethernet equipped boards the best mode would be a special mini distro they can boot that exposes the eMMC in USB mass storage gadget mode (explained here, to prepare correctly sized dd images). On GbE equipped boards, IMO the best variant is to send the eMMC's contents over the network to a beefy box where they get compressed. In fact I do this for quick cloning: I use dd on the devices I test with, send the stream through netcat, and create an archive at the target location (on OS X this is done mostly on the GPU cores using ditto -c -k --sequesterRsrc --keepParent).

But we're searching for a way to clone to network shares or external disks too. The problem is that SBCs are pretty slow at compression and that most tools fail to compress multi-threaded when the input is stdin (a pipe from dd). One approach would be to use pbzip2, another to split up the whole operation into n tasks and let grep -c "^processor" /proc/cpuinfo of them be processed in parallel.

So after installing lz4-tools one could check the size of the eMMC (/proc/partitions), then split it up into e.g. 48 chunks and let 4 of them be processed in parallel with single dd/lz4 pipes. But then you end up with 48 files that need to be processed in exactly the same order to restore the eMMC contents. And that's where some sort of container would be a good idea. So basically it's a different approach, but if we start with something like that we don't want to re-invent the wheel and would rather use something that is already developed/established if it fits our needs.
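
The chunk-and-container idea above can be sketched roughly as follows. This is an illustration under stated assumptions, not what Armbian or Etcher does: it swaps the dd/lz4 pipes for Python's standard `lzma` module and a thread pool, it reads a regular image file (for a real block device the size would have to come from somewhere like /proc/partitions, since `stat` does not report it), and the names `backup`, `restore`, `chunk-NNNN.xz` and `manifest.json` are hypothetical. The manifest plays the role of the "container" by recording the order of the chunks.

```python
import json
import lzma
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def _compress_chunk(source: Path, out_dir: Path, index: int,
                    offset: int, length: int) -> dict:
    """Read one byte range from the source and store it as its own .xz file."""
    with open(source, "rb") as src:
        src.seek(offset)
        data = src.read(length)  # the whole chunk is held in memory
    name = f"chunk-{index:04d}.xz"
    (out_dir / name).write_bytes(lzma.compress(data, preset=1))
    return {"index": index, "file": name, "offset": offset, "length": len(data)}


def backup(source: Path, out_dir: Path, chunks: int = 48, workers: int = 4) -> Path:
    """Split the source into up to `chunks` ranges and compress `workers` at a time."""
    out_dir.mkdir(parents=True, exist_ok=True)
    total = source.stat().st_size
    chunk_size = max(1, -(-total // chunks))  # ceiling division
    jobs = [(index, offset, min(chunk_size, total - offset))
            for index, offset in enumerate(range(0, total, chunk_size))]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in job order, whatever order they finish in.
        entries = list(pool.map(
            lambda job: _compress_chunk(source, out_dir, *job), jobs))
    manifest = out_dir / "manifest.json"
    manifest.write_text(json.dumps({"size": total, "chunks": entries}, indent=2))
    return manifest


def restore(manifest: Path, target: Path) -> None:
    """Decompress the chunks in manifest order and concatenate them."""
    meta = json.loads(manifest.read_text())
    with open(target, "wb") as dst:
        for entry in sorted(meta["chunks"], key=lambda item: item["index"]):
            chunk_path = manifest.parent / entry["file"]
            dst.write(lzma.decompress(chunk_path.read_bytes()))
```

A call such as `backup(Path("emmc.img"), Path("emmc-backup"))` followed by `restore(Path("emmc-backup/manifest.json"), Path("restored.img"))` should reproduce the original file. Threads are used on the assumption that the compressor releases the interpreter lock while it works; separate processes would be the fallback if it does not.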

Regarding .7z: I will follow your advice, look into xz a bit more, and do some tests regarding speed and compression ratio. If it's the same as 7-zip then we will most likely switch to .xz instead.

jviotti commented 8 years ago

Incredibly interesting stuff @ThomasKaiser !

Given the increasing amount of interest in 7zip, I'll keep this open, and see if we can actually do something about it (regardless of whether you switch to xz or not).

ThomasKaiser commented 8 years ago

Just a small note on the 'backup and/or flash eMMC' problem: within 24 hours the problem is solved for Allwinner devices (am a bit curious why none of the ultra cheap and powerful H3 devices is supported by resin.io): http://forum.armbian.com/index.php/topic/2125-armbian-for-orange-pi-does-not-boot/?p=16455

Now it's just connecting the H3 device's Micro USB port to the USB port of a Windows, Linux or OS X host, then starting a script; the device boots a kernel and turns itself into a USB Mass Storage gadget within half a minute and can then both be backed up or directly flashed with Etcher. I will build an application bundle for OS X in the next few days and keep you guys updated (but of course not that useful for you since you don't seem to support any Allwinner device now)

Absolutely weird that all the great and cheap H3 devices are missing here :)

ThomasKaiser commented 8 years ago

And .zst is something we should keep an eye on: https://code.facebook.com/posts/1658392934479273/smaller-and-faster-data-compression-with-zstandard/

alexandrosm commented 8 years ago

Wow, really nice and exciting!

skhameneh commented 7 years ago

@ThomasKaiser @alexandrosm there's also brotli (see https://www.opencpu.org/posts/brotli-benchmarks/). Although the article is old, it shows brotli beating xz; and Facebook's page shows xz beating zstd.

For node you've got the "brotli" and "iltorb" modules. "brotli" is "compiled" to LLVM and transpiled to JavaScript using Emscripten; the performance hit isn't horrible, but it's noticeable (I haven't benchmarked, but it should be anywhere from a .5x to 10x hit). "iltorb" uses compiled binaries and happens to have the latest sources compiled.

The only downside to brotli is no file tree support; it's single file only.

ThomasKaiser commented 7 years ago

Some more info on zstandard (see comments for some tests related to SBC world)

skhameneh commented 7 years ago

@ThomasKaiser so you didn't look into brotli, I take it? I am well aware of zstandard tests and benchmarks. :)

lurch commented 7 years ago

@skhameneh The article you link to about brotli says "in particular small text documents", which makes me suspect that it wouldn't be that useful for Etcher, where we're dealing with very large binary files?

lurch commented 7 years ago

@ThomasKaiser Any more thoughts about changing Armbian from .7z to .xz ?

ThomasKaiser commented 7 years ago

> Any more thoughts about changing Armbian from .7z to .xz ?

@lurch I suggested it a while ago but got overruled by other devs for now. In the meantime I found burning .xz compressed images way slower than 'decompress first and burn uncompressed' (disclaimer: SD cards in my MacBook Pro are written/read with ~80MB/s and I use a few SanDisk Extreme Pro/Plus only for this purpose) so I didn't look into it for now.

Additional problem: I was preparing to report that back to you (and did some measurements/profiling on macOS), but my MacBook crashed and the numbers are gone.

But I'm still trying to get best integration of Armbian image creation with Etcher since it's simply the best tool out there.

BTW: Removing the display of CRC32 at the end of burning with Etcher 1.0 is a bit sad, since we have to deal with users that think Etcher would be unnecessary 'bloatware', refuse to use it, but claim the opposite. In the past, simply asking them to show Etcher's CRC32 value resolved this: people used Etcher and understood that their SD card sucked.
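
For readers who want a comparable check without the on-screen value, a CRC32 can be computed by hand. This is a minimal sketch using Python's standard `zlib.crc32`; how Etcher derived the value it used to display is not stated in this thread, so the result is not guaranteed to match it, and `crc32_of_file` is a hypothetical helper name.

```python
import zlib
from pathlib import Path


def crc32_of_file(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Return the CRC32 of a file as 8 hex digits, reading it in chunks."""
    checksum = 0
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            # Feed the running value back in so the result covers the whole file.
            checksum = zlib.crc32(chunk, checksum)
    return f"{checksum & 0xFFFFFFFF:08x}"
```

Comparing the value for the uncompressed image with the value for the same number of bytes read back from the card would expose a card that silently corrupts data.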

lurch commented 7 years ago

That's a shame, but thanks very much for trying @ThomasKaiser :+1:

Regarding the CRC32 - ironically we removed it for exactly the opposite reasons - it was confusing newbie users ;-) #993

> users that think Etcher would be unnecessary 'bloatware'

Well, you can't please everyone all of the time.

lurch commented 6 years ago

Just looking through old issues, and the comment above about

> Now it's just connecting the H3 device's Micro USB port to the USB port of a Windows, Linux or OS X host, then starting a script; the device boots a kernel and turns itself into a USB Mass Storage gadget within half a minute and can then both be backed up or directly flashed with Etcher.

sounds very similar to the usbboot stuff we added to Etcher to support the Raspberry Pi Compute Modules :grinning: (but it looks like quite a different protocol)

jhermsmeier commented 6 years ago

That is actually the same mechanism the CHIP board uses for flashing (FEL / fastboot) – good to know.

ThomasKaiser commented 6 years ago

> FEL / fastboot

FEL is one thing, fastboot another. All Allwinner SoCs support this. In Armbian we provided the stuff only for H3 so far, but the Banana Pi folks built on our work and extended it for a few more Allwinner SoCs: https://forum.armbian.com/topic/2454-fel-mass-storage-or-writing-images-directly-to-emmc/?do=findComment&comment=42145

yegorich commented 6 years ago

If 7-zip support is going to be added, it would also be great if self-extracting files (exe) were supported too. Thanks.

jviotti commented 6 years ago

Let's close this for now, as it turns out to be very difficult to find 7-zip implementations we can use.

jdmarshall commented 4 years ago

@ThomasKaiser You've created a roadblock for people who are new to the platform.

pfeerick commented 4 years ago

> @ThomasKaiser You've created a roadblock for people who are new to the platform.

@jdmarshall Excuse me? How has asking for 7z support to be added created a roadblock? Or do you perhaps mean the choice of the Armbian devs to use 7z as their compression method (as opposed to say xz) added an extra step for new users (having to decompress the 7z file so the image can be written with Etcher)?

jdmarshall commented 4 years ago

> @jdmarshall Excuse me? How has asking for 7z support to be added created a roadblock? Or do you perhaps mean the choice of the Armbian devs to use 7z as their compression method (as opposed to say xz) added an extra step for new users (having to decompress the 7z file so the image can be written with Etcher)?

The latter. I get that new formats don’t get established if nobody uses them, but maybe pick one your complements (i.e., Etcher) can already handle. Last time I did Pi stuff I ran into .rar files. Are you kidding me? I haven’t seen a RAR file in ten years. If memory serves I had to use a trial version of some OS X payware to extract them. I just want to play with the Pi, not get sucked into The Compression Wars (and I say that as someone who used to hang out on comp.compression)

It sounds like Armbian folks believe that tarballs can’t be handled by Etcher. But I don’t think https://github.com/balena-io/etcher/blob/bdf920f86459d4ab63e62bd3b380f6c2d336ceac/docs/EXTENDED-ARCHIVES.md says that, does it? Is this just a miscommunication?

pfeerick commented 4 years ago

True, since Etcher turned up on the scene (and Armbian was already established), things have been less than ideal. However, I for one don't expect Armbian to change just to suit Etcher users, since it's not that hard for you to download 7zip (free, open source, and supports ZIP, RAR, 7z, and many other compression formats) if you don't have it... and if that is hard... I really don't think I would want to know what other problems you'd likely have.

The Extended Archives document is specifically about ZIP files, not tarballs (which actually don't appear to be supported), and is old documentation from 2016 that has been subsequently removed. In point of fact, I don't know how you're supposed to know which image formats balenaEtcher supports other than having some random webpage tell you, looking through the source code, or downloading it just to see the tooltip popup (img, iso, zip, bin, bz2, dmg, dsk, etch, gz, hddimg, raw, rpi-sdimg, sdcard, wic, xz at present)... surprising omission from the FAQs! :-O
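
Until the list is documented, a download script could check a file name against the extensions quoted above before handing it to Etcher. This is only a sketch: the set is copied from the tooltip list in this comment and may be out of date, and `is_supported_image` is a hypothetical helper, not part of Etcher.

```python
from pathlib import Path

# Extensions taken from the tooltip list quoted above (may be out of date).
SUPPORTED_EXTENSIONS = frozenset({
    "img", "iso", "zip", "bin", "bz2", "dmg", "dsk", "etch",
    "gz", "hddimg", "raw", "rpi-sdimg", "sdcard", "wic", "xz",
})


def is_supported_image(filename: str) -> bool:
    """Check only the final extension, so 'image.img.xz' is treated as 'xz'."""
    extension = Path(filename).suffix.lower().lstrip(".")
    return extension in SUPPORTED_EXTENSIONS
```

With this list, a name ending in `.img.xz` passes while one ending in `.7z` does not, which is the extra decompression step discussed in this thread.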

I do wonder if this is worth revisiting, as projects like https://www.npmjs.com/package/node-7z are looking pretty popular with some 2k+ downloads per week, and whilst it is still a wrapper for the binaries, it looks to be self-contained. Doing so would also open up RAR, XZ, Z, CAB, and other formats 7z natively supports.

lurch commented 4 years ago

> whilst it is still a wrapper for the binaries, it looks to be self-contained.

I'm not an Etcher developer, but don't forget that it needs to run on 32-bit Windows, 64-bit Windows, 32-bit Linux, 64-bit Linux and macOS. https://github.com/balena-io/etcher/releases

pfeerick commented 4 years ago

Yup, which is why I suggested that package, since it can use a system-installed 7zip (nice way to cheat) or you can include the recommended 7zip-bin npm package, which provides binaries for Windows (32/64), Mac, and Linux (32/64/arm/arm64)... so anywhere Etcher runs AFAIK.