nika-begiashvili / libarchivejs

Archive library for browsers
MIT License
288 stars 35 forks source link
7zip archive archiver archives browser bzip2 create decompression extract gzip lzma lzma2 node-js nodejs rar tar wasm webassembly zip

Libarchivejs

npm version license

Overview

Libarchivejs is a archive tool for browser and nodejs which can extract and create various types of compression, it's a port of libarchive to WebAssembly and javascript wrapper to make it easier to use. Since it runs on WebAssembly performance should be near native. Supported formats: ZIP, 7-Zip, RAR v4, RAR v5, TAR .etc, Supported compression: GZIP, DEFLATE, BZIP2, LZMA .etc

Version 2.0 highlights!

How to use

Install with npm i libarchive.js and use it as a ES module.

The library consists of two parts: ES module and webworker bundle, ES module part is your interface to talk to library, use it like any other module. The webworker bundle lives in the libarchive.js/dist folder so you need to make sure that it is available in your public folder since it will not get bundled if you're using bundler (it's all bundled up already) and specify correct path to Archive.init() method

if libarchive.js file is in the same directory as bundle file than you don't need to call Archive.init() at all

import {Archive} from 'libarchive.js/main.js';

Archive.init({
    workerUrl: 'libarchive.js/dist/worker-bundle.js'
});

document.getElementById('file').addEventListener('change', async (e) => {
    const file = e.currentTarget.files[0];

    const archive = await Archive.open(file);
    let obj = await archive.extractFiles();

    console.log(obj);
});

// outputs
{
    ".gitignore": {File},
    "addon": {
        "addon.py": {File},
        "addon.xml": {File}
    },
    "README.md": {File}
}

More options

To get file listing without actually decompressing archive, use one of these methods

    await archive.getFilesObject();
    // outputs
    {
        ".gitignore": {CompressedFile},
        "addon": {
            "addon.py": {CompressedFile},
            "addon.xml": {CompressedFile}
        },
        "README.md": {CompressedFile}
    }

    await archive.getFilesArray();
    // outputs
    [
        {file: {CompressedFile}, path: ""},
        {file: {CompressedFile},   path: "addon/"},
        {file: {CompressedFile},  path: "addon/"},
        {file: {CompressedFile},  path: ""}
    ]

If these methods get called after archive.extractFiles(); they will contain actual files as well.

Decompression might take a while for larger files. To track each file as it gets extracted, archive.extractFiles accepts callback

    archive.extractFiles((entry) => { // { file: {File}, path: {String} }
        console.log(entry);
    });

Extract single file from archive

To extract a single file from the archive you can use the extract() method on the returned CompressedFile.

    const filesObj = await archive.getFilesObject();
    const file = await filesObj['.gitignore'].extract();

Check for encrypted data

    const archive = await Archive.open(file);
    await archive.hasEncryptedData();
    // true - yes
    // false - no
    // null - can not be determined

Extract encrypted archive

    const archive = await Archive.open(file);
    await archive.usePassword("password");
    let obj = await archive.extractFiles();

Create new archive

Note: pathname is optional in browser but required in NodeJS

    const archiveFile = await Archive.write({
        files: [
            { file: file, pathname: 'folder/file.zip' }
        ],
        outputFileName: "test.tar.gz",
        compression: ArchiveCompression.GZIP,
        format: ArchiveFormat.USTAR,
        passphrase: null,
    });

Use it in NodeJS

    import { Archive, ArchiveCompression, ArchiveFormat } from "libarchivejs/dist/libarchive-node.mjs";

    let buffer = fs.readFileSync("test/files/archives/README.md");
    let blob = new Blob([buffer]);

    const archiveFile = await Archive.write({
      files: [{ 
        file: blob,
        pathname: "README.md",
      }],
      outputFileName: "test.tar.gz",
      compression: ArchiveCompression.GZIP,
      format: ArchiveFormat.USTAR,
      passphrase: null,
    });

How it works

Libarchivejs is a port of the popular libarchive C library to WASM. Since WASM runs in the current thread, the library uses WebWorkers for heavy lifting. The ES Module (Archive class) is just a client for WebWorker. It's tiny and doesn't take up much space.

Only when you actually open archive file will the web worker be spawned and WASM module will be downloaded. Each Archive.open call corresponds to each WebWorker.

After calling an extractFiles worker, it will be terminated to free up memory. The client will still work with cached data.