SerenityOS / serenity

The Serenity Operating System 🐞
https://serenityos.org
BSD 2-Clause "Simplified" License

Meta: Remove build time dependency on unzip and tar #9866

Open ADKaster opened 3 years ago

ADKaster commented 3 years ago

There are two ways I see to do this:

1) "${CMAKE_COMMAND}" -E tar https://cmake.org/cmake/help/latest/manual/cmake.1.html#run-a-command-line-tool
2) file(ARCHIVE_EXTRACT) https://cmake.org/cmake/help/latest/command/file.html?highlight=file#archive-extract

This affects the way we download the following data sets at configure time:

After these datasets are extracted using one of the above methods, we can remove at least unzip from the build requirements. tar is still currently required for the Toolchain builds, but there's a chance we could do those using CMake too :eyes:

friendlyanon commented 3 years ago

CMake ships with libarchive and liblzma, so the tar utility is quite powerful both for compressing and for extracting archives.

tuftedocelot commented 2 years ago

FWIW, I was trying out ARCHIVE_EXTRACT and it doesn't seem to like gzip files:

CMake Error: Problem with archive_read_open_file(): Unrecognized archive format.

This was tested on OpenBSD and Arch Linux.

cmake_minimum_required(VERSION 3.18)
project(Whatever)
set(PCI_IDS_GZ_URL https://pci-ids.ucw.cz/v2.2/pci.ids.gz)
set(PCI_IDS_GZ_PATH ${CMAKE_BINARY_DIR}/pci.ids.gz)
set(PCI_IDS_PATH ${CMAKE_BINARY_DIR}/pci.ids)
set(PCI_IDS_INSTALL_PATH ${CMAKE_INSTALL_DATAROOTDIR}/pci.ids)

file(DOWNLOAD ${PCI_IDS_GZ_URL} ${PCI_IDS_GZ_PATH} INACTIVITY_TIMEOUT 10)
file(ARCHIVE_EXTRACT INPUT ${PCI_IDS_GZ_PATH})

BertalanD commented 2 years ago

This situation is really strange, and is probably an oversight on CMake's part.

file(ARCHIVE_EXTRACT) expects the file to be in an archive format, i.e. one that can store multiple files (e.g. tar, zip or cpio). This means that while you can extract .tar.gz files, you can't extract a plain .gz file.
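
The container-versus-stream distinction can be illustrated outside CMake. The following is a minimal sketch using only Python's standard library; it is an analogy, not a reproduction of CMake's or libarchive's behavior. The sample contents and the helper name `is_archive` are made up for illustration.

```python
import gzip
import io
import tarfile

payload = b"8086  Intel Corporation\n"  # made-up sample contents

# A plain .gz: a single compressed byte stream with no file table.
plain_gz = gzip.compress(payload)

# A .tar.gz: a tar archive (which can hold many files), gzip-compressed.
tar_buf = io.BytesIO()
with tarfile.open(fileobj=tar_buf, mode="w:gz") as tar:
    info = tarfile.TarInfo(name="pci.ids")
    info.size = len(payload)
    tar.addfile(info, io.BytesIO(payload))
tar_gz = tar_buf.getvalue()


def is_archive(data: bytes) -> bool:
    """Hypothetical helper: True if the bytes open as a (possibly compressed) tar archive."""
    try:
        with tarfile.open(fileobj=io.BytesIO(data), mode="r:*"):
            return True
    except tarfile.ReadError:
        return False


print(is_archive(tar_gz))    # the archive reader accepts the .tar.gz
print(is_archive(plain_gz))  # but rejects the plain gzip stream
print(gzip.decompress(plain_gz) == payload)  # a stream decompressor handles it
```

In this sketch the archive reader fails on the plain .gz because, once decompressed, there is no archive header to parse; only a stream decompressor recovers the data.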

On the other hand, you can create plain .gz files with file(ARCHIVE_CREATE FORMAT raw COMPRESSION GZip). This totally works:

cmake_minimum_required(VERSION 3.18)
project(Compress)

file(ARCHIVE_CREATE PATHS test.txt FORMAT raw COMPRESSION GZip OUTPUT test.txt.gz)

We should probably file an issue on their GitLab.

diegoiast commented 2 years ago

I am unsure whether closing this issue is correct. We still depend on these external tools with CMake 3.16 (we conditionally test for this).

BenWiederhake commented 12 months ago

Maybe it would be nice to use our own tools? After all, we have our own tar implementation, so we could build that (without Unicode support), then use it to unpack the Unicode tarballs, and build the "final" tar executable.

(Sorry for necro-posting)

diegoiast commented 11 months ago

Can we build tar without Unicode support?

ADKaster commented 11 months ago

We could build our own archiving tools without Unicode support to extract the required files. However, I'm not sure the technical effort is worth the return. More host tools add complexity, slow down rebuilds, and hurt incremental builds. If we use our own tar to extract files, any change to e.g. LibCompress will also trigger a re-extract of Unicode and locale data, causing far more targets to be rebuilt than strictly necessary.

Right now, we've already bumped the Ladybird minimum required version past 3.16. Building the OS itself requires 3.25 so we can use our upstreamed CMake Platform files. The short putt here is to drop CMake 3.16 and go up to 3.23 or 3.25 globally, especially since Meta/serenity.sh will build CMake from source if the one in your PATH is too old.

I've also got the experimental gn build going, which defers extraction to a few Python files that use the built-in Python extractors.
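
For readers unfamiliar with that approach, here is a rough sketch of what extraction with Python's standard library alone might look like. It is not the actual script from the gn build; the function name `extract` and the dispatch order are assumptions made for illustration.

```python
import gzip
import shutil
import tarfile
import zipfile
from pathlib import Path


def extract(archive: Path, dest_dir: Path) -> None:
    """Hypothetical helper: unpack a .zip, tar-based, or plain .gz file into dest_dir."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    if zipfile.is_zipfile(archive):
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(dest_dir)
    elif tarfile.is_tarfile(archive):
        # Use the safer "data" extraction filter where this Python version has it.
        kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
        with tarfile.open(archive) as tf:
            tf.extractall(dest_dir, **kwargs)
    elif archive.suffix == ".gz":
        # Plain gzip stream: decompress to the same file name minus ".gz".
        with gzip.open(archive, "rb") as src, open(dest_dir / archive.stem, "wb") as dst:
            shutil.copyfileobj(src, dst)
    else:
        raise ValueError(f"unsupported archive format: {archive}")
```

A call such as `extract(Path("pci.ids.gz"), Path("out"))` would write `out/pci.ids`. The plain .gz case is handled separately because it is a compressed stream rather than an archive, so neither the zip nor the tar reader recognizes it.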