chshersh / tool-sync

🧰 Download pre-built binaries of all your favourite tools with a single command
https://crates.io/crates/tool-sync
Mozilla Public License 2.0
69 stars 16 forks

Add checksum check #70

Open chshersh opened 1 year ago

chshersh commented 1 year ago

Adding a checksum check (SHA256 or MD5) would enable two features:

The current problem with supporting this feature is that GitHub doesn't provide this information in its API, and maintainers usually don't include a checksum either. One example of a package that does have a checksum is GitHub CLI:

So, first, we need to figure out how to get a reliable source of checksums.


The list of tools that support checksums:

MitchellBerend commented 1 year ago

Is there a reason the tool should not just check all the files in the release for a sha or md substring? There is a small discussion about requesting the feature here. I think we can just support this feature for repos that do publish a checksum.

I would add an optional field to tool.toml, like this:

# This configuration is automatically generated by tool-sync 0.2.0
# https://github.com/chshersh/tool-sync
#######################################
#
# Installation directory for all the tools:
store_directory = "$HOME/rust/tool-sync/bin"
#
# tool-sync provides native support for some of the tools without the need to
# configure them. Uncomment all the tools you want to install with a single
# 'tool sync' command:
#
# [bat]
# [difftastic]
# [exa]
# [fd]
# [github]
# [ripgrep]
# [tool-sync]
#
# You can configure the installation of any tool by specifying corresponding options:
#
[ripgrep]  # Name of the tool (new or one of the hardcoded to override default settings)
    owner     = "BurntSushi"  # GitHub repository owner
    repo      = "ripgrep"     # GitHub repository name
    exe_name  = "rg"          # Executable name inside the asset

    checksum  = "some_checksum_string"
#     Uncomment to download a specific version or tag.
#     Without this tag latest will be used
#     tag       = "13.0.0"

#     Asset name to download on linux OSes
    asset_name.linux = "musl"

#     Uncomment if you want to install on macOS as well
#     asset_name.macos = "apple-darwin"

#     Uncomment if you want to install on Windows as well
#     asset_name.windows = "x86_64-pc-windows-msvc"
chshersh commented 1 year ago

The checksum field is the way to go 👍🏻 I've checked some common tools. Not many of them provide checksums in their GitHub releases, but a few do. I've added all the tools I found to the issue body, with a description of how each one provides its checksum.

Indeed, we can implement this feature by supporting only tools that do provide the checksum. It's not a problem 🙂

Also, I've realised that checksums actually solve several problems:

  1. Avoid downloading the asset if we already have it installed with the specified checksum (requires persisting the checksum somewhere by tool).
  2. Avoid downloading the tool if the checksum provided by the repository doesn't match (prevents supply-chain attacks).
  3. Avoid installing the tool after downloading if the asset checksum doesn't match the specified one (requires tool-sync to output the checksum of the tool after a successful installation so that users can actually know it, and maybe even to persist it so that users can ask for the checksum of a tool).

All these use cases can be covered with a single checksum field, but their flows and UX are very different.
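
To make the difference between flows 1 and 3 concrete, here is a minimal sketch. It is written in Python purely for illustration and is not tool-sync's implementation; the function names are hypothetical, SHA-256 is assumed as the algorithm, and the `checksum` value is assumed to be the hex digest of the downloaded asset.

```python
import hashlib
from pathlib import Path
from typing import Optional


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def needs_download(persisted: Optional[str], expected: Optional[str]) -> bool:
    """Flow 1 (hypothetical): skip the download when the checksum persisted
    at the last installation equals the one specified in the config."""
    if persisted is None or expected is None:
        return True
    return persisted.lower() != expected.lower()


def verify_asset(asset: Path, expected: Optional[str]) -> bool:
    """Flow 3 (hypothetical): install a downloaded asset only if its digest
    matches the specified one. Without a specified checksum there is
    nothing to compare against, so the asset is accepted."""
    if expected is None:
        return True
    return sha256_of(asset) == expected.lower()
```

Flow 1 compares two stored strings and never touches the network, while flow 3 has to hash the downloaded file, which is why the same field leads to different UX in each case.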

Is there a reason the tool should not just check all the files in the release for a sha or md substring?

Then, someone creates a tool like md5-quine or sha-collisions and we're screwed 😅 Also, gh for example stores all the checksums in a single checksum.txt file and doesn't even mention sha or md in its name 😕

MitchellBerend commented 1 year ago

Okay, I suppose we should implement this feature then, since file integrity is already enough of a reason for me.


Then, someone creates a tool like md5-quine or sha-collisions and we're screwed 😅

We can solve this with a regex if we assume the pattern looks something like \.[a-z]+\d{1,3}$. This would match md5, sha224, sha384, etc. An approach like the one you proposed in #50 would also work here, I think.
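
As a quick check of what that pattern accepts, here is a small Python sketch. The asset names are made up for illustration, and the helper name is hypothetical.

```python
import re

# The pattern quoted above, anchored to the end of the asset name.
CHECKSUM_SUFFIX = re.compile(r"\.[a-z]+\d{1,3}$")


def looks_like_checksum_asset(name: str) -> bool:
    """Return True when the asset name ends in something like .md5 or .sha256."""
    return CHECKSUM_SUFFIX.search(name) is not None


# Hypothetical asset names, not taken from any real release.
names = [
    "tool-x86_64-linux.tar.gz.sha256",  # matches
    "tool.md5",                         # matches
    "md5-quine-linux.tar.gz",           # no match: "md5" is not the suffix
    "tool.tar.bz2",                     # matches, although it is an archive
    "checksums.txt",                    # no match
]
for name in names:
    print(name, looks_like_checksum_asset(name))
```

Anchoring on the suffix does rule out names such as md5-quine, but the pattern alone also accepts extensions like .bz2, so an explicit list of known algorithm names may be safer than a generic letters-plus-digits suffix.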

Also, gh for example stores all the checksums in a single checksum.txt file and doesn't even mention sha or md in its name

We could download these types of files first to see if the checksum of the current binary matches any of those, but this is probably out of scope for this feature.
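
For reference, a rough Python sketch of that idea, assuming the checksum file uses the common `sha256sum` layout of one `<hex digest>  <file name>` pair per line. That layout and the function names are assumptions for illustration, not something confirmed for every repository.

```python
from typing import Dict


def parse_checksum_file(text: str) -> Dict[str, str]:
    """Map file names to lowercase hex digests (assumed sha256sum layout)."""
    checksums: Dict[str, str] = {}
    for line in text.splitlines():
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        digest, filename = parts
        # sha256sum prefixes the name with '*' for files read in binary mode
        checksums[filename.strip().lstrip("*")] = digest.lower()
    return checksums


def asset_is_listed(text: str, asset_name: str, actual_digest: str) -> bool:
    """Return True when the file lists asset_name with exactly this digest."""
    return parse_checksum_file(text).get(asset_name) == actual_digest.lower()
```

Looking the asset up by name is stricter than checking whether the digest matches any entry; the looser variant described above would be `actual_digest.lower() in parse_checksum_file(text).values()`.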