chshersh / tool-sync

🧰 Download pre-built binaries of all your favourite tools with a single command
https://crates.io/crates/tool-sync
Mozilla Public License 2.0
72 stars 17 forks source link

More elegant definition of hardcoded tools #84

Closed chshersh closed 2 years ago

chshersh commented 2 years ago

Currently, when a new tool inserted in a hardcoded database, it's written like this:

    tools.insert(
        "bat".into(),
        ToolInfo {
            owner: "sharkdp".to_string(),
            repo: "bat".to_string(),
            exe_name: "bat".to_string(),
            asset_name: AssetName {
                linux: Some("x86_64-unknown-linux-musl".to_string()),
                macos: Some("x86_64-apple-darwin".to_string()),
                windows: Some("x86_64-pc-windows-msvc".to_string()),
            },
            tag: ToolInfoTag::Latest,
        },
    );

This is quite verbose. I'd love to be able to use just plain strings here like this:

    tools.insert(
        "bat",
        ToolInfo {
            owner: "sharkdp",
            repo: "bat",
            exe_name: "bat",
            tag: ToolInfoTag::Latest,
            asset_name: AssetName {
                linux: Some("x86_64-unknown-linux-musl"),
                macos: Some("x86_64-apple-darwin"),
                windows: Some("x86_64-pc-windows-msvc"),
            },
        },
    );

I tried doing something like this in the past but the code became quite complicated due to the management of all the lifetimes in the type signatures. But maybe there's a good middle-ground, so I want to try again πŸ€”

MitchellBerend commented 2 years ago

I think it is possible to have these strings as &'static str so there is no need for the lifetime mess. This would mean other parts of the codebase would also have to change to accommodate this. Im not sure how serde deals with String vs &str.


// from this
pub struct ToolInfo {
    /// GitHub repository author
    pub owner: String,

    /// GitHub repository name
    pub repo: String,

    /// Executable name inside the .tar.gz or .zip archive
    pub exe_name: String,

    /// Version tag
    pub tag: ToolInfoTag,

    /// Asset name depending on the OS
    pub asset_name: AssetName,
}

// to this
pub struct ToolInfo {
    /// GitHub repository author
    pub owner: &'static str,

    /// GitHub repository name
    pub repo: &'static str,

    /// Executable name inside the .tar.gz or .zip archive
    pub exe_name: &'static str,

    /// Version tag
    pub tag: ToolInfoTag,

    /// Asset name depending on the OS
    pub asset_name: AssetName,
}```
chshersh commented 2 years ago

I'm not sure &'static str will work here. ToolInfo is created from ConfigAsset in sync/conflgure.rs so it can't be statically known:

https://github.com/chshersh/tool-sync/blob/52a38037e21bf539eb11dfec3dfa643b2b8a691c/src/sync/configure.rs#L30-L52

The ToolInfo definition probably should be changed to:

pub struct ToolInfo<'a>

And then build_db() could return:

-pub fn build_db() -> BTreeMap<String, ToolInfo> {
+pub fn build_db() -> BTreeMap<&'static str, ToolInfo<'static>> {

This may require performing this kind of <'a> refactoring to multiple places. Maybe we can be smart about it and not use &'a str everywhere. I tried this before and I didn't like the code πŸ’©


An alternative solution would be to implement a macro so we could define a tool like this (but I haven't written any macro before so it would require some investigation):


tools.insert(tool_info!(
    owner: "sharkdp",
    repo: "bat",
    exe_name: "bat",
    tag: ToolInfoTag::Latest,
    linux: "x86_64-unknown-linux-musl",
    macos: "x86_64-apple-darwin",
    windows: "x86_64-pc-windows-msvc",
));
MitchellBerend commented 2 years ago

A macro is a good idea actually I can look into that. Would it be nice to have the macro take a list of ToolInfo definitions or just a single one so it would have to be called multiple times?

chshersh commented 2 years ago

Let's keep the macro simple. It should take a single one and be called multiple times. I'm even okay with keeping tools.insert outside the macro just to keep code more understandable.

MitchellBerend commented 2 years ago

@chshersh I took a stab at implementing this macro (its a monster). I believe this syntax looks a lot like javascript with the anonymous objects in there. This can obviously be changed.

macro_rules! insert_tool_into {
    (
        $tools:expr,
        $tool_name:expr,
        {owner: $owner:expr,
        repo: $repo:expr,
        exe_name: $exe_name:expr,
        asset_name: {
            linux: $linux:expr,
            macos: $macos:expr,
            windows: $windows:expr,
        },
        tag: $tag:expr}
    ) => {
        $tools.insert(
            $tool_name.to_string(),
            ToolInfo {
                owner: $owner.to_string(),
                repo: $repo.to_string(),
                exe_name: $exe_name.to_string(),
                asset_name: AssetName {
                    linux: Some($linux.to_string()),
                    macos: Some($macos.to_string()),
                    windows: Some($windows.to_string()),
                },
                tag: $tag
            }
        )
    }
}

pub fn build_db() -> BTreeMap<String, ToolInfo> {
    let mut tools: BTreeMap<String, ToolInfo> = BTreeMap::new();
      insert_tool_into!(
              tools,
              "github",
              {
                  owner: "cli",
                  repo: "cli",
                  exe_name: "gh",
                  asset_name: {
                      linux: "linux_amd64.tar.gz",
                      macos: "macOS_amd64",
                      windows: "windows_amd64.zip",
                  },
                  tag: ToolInfoTag::Latest
              }
          );
chshersh commented 2 years ago

@MitchellBerend Great work on the macro! πŸ‘πŸ» It actually looks neat 😎

But after thinking about this issue for a while, I realised that we probably don't even need macro at all! Maybe this is what were trying to say in your previous comment. But I think, it's better to go with a simpler option.

We can create a new data type called StaticToolInfo that would store all the information as &'static str:

pub struct StaticToolInfo {
    /// Tool name
    pub name: &'static str,

    /// GitHub repository author
    pub owner: &'static str,

    /// GitHub repository name
    pub repo: &'static str,

    /// Executable name inside the .tar.gz or .zip archive
    pub exe_name: &'static str,

    /// Version tag
    pub tag: ToolInfoTag,

    pub linux: &'static str,
    pub macos: &'static str,
    pub windows: &'static str,
}

And then we can just write:

let mut db = BTreeMap::new();

let tools = [
    StaticToolInfo {
        name: "bat",
        owner: "sharkdp",
        repo: "bat",
        exe_name: "bat",
        tag: ToolInfoTag::Latest,
        linux: "x86_64-unknown-linux-musl",
        macos: "x86_64-apple-darwin",
        windows: "x86_64-pc-windows-msvc",
    },
    ...
];

for tool in tools {
    db.insert(tool.into())
}

db

So no macros needed πŸ™‚

I haven't learned macros yet so I'd like to avoid the feature I can't maintain. I'm also afraid they can affect compilation times (especially once we are going to add more tools). So I'd like to rely on them only if there's no other way. And I think we can get a way with a simple record here.

MitchellBerend commented 2 years ago

I haven't learned macros yet so I'd like to avoid the feature I can't maintain.

Fair assessment, this macro looks hard to maintain for me too. It looks very complicated and macro syntax in general can get very unrustlike imo.

Im guessing there has to be a From implementation. How are Nones handled here?

    pub linux: &'static str,
    pub macos: &'static str,
    pub windows: &'static str,

Edit: formatting

chshersh commented 2 years ago

Im guessing there has to be a From implementation.

Yes, that sounds good to me πŸ™‚

How are Nones handled here?

I imagine something like this:

const NOT_SUPPORTED: &'static str = "NOT_SUPPORTED";

fn to_supported_asset(asset_name: &str) -> Option<String> {
    if asset_name == NOT_SUPPORTED then { None } else { Some(asset_name.to_owned()) }
}

And in the code it'll be like:

    StaticToolInfo {
        name: "bat",
        owner: "sharkdp",
        repo: "bat",
        exe_name: "bat",
        tag: ToolInfoTag::Latest,
        linux: "x86_64-unknown-linux-musl",
        macos: "x86_64-apple-darwin",
        windows: NOT_SUPPORTED,
    }

It looks like but I don't expect this string to be inside any asset name. We can make it more random to minimise the chances of collision. But anyway, this is not a problem because this is for hardcoded tools. And we control that. So we can always change if some new tool somehow manages to match with this string.

MitchellBerend commented 2 years ago

We can make it more random to minimise the chances of collision.

Does github support unicode in binary names? We could just append a :-1: if it does not.

The constant seems like a good choice here, to me its very clear semantically.

chshersh commented 2 years ago

We could just append a πŸ‘ŽπŸ» if it does not.

Now we're talking πŸ˜… But I don't know the answer to this question πŸ€”

MitchellBerend commented 2 years ago

Since unicode is supported in git tags (I think) they can also show up in the release name right? Are there any banned characters or maybe some other affix?

chshersh commented 2 years ago

I think we shouldn't worry too much about banned characters. As I mentioned, this string is only used in our hardcoded database. So when we find a tool that doesn't work, we can adjust the string to not match anything new. So it doesn't even affect user's configuration.

I even believe that we can use the "NOT_SUPPORTED" string and probably never ever change it πŸ™‚ It's not too random but I still can't imagine an asset having this as a substring.

MitchellBerend commented 2 years ago

Implementing this still leaves some .into() calls in the insert. The standard lib says not to implement Into directly but in my mind it does make more sense to implement Into.

// example
    tools.insert(
        "github".into(),
        StaticToolInfo {
            owner: "cli",
            repo: "cli",
            exe_name: "gh",
            linux: "linux_amd64.tar.gz",
            macos: "macOS_amd64",
            windows: "windows_amd64.zip",
            tag: ToolInfoTag::Latest,
        }.into(),
    );

// implementation options
/// option 1
impl Into<ToolInfo> for StaticToolInfo {
    fn into(self) -> ToolInfo {
        ToolInfo {
            owner: self.owner.to_string(),
            repo: self.repo.to_string(),
            exe_name: self.exe_name.to_string(),
            tag: self.tag,
            asset_name: AssetName {
                linux: to_supported_asset(self.linux.to_string()),
                macos: to_supported_asset(self.macos.to_string()),
                windows: to_supported_asset(self.windows.to_string()),
            },
        }
    }
}

#[inline]
fn to_supported_asset(asset_name: String) -> Option<String> {
    if asset_name == NOT_SUPPORTED { None } else { Some(asset_name) }
}

/// option 2
impl From<StaticToolInfo> for ToolInfo {
    fn from(static_tool_info: StaticToolInfo) -> Self {
        Self {
            owner: static_tool_info.owner.to_string(),
            repo: static_tool_info.repo.to_string(),
            exe_name: static_tool_info.exe_name.to_string(),
            asset_name: AssetName {
                linux: from_supported_asset(static_tool_info.linux.to_string()),
                macos: from_supported_asset(static_tool_info.macos.to_string()),
                windows: from_supported_asset(static_tool_info.windows.to_string()),
            },
            tag: static_tool_info.tag,
        }
    }
}

#[inline]
fn from_supported_asset(asset_name: String) -> Option<String> {
    if asset_name == NOT_SUPPORTED { None } else { Some(asset_name) }
}
chshersh commented 2 years ago

@MitchellBerend It looks much nicer with this new approach! I like it πŸ™‚

Let's implement From to follow idiomatic Rust. I actually don't see any difference so From is fine for me πŸ‘πŸ»

Also, I think you can avoid "github".into() if you change the type of key in the BTreeMap database from String to &'static str