saphyr-rs / saphyr

A set of crates dedicated to parsing YAML.
54 stars 5 forks source link

Consider a monorepo? #10

Closed not-my-profile closed 1 month ago

not-my-profile commented 2 months ago

The code for saphyr is currently split across 3 repos (saphyr, saphyr-parser and saphyr-bench). Serde support is planned to be implemented in yet another crate.

As a contributor to Rust libraries I find monorepos to be easier to work with, you just clone them once and can contribute to everything. Cargo workspaces are great and also mean you don't need to manually set any paths to make your local crate sources use each other, if these crates were to live in a Cargo workspace the paths would already be set up for local development.

I guess the major downsides would be that

Ethiraric commented 1 month ago

I had considered it, and I think my approach is a mistake now. When people talked about monorepos when I was to start saphyr, my brain associated it with "one crate only" and forgot about Cargo workspaces. If I were to do it again, I'd do a monorepo.

For some reason, Github doesn't let me transfer issues properly from one repository to the other, so I just left things as they are. I'd ideally like issue / PR context to remain. I'm more lost in inaction rather than against monorepos.

I think I could @davvid to also get their opinion.

not-my-profile commented 1 month ago

What's the issue with transferring issues between repos? You can archive the repos that have been merged into the monorepo so closed PRs can still be referenced. Open PRs will need to be reopened ... there's git filter magic that can help with that. The commit histories of the other crates can also be preserved when merged into the monorepo (the commits will have new hashes).

davvid commented 1 month ago

In case it helps, I would personally be in favor of merging the repositories together and using cargo workspaces to handle the crates.

It's probably easier to do it now since saphyr-parser only has two open issues so we could do something simple and open corresponding issues on the combined "saphyr" repo referencing these two issues. We can then add a comment on the original saphyr-parser issue mentioning that further discussion has moved to the corresponding saphyr issue before locking the sub-repositories from updates.

We will lose some links/references to older (already closed) saphyr-parser issues, but if that simplifies our collective workflows then it seems like it'd be worth it over the long run.

If you'd like a hand setting this up just let me know and I can plan to spend some cycles on it so that you won't have to get distracted by this task. My plan would be to use something akin to what git subtree does under the hood so that we retain the original git history in each sub-crate as part of the merge.

There's plenty of real-world examples out there that we can use for inspiration. The spk project, for example, has a fairly large cargo workspace setup https://github.com/spkenv/spk/blob/main/Cargo.toml and I don't think ours will be quite as elaborate.

Ethiraric commented 1 month ago

Oh. Yeah. Every # reference is gonna get messed up for a while.

I think I got it right. My brain really wanted an octopus merge for each of the 3 repositories. It was much harder than I thought.

Aside from the new root Cargo.toml, readme and other files we'll have to recreate, does the monorepo branch look okay to you? I think that's what we want to achieve, and I think history is preserved (we now have 3 root commits).

davvid commented 1 month ago

Yup, that's pretty much what I'd expect as a merged starting point. From there it's just cargo tweaks to move stuff into a top-level workspace cargo.toml, unify .gitignores, etc.

davvid commented 1 month ago

I pushed a couple of commits into the monorepo branch to start merging the cargo.toml metadata together. Please pull that down into your local branch.

It doesn't build currently (cargo build --workspace) but I think you'll probably recognize the build errors since they seem related to recent refactoring.

davvid commented 1 month ago

Also, note that parser and bench both have run_bench and time_parse bins. I had to rename the parser ones via its Cargo.toml to be run_parser_bench and time_parser_parse to silence warnings from cargo.

Ethiraric commented 1 month ago

I merged my local changes to yours. I think we're good to merge?

Since this repo is still kind of a big WIP, I'll merge this right away and we'll address issues afterwards.