hedronvision / bazel-compile-commands-extractor

Goal: Enable awesome tooling for Bazel users of the C language family.
Other
696 stars 112 forks source link

Added rudimentary support for bzlmod #146

Closed Attempt3035 closed 1 year ago

Attempt3035 commented 1 year ago

This commit is my rudimentary application of the Bazel docs to implement support for the bzlmod system. It should work alongside the existing system, but provide FAR easier access for people using bzlmod to add this repo's functionality to their project. It is as simple as adding the following to a MODULE.bazel file:

bazel_dep(name = "hedron_compile_commands")
git_override(
    module_name = "hedron_compile_commands",
    remote = "https://github.com/hedronvision/bazel-compile-commands-extractor",
    commit = "THIS-COMMIT-HASH-OR-NEWER",
    patches = [],
    patch_cmds = [],
    patch_strip = 0,
)

Resolves #54

cpsauer commented 1 year ago

Hey Luke! Thanks for all your energy contributing here and figuring this out.

I just want to check in: This is enough better than the WORKSPACE approach for you that it's worth bifurcating the install path, yeah? I hear that you're looking to switch, but many high quality-dependencies (plus bazel itself) still require the WORKSPACE, and this seems like about the same install difficulty.

If so, I must say, I like this a lot more, actually, than the usual central repository approach. This way we're not submitting--and waiting on-- a million PRs to them every time we're updating. People can still stay up to date with Renovate (I'm pretty sure it now does git override) and we're not brought in transitively by default. Win-win-win.

^ That is to say, I'm a bit torn (please convince me if you think I'm wrong)--but presuming we should support bzlmod at some point, this seems like the way to do it, and I really appreciate your research and work.

Assuming you're still convinced of it: I think I should be asking you to take one more pass to see just how good we can get it, and then we'll merge. As a non-exhaustive set of things that seemed worth considering on a quick read: it looks like some of the arguments above are optional and set to their defaults, and if so we should probably omit to keep things simple for people. We probably want some instructions in the main README ("if you'd really like to use bzlmod..."). And could I ask you to straighten me out on how these extensions interact with Bzlmod? I think we probably want the bazel_dep call to declare this as a dev_dependency to avoid it breaking things by being brought in transitively and skipping the override. Thoughts on git_override vs archive_override? (the WORKSPACE advice was always http_archive, but maybe they've fixed the speed issues.) Also, maybe we should be adding any future dependencies directly via bzlmod instead of the extension? What do you think?

Thanks again, Chris

Attempt3035 commented 1 year ago

Hey Chris!

Thanks for your message! This is my first PR, so definitely appreciate the feedback and pointers! To address the points you've made:

Is it worth it better than the WORKSPACE approach? In my opinion and from what I've read so far (I've only been learning Bazel recently, so there might be some points I'm missing!), the bzlmod approach is justifiable for the following:

Downsides

From my perspective and current understanding, this would be the way to integrate bzlmod, and yes, I'll be doing a tidy up pass on this PR to get it to what should be production ready. I think the following points are of important consideration:

Finally, it would be good to consider adding the project to the Bazel Central Registry once we have bzlmod support, it should require just a commit to here, after checking the guidelines and maybe some extra metadata files added to the repo?. That would quite literally bring adding this to a new project down to the single line: bazel_dep(name = "hedron_compile_commands", dev_dependency = True, version = "1.0.0") which is both amazingly tidy and also should be super easy to version manage, as well as with renovate. Thoughts on that goal?

One more thought: If someone was to use a git or archive override to import this into a project, and then had that project imported into another project, the override would be ignored according to the docs, which would result in a missing module error. As far as I understand, the only solution to that is to have this project in the Bazel Central Repository so that you don't need an override to access it. I'm not sure if this would actually end up an issue or not, but it seems like a limitation of bzlmod.

I've committed the slight code changes that should refine this, let me know your thoughts on all this and if you'd like me to update the readme and get it all ready to be merged, if you do think it's a good idea.

cpsauer commented 1 year ago

Luke, thanks again for your excellent work and scoping on all this. I really appreciate it. Let's go for it.

I'm right there with you and agree on almost everything, but with the following tweaks to my thinking. As always, please push back if you disagree. 1) I think this dev tool probably almost always should be a dev dependency. (That is bazel_dep(dev_dependency=True...) Why? It's used for only for development. The output can't be easily chained. etc. Action: bazel_dep(dev_dependency=True...) is probably how we should be recommending use to people. 2) I think that, for now, we should steer clear of the central repository. Why? Means we can develop faster and live-at-head. There are a couple more lines for people to copy in, but people are probably copy-pasting anyway and it does seem like Renovate will still work. (I searched their codebase and found a recognizer for git_override; but if you end up using the tool this way, I'd love it if you'd give it a try after merging and double check.) It means we don't have to wait on external PR acceptance to get fixes out to people. And currently it is, I think sadly, fixing and merging speed that's the bottleneck here, more than it is people copying in those lines. And because of (1), there's not a transitive problem. We can always revisit once bzlmod rollout is further along, the Google live-at-head repos have started using bzlmod (rather than e.g. starting and then abandoning), and I have experience with the central repository on some of the less-widely-used projects we share. Action: Take exactly this great git_override route you've found and proposed :)

Additionally, I love your clever workspace dips unification trick. I've tested that it doesn't crash back to 5.0, which should be old enough for us. Even if we aren't really using it just yet, let's keep it in case we need it.

And I know I'm flip flopping, but let's just recommend git_archive if you're finding it plenty fast enough. I tried to quickly check out the bazel implementation, but it's pretty layered, and git_archive should be easier for people because they don't need the second, hash-copying step. Better to recommend just one good way of doing things.

Thanks again for all your great work here. I'm hoping you'll do one final round of polish for us on the above, and then I'll do and merge, hopefully not offending you with any edits I make. I can either just merge and then you tell me what could be better async, or I can make the changes I think and then show you before merging, just lmk. Or if you're uncertain about anything I can just do. But not tonight--need to go to bed now.

Cheers to helping with the building of great software, Chris

Attempt3035 commented 1 year ago

Hey Chris! Thanks so much for the support on this!

For 1) , you're absolutely right, we should definitely be recommending as a dev dependency, I was testing using that but forgot to add it to the comments in those files as I was going to transpose it properly when I did the readme. Will make sure it's done like that!

For 2), that makes sense to steer clear of the repository, and if renovate is working with git override that should be perfect. git override is also neater than archive override as we don't have to strip paths or anything, so yes, I agree let's just recommend that way only in the readme. The strongest argument I had for the repository was that I was under the impression there'd be a transitive dependencies issue using the overrides - let me explain: In the docs, it states that overrides such as git override are ignored when the project is included as a dependency of another project. Consider the scenario where someone depends on another project that also uses this dependency. I first assumed that would lead to an error, as the git override ie

git_override(
    module_name = "hedron_compile_commands",
    remote = "git@github.com:Attempt3035/bazel-compile-commands-extractor.git",
    commit = "d008f00d54ba46acf879385d12ffdd7494bf1ba5",
)

would be ignored in the dependency project, leaving just the bazel_dep(name = "hedron_compile_commands", dev_dependency = True) (sidenote, we have excluded the version parameter as we are git overriding, it's redundant) That dependency on it's own would fail to resolve as it's not in the central registry. However, upon testing, I realised the obvious flaw in my logic was that as bzlmod loads all dependencies in as modules, calculates the required deps using MVS (as opposed to deps being loaded in through macros), which means that the dependency is recognised as the same, and the topmost MODULE.bazel that defines the override will use that override throughout. That means it works perfectly as we'd want it to. Now that I think of it, MVS version selection won't actually work correctly with this setup, as even though modules can be versioned in their MODULE.bazel file, the topmost git override will always win, rather than resolving the version requirements from all deps and getting the highest required one. In our case though, that probably doesn't matter at all as it's a dev dependency and users are only likely running it on the top level of the project, in which it would still generate build commands that involve other modules they want to use. So just an interesting sidenote, but yes, we should definitely continue our path with the git override to add.

Great to hear you liked the unification trick, and thank you so much for backwards testing it!

Definitely agree with sticking to just git override, I actually didn't get the archive hash working on bzlmod yet either, I think they changed the syntax a bit because sha256 = "hash" isn't accepted, and it gives me invalid hash with integrity = "hash", even though the verbose message says to add the former. Probably not fully documented or I missed it.

I'll definitely go through with another commit, probably tomorrow or as soon as I get a chance, then let me know if there's any other thoughts you've had, otherwise of course go for gold tweaking it before merge, as I think I said earlier, I'm still pretty new to big production environments and community code, so I'd love to get the feedback via any changes you make so I know what to aim for next time!

Thanks for all the help with this, I'm really glad I can contribute a little bit, this project is exactly what I needed for a bunch of my other works and I massively appreciate what you guys have done for the community!

Attempt3035 commented 1 year ago

Will also set up and try out renovate when I get some free time, will confirm with you that it all works with this setup!

cpsauer commented 1 year ago

Really appreciate your tracking down the edge case and sharing! If you still have the project handy, one more key thing to check would be to make sure there's not an error if the top level project doesn't use this tool, but one of the dependents does (as a dev_dependency). I'd hope they got that right, and if so, we're in the clear--with a very clever, better solution by you :)

[I'm assuming the sha/integrity hash stuff is just rough edges because bzlmod is still prerelease, but I appreciate your giving me a heads for the future. Looks like they're tracking at https://github.com/bazelbuild/bazel/issues/17803, which I've subscribed to. Linking because you might want to, too.]

If useful to you, here's an example renovate config from one of our other projects: https://github.com/hedronvision/bazel-cc-filesystem-backport/blob/main/renovate.json5

Forward looking: Sounds like a plan! You're already great and thorough and exploratory and just generally awesome to work with. So continue with confidence :) Thanks for being great!

Attempt3035 commented 1 year ago

Good point to check, I've just tested now, and as expected, having it as a dev dependency makes all the difference. If it's in the sub-project without being a dev dep (and the parent project doesn't use it), the override gets ignored but the dep isn't found and throws an error. Once dev dep is enabled, it just gets ignored entirely in the sub project which is what we'd want!

Thanks for linking that issue too, that's exactly what I found happened when I tried as well 😝

Thank you for linking to the renovate config too, I'll have a go at setting that up today! On that note, do you have any suggestions of tools I should look into for c++ development? I was looking for some new ways to do things (which led me here in the first place) and am now using vscode, Bazel + this, clangd, sonarlint, and will shortly learn renovate and get into github actions. Is there anything you'd recommend I look into that could be handy?

Attempt3035 commented 1 year ago

Also quick question for the readme, for a git url, is it better to use the https:// or the git@github.com as the remote? Both work, but I guess https would be more universal than ssh?

cpsauer commented 1 year ago

Yay! Thank you!

Tools: Depends a bit on what you're trying to build! Some things I like, but that aren't too C++ specific are:

I think https--I know GitHub has switched over to steering people towards that for a variety of reasons.

Attempt3035 commented 1 year ago

Thank you so much for those! I've committed the most recent changes we've discussed now, let me know if there's anything else you think I should do!

cpsauer commented 1 year ago

I gave it a once-over to see what polish I could add! Will tap out what went through my head and then merge. (Please let me know if you see anything else we can and should fix or any mistakes I made, either here or as additional PRs.)

Attempt3035 commented 1 year ago

Looks perfect! Just looked through your changes, makes sense and definitely cleared up documentation🤝👌

cpsauer commented 1 year ago

A good chunk of the commit was me improving older mistakes of mine that I happened to see while reading through :)

I moved the local development stuff to the ImplementationReadme, since it seemed like it fit even better there.

I generally made any improvements to the comment or structure that occurred while I was reading through. I'm sure they could be even better still, so please, don't hesitate to toss up changes when you see improvements.

And (in a bit of a change of heart) I moved the bzlmod instructions ahead of the WORKSPACE ones. Thanks for pathfinding us such to a great solution--and again, being a joy to work with.

Now let's get you on that contributor board!

cpsauer commented 1 year ago

Actually, could I ask you to look into one more thing for me: If you remove the version = "1.0.0", in MODULE.bazel, do things break on when used from another project?

If not, I think we should cut that, since we're living at head rather than using versioning. Obviously, if that'd break things, then let's leave it :) I imagine it breaks, which is why you added it--but it'd be awesome if we could also add a comment noting that.

Attempt3035 commented 1 year ago

That's really helpful to see your thought pattern with the edits, thanks! I do agree that the info you moved definitely belongs in the ImplementationReadme, but I can also say I completely didn't know about that doc until seeing your commit😅 Not sure if there's a good way to improve visibility, although I'm sure I would've found it eventually haha. I was also a bit surprised about seeing the order swapped on bzlmod vs workspace😆, but I do agree, anyone setting up an entirely new project and adding this probably will be considering getting right into bzlmod, so it fits well :)

As for the version tag, I definitely had a think and play around with this, and what I found was: It still works fine even in sub-projects AS LONG AS dev dep is true - It seems to ignore it straight away when that's on, which is good Yes, it does break in the case that dev dep is false, as the override doesn't get applied and it wants to look for it in the central registry - which also means it will fail even if the version number is there - as it's not in the registry. So it would fail either way. Of course, this could be an issue if we did want to add it to the registry, but, in that case, we'd also need to start doing some versioning, meaning there's not much point adding a version number in when we aren't versioning yet, because once we add to the registry older projects would start getting an old version that might not be in there at all, and still throw errors... So, provided we are 100% happy moving forward with the git override, we shouldn't need the version tag and should avoid it until such time as we add to the central repository, in which case to move to that method would require changing the dependency line of code with a relevant version anyway, (although, I'm not sure then if we'd end up with a situation where a sub-project didn't have a version number but the master project did, if that would throw an error with the MVS algorithm. Maybe we should leave a minimum version that's lower than any we'd use in there, like 0.0.1, so that if we added to the central repository, non-updated sub-projects would still have the dependency resolved to the version the master project uses??)

Thoughts?

Also, I just did a quick test and it looks like the MODULE.bazel actually works without the version number, so maybe we should remove it from there too, and just go for a no versioned releases methodology until such time as we need to? Alternatively, did you want the MODULE.bazel set to version 1.0.0 maybe and in the docs we add the import statement with 1.0.0, then we can increment if we ever add to central repository?🤷‍♀️🤷‍♀️

Edit: Sorry, I read your original post to be about the removal of the version number from the import statement bazel_dep(name = "hedron_compile_commands", dev_dependency = True)

cpsauer commented 1 year ago

Thanks for experimenting around--and sorry for being ambiguous. I meant the version in the module() call in the MODULE.bazel file being added by this commit.

Sounds like that works, so I'll pull it out and merge!

cpsauer commented 1 year ago

Yay! Luke, welcome to the contributors board and a huge congratulations on your first ever open source contribution. You did it in in style, scoping a new feature, coming up with a significantly better approach, and then bringing it to completion.

That's a pretty huge milestone. You should be proud of yourself! Thanks again for being great to work with.

Hopefully it's empowering and the first of many! Know that you're always welcome here and that I really appreciate it. Please don't be a stranger. Thanks for leaving things better than you found them!

Attempt3035 commented 1 year ago

Thanks so much Chris, It's been so great working with you, and I'm so glad I could make a contribution to this awesome project!

Cheers to open source, thank you, and happy developing! Catch you again soon for future improvements!