facebookarchive / facebook-clang-plugins

Plugins to clang-analyzer and clang-frontend
MIT License
482 stars 85 forks source link

Github Actions workflow for CI/CD #20

Closed Amar1729 closed 4 years ago

Amar1729 commented 4 years ago

Now that Github Actions is a lot more mature, try to use it for building clang. This could allow facebook/infer to use the output clang artifacts for also testing builds of Infer with C plugins enabled (previously disabled because clang builds take far too long).

Old work in #18 - tried to use Travis, but clang takes >1hr to build so workflows time out, figured it wasn't worth the hassle.

This workflow is WIP for two reasons.

  1. How often to run?

I'm not sure about how often maintainers would like it to run. The build job ONLY builds clang using clang/setup.sh, and this repo's custom clang is changed infrequently. I set up logic in this workflow to only build the custom clang anytime files in the clang/ directory are added, modified, or removed. However, maintainers may prefer rebuilding on each commit (probably a waste) or building on a schedule e.g. using cron to build every week in addition to whenever clang/ files are changed.

  1. Release uploading

This workflow currently only uploads an artifact, but GitHub artifacts are apparently supposed to be used for storing files so that different workflows can access them (there is no API endpoint for getting "most recent artifact"). Instead, it would be better to upload to Releases (you could use create-release and upload-release-asset actions together) so that Infer builds could use a specific tagged version. That doesn't completely fit in your install process so I didn't want to make assumptions.

Uploading to Releases could also allow you to remove the "check if clang files changed" steps in this workflow - you could set it to build on only certain tags (e.g. clang-v*) and then tag a commit whenever you change clang stuff and want it to rebuild.

Amar1729 commented 4 years ago

Not sure if github actions will run CI if there's a PR introducing the file. You can see my testing runs here https://github.com/Amar1729/facebook-clang-plugins/actions?query=branch%3Aci%2Fgithub.

Amar1729 commented 4 years ago

I missed this, but apparently infer also uses GitHub Actions now also. Since that's the case, this work could be done as a workflow/job in the infer repo, or it could remain here.

If you want to start tagging commits that change clang, it might be better just to keep it here to avoid rebuilding clang on every infer commit unnecessarily.

jvillard commented 4 years ago

Hi @Amar1729, thanks for this PR, I think it's brilliant!

High-level note to start: I wanted to have something like this so this is great. The workflow I would like to have is that we have clang built from this repo (without the rest, i.e. the plugin and OCaml stuff), and "released" in some way, perhaps in the form of GitHub releases (as you mention, see also below). Then, on the infer side we grab the "release" and build the rest of the plugins + the OCaml stuff from within infer.

One potential issue there is that the plugin and clang need to be built on exactly the same toolchain. Historically we used to build clang+plugin "releases" to speed up building infer from source for that reason. Not sure if GitHub actions will force us in the same corner or not. The reason I would like to avoid building clang+plugin here is that there is a medium-term plan to move the plugins outside of this repo and into infer's (leaving only clang here), so it would be ideal if that split was already done that way.

  1. How often to run?

There's actually logic built into the repo for deciding if clang should be rebuilt or not. Maybe we can re-use that? More precisely, if you run ./clang/setup.sh --only-check-install, it will exit 0 if clang is up to date and with non-0 otherwise. This requires unpacking the previous build of clang first though, or at least the "installed.version" file it leaves in ./clang/ to track this.

High-level questions (apologies for my noob-ness in this matter):

  1. Where does the build artifact end up and can a script find out where is the artifact corresponding to a particular revision of the repo?
  2. Would it be ok to upload one artifact per plugin version, knowing that a lot of the time the artifact will be the same because clang hasn't changed? Basically to possibly facilitate 1, but maybe it's not needed.

I think you touch on this in your question about possibly making these GitHub releases. Maybe that's the way to do 1., i.e. link a tarball to a particular revision of this repo. I think I like this idea, when doing an infer build (so from the infer repo) we can then:

  1. download the latest release that is before the current version of the facebook-clang-plugins submodule (some scripting can probably get us there)
  2. unpack it (or if none, skip to 3)
  3. build infer. The current build system will already do the right thing of checking the version of the installation of clang against what it expects (using ./clang/setup.sh --only-check-install) and rebuild clang if it has changed since (well, or I guess it should abort in that case and wait for a release to materialise for the right version).

Ok I don't think I've answered your questions but let's start with that as this message is too long already.

Amar1729 commented 4 years ago

@jvillard glad to help!

One potential issue there is that the plugin and clang need to be built on exactly the same toolchain.

I'll look more into the runs-on: key and see what toolchains are used on each underlying OS. That might help inform how to specify the values for this and the infer builds.

  1. regarding ./clang/setup.sh --only-check-install

When I was setting this up, I naively thought that checking whether files had changing in the current commit would be easier, since it would not require caching the installed.version file the script generates. I'll redo that logic using the setup script, hopefully it will make the workflow cleaner.

High-level questions

The build artifact remains as a by-product of the build. In particular, see this run from when I was testing this. You can choose to delete the artifact[s] of a run at a later time.

*edit - I deleted this artifact after the PR was merged.

I think that right now, GitHub Actions artifacts are NOT meant to me easily find-able by revision.

Seems like there are recent workflows and artifacts endpoints that you can somewhat do this kind of filtering/get latest with, but it's not full-featured in actions/download-artifact yet.

I think you touch on this in your question about possibly making these GitHub releases.

That's at least what i was thinking originally, since (from the history of this repo) it didn't seem like clang changed too frequently to just tag a release for it every time you do. If we go this way, I think we could probably have this Action build clang on every commit for CI and not upload it as an artifact (1.5gb byproducts on every commit? crazy!), and only upload it as a release on appropriately tagged builds.

jvillard commented 4 years ago

I'll look more into the runs-on: key and see what toolchains are used on each underlying OS. That might help inform how to specify the values for this and the infer builds.

Maybe for now we can just test using the same distro on each side, eg "ubuntu-latest", and hope for the best. Just wanted to point out a potential pitfall.

checking whether files had changing in the current commit would be easier

Actually you may be right, feel free to ignore my suggestion.

[release stuff]

We can start by just building clang, then figure out something for how to release. It will probably need to be co-developed with the infer side anyway. I'm playing with making GitHub do infer releases on my infer fork at the moment (without clang, hopefully with this PR we can change that!).

Yeah the size of clang is insane. One point in favour of building everything on the infer side is that we can aggressively strip clang to reduce the size, though it's still almost 1GB compressed. We cannot strip all non-global symbols from the facebook-clang-plugins side because it will prevent us from compiling the plugin later on (if I remember correctly).

Amar1729 commented 4 years ago

Update for storing this as an artifact - apparently artifacts are auto-deleted after 90 days anyway. So that wouldn't work.

stripping clang

Doesn't setup.sh strip everything in bin/ and lib/ when it completes regardless of who's calling it? I'm pretty sure the logs for this clang build say everything was stripped.

release stuff

As an aside, there's https://github.com/eine/tip which could work for keeping a single pre-release of this repo updated with clang assets.

jvillard commented 4 years ago

Doesn't setup.sh strip everything in bin/ and lib/ when it completes regardless of who's calling it?

You're right. I don't remember what was the problem then. Carry on :)

tip

Pretty cool but I wonder if GitHub will be happy for us to store one (well, two with osx) clang tarball per commit. Or, if we can re-use one upload for multiple releases (as it rarely changes).

Amar1729 commented 4 years ago

So as far as an action to download an asset from a release of a different repository, I've really only found this:

https://github.com/dsaltares/fetch-gh-release-asset

To get it to work for infer's CI to download clang from here, I think the best option for releases moving forward would be to tag any commits in this repo that are used at any point as the submodule version in infer (e.g. v$(git rev-parse HEAD)).

Then, we could run the upload release step here on any appropriately-tagged commits. GitHub Actions would only create a new release for tagged commits, and the upload step could use a consistent naming scheme for the release: then infer's CI could use this repo's naming scheme for releases to download the appropriate release asset. (Unfortunately, this creates a weird "reverse dependency" for this repo on infer, where you'll have to be informed every time infer's submodule of facebook-clang-plugins is updated).

If you don't want to go with tagging "releases", I think you may have to deal with some heavy scripting in infer's CI to find the correct release to download (like you said earlier, it would involve finding all releases and comparing their SHAs to the current SHA of the facebook-clang-plugin's submodule).

I would assume the tag-to-release version of this is a "cleaner" overall approach, but I don't know if that will work within whatever your internal release/upload process is.

Amar1729 commented 4 years ago

Due to the difficulty in using actions/create-release to upload to a new release from a matrixed build, I'm using https://github.com/softprops/action-gh-release instead.

Current logic:

  1. if current commit is not tagged / tag doesn't start with v.:
    • get list of changed files
  2. if current commit is not tagged / tag doesn't start with v.:
    • check if any changed files are in clang/
  3. if clang/ was changed, or if the commit is tagged and the tag starts with v.:
    • build clang
  4. if the event was a push (not pull_request), and clang was built, and the commit's tag does NOT start with v.:
    • tag the current commit as v.{sha}
  5. if the event was a push, and clang was built:
    • create a release and upload the clang artifacts to the new release
    • release is linked to the tag v.{sha}
    • release is named "Release v.{sha}"

Pretty sure the logic's right. You can see the runs here and the current releases/tags on my fork that have been pushed from Actions here.

I can remove the the "Create Tag" step if you don't want tags automatically created. The benefit of it is that as the workflow stands, it'll build clang on any commits that change it, in addition to you being able to "manually" build clang by just pushing a tag v.{sha} anywhere.

Side note: on my page, the releases are not displayed in chronological order. I'm not really sure why this is happening, because the tags are associated with commits that have dates, and that's what releases should be ordered by ... GitHub still knows what the "latest" release is though, and I figure this won't negatively impact how infer's CI will eventually pull releases (probably based of the sha of the facebook-clang-plugins submodule).

Second side note: I've been "building" clang by just creating empty files inside the clang/install/ directory to reduce testing time, since it worked the couple times I tested it. I'll have to change the "Build Clang" step back to actually doing something before this merges.

jvillard commented 4 years ago

This looks great, thanks! Let's merge this. A few minor comments inlined.

jvillard commented 4 years ago

Can you squash the commits together or will GitHub automatically do that on merge? EDIT: pardon noobness, I can do that on merge, thanks @XVilka

XVilka commented 4 years ago

@jvillard GitHub can do that: image

jvillard commented 4 years ago

Thanks, @XVilka!

Amar1729 commented 4 years ago

Cool, I didn't realize github could do that from the website either! Squashed them anyway.

Amar1729 commented 4 years ago

@jvillard , now that this PR is merged, I think you can manually tag this commit of fb clang as v.<sha>. Manually tagging it should (?) retroactively force actions to build a release for that commit, and then infer's workflow can be updated to include clang.

jvillard commented 4 years ago

Tagged: https://github.com/facebook/facebook-clang-plugins/actions/runs/143786213

jvillard commented 4 years ago

Compressing with xz instead of gzip make the (linux) tarball 34% smaller: 313MB instead of 477MB. Maybe worth doing?

Amar1729 commented 4 years ago

good find. i'll take a look.