dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License

Add root build definition which builds the entire repo #98

Closed ViktorHofer closed 4 years ago

ViktorHofer commented 4 years ago

Ported from @jashook's doc: https://github.com/dotnet/consolidation/pull/30

Overview

The purpose of this comment is to give a design of the CI build of dotnet/runtime using the live builds of each folder under src. This replaces the old package restore dependency flow between what used to be dotnet/coreclr, dotnet/corefx and dotnet/core-setup.

It is important to note that this focuses on the CI build of the dotnet/runtime product. This distinction is required because the developer workflow for the repository is different and is being added here: https://github.com/dotnet/runtime/pull/55.

Goal

We want to have a single pipeline that builds a minimum matrix of coreclr, libraries and installer for all paths in the repository excluding docs/*.

The simple justification is that our setup already mostly requires this. CoreCLR changes have to build libraries to run tests. Libraries changes have to build CoreCLR in order to build and run tests once we're live/live. The only real delta here is whether or not to build installers. That can be done in parallel with testing, though, so it doesn't affect throughput.

The approach would be to condition the full platform matrix and the test runs on the changed paths. An early job would define variables (e.g. changedLibraries, changedCoreClr), and the build and test jobs would then be conditioned on those variables.
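A minimal sketch of such an early job in Azure Pipelines YAML could look like the following. Only the variable names changedCoreClr and changedLibraries come from the description above; the job names (`evaluate_paths`, `coreclr_tests`), the step name `detect`, the `origin/master` comparison base, the folder names, and the inline script are assumptions made for illustration.

```yaml
jobs:
# Early job: inspects the changed paths and publishes one output variable per subset.
- job: evaluate_paths              # hypothetical job name
  pool:
    vmImage: ubuntu-latest
  steps:
  - checkout: self
    fetchDepth: 0                  # full history so the diff below has a base to compare against
  - bash: |
      # Assumption: the change is compared against origin/master,
      # and the subsets live under src/coreclr and src/libraries.
      changed=$(git diff --name-only origin/master...HEAD)
      if echo "$changed" | grep -q '^src/coreclr/'; then
        echo "##vso[task.setvariable variable=changedCoreClr;isOutput=true]true"
      fi
      if echo "$changed" | grep -q '^src/libraries/'; then
        echo "##vso[task.setvariable variable=changedLibraries;isOutput=true]true"
      fi
    name: detect                   # a step name is required to reference its output variables
    displayName: Evaluate changed paths

# Downstream job: evaluated at run time, so it shows up as skipped when the variable is not 'true'.
- job: coreclr_tests               # hypothetical job name
  dependsOn: evaluate_paths
  condition: eq(dependencies.evaluate_paths.outputs['detect.changedCoreClr'], 'true')
  pool:
    vmImage: ubuntu-latest
  steps:
  - script: echo "build and run coreclr tests here"
```

Because the condition is a run-time expression, every conditioned job still appears in the build; it is only skipped once the early job has finished.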

The downside to this approach is that a bunch of jobs will be shown as skipped when the path doesn't apply. Long-term, we can request a feature that defines a variable with the changed paths at compile time, and then we can condition at compile time rather than at run time.

The advantage is that we now have fewer .yml entry points to maintain, and we avoid a lot of build duplication in CI when a change touches multiple subsets.

Building

runtime/src/coreclr and runtime/src/corefx should each have templates that accept the following parameters: platforms ([<OS>_<Arch>]), buildConfig ([debug, checked, release]), and jobTemplate.

Sample yml:

- template: templates/platform-matrix.yml
  parameters:
    jobTemplate: coreclr/build-job.yml
    buildConfig: debug
    platforms:
    - Windows_NT_x64
    - Windows_NT_x86

    # Job parameters is a grab bag of extra properties.
    jobParameters:
      testGroup: innerloop

This will allow us to re-use the platform fan-out template logic for the coreclr and libraries build steps.
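For illustration, the fan-out template itself might be sketched as below. This is not the actual templates/platform-matrix.yml; it only assumes the parameters named above (platforms, buildConfig, jobTemplate, jobParameters). The `platform` parameter passed to the job template is a hypothetical name.

```yaml
# templates/platform-matrix.yml (sketch, not the real file)
parameters:
  jobTemplate: ''        # path of the job template to instantiate, e.g. coreclr/build-job.yml
  buildConfig: ''        # debug, checked, or release
  platforms: []          # list of <OS>_<Arch> entries
  jobParameters: {}      # grab bag of extra properties forwarded unchanged

jobs:
# Expand the job template once per requested platform while the template is compiled.
- ${{ each platform in parameters.platforms }}:
  - template: ${{ parameters.jobTemplate }}
    parameters:
      buildConfig: ${{ parameters.buildConfig }}
      platform: ${{ platform }}                        # hypothetical parameter name
      ${{ insert }}: ${{ parameters.jobParameters }}   # splice the extra properties in
```

With the sample above, this would produce two instances of coreclr/build-job.yml, one for Windows_NT_x64 and one for Windows_NT_x86, each receiving buildConfig: debug and testGroup: innerloop. How the jobTemplate path resolves depends on where the matrix template lives in the repository.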

This is about to be done for coreclr and libraries (https://github.com/dotnet/runtime/pull/294 and https://github.com/dotnet/runtime/pull/274).

Pipeline mock up

  1. Official Build: build coreclr -> build libraries -> build installer -> sign -> publish
  2. changedCoreClr == true: build coreclr (full matrix including checked builds) -> build libraries -> build libraries tests, build coreclr managed tests, build coreclr native tests -> run tests
  3. changedLibraries == true: build coreclr -> build libraries -> build libraries tests -> run libraries tests
  4. changedInstaller == true: build coreclr -> build libraries -> build installer -> build installer tests -> run installer tests
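The ordering in the mock-up can be expressed with `dependsOn`. The sketch below shows only the dependency chain, with the installer built in parallel with the libraries tests as described in the Goal section. All job names and the placeholder `script` steps are hypothetical, and the path-based conditions are left out to keep the example short.

```yaml
pool:
  vmImage: ubuntu-latest             # assumption: any agent pool would do for this sketch

jobs:
- job: build_coreclr
  steps:
  - script: echo "build coreclr"

- job: build_libraries
  dependsOn: build_coreclr           # libraries build against the live coreclr build
  steps:
  - script: echo "build libraries"

# The next two jobs both depend only on build_libraries, so they can run in parallel.
- job: libraries_tests
  dependsOn: build_libraries
  steps:
  - script: echo "build and run libraries tests"

- job: build_installer
  dependsOn: build_libraries
  steps:
  - script: echo "build installer"

- job: installer_tests
  dependsOn: build_installer
  steps:
  - script: echo "build and run installer tests"
```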

cc @dotnet/runtime-infrastructure

trylek commented 4 years ago

Adding @jkoritzinsky, who is currently making progress on "live live build", as there seems to be quite a bit of overlap and we want to prevent duplication of effort.

safern commented 4 years ago

Assigning to me as I already started with this work.

Just wanted to make sure we're all on the same page, as we had very long discussions on Friday about what the plan is for this. (You guys were OOF.)

@jashook, @sbomer and I discussed the structure for this, and we think we shouldn't put effort into a pipeline that builds the entire repo just yet, as we should try to re-use as much as possible from the CI builds in order to parallelize. Also, @dagood was going to put up a pipeline that uses the local live live logic that @jkoritzinsky put together for one configuration in order to start the signing work; the signing work should then just be ported over to the "libraries" live live CI build as part of the official build.

As a result of that, @jashook put together this file: https://github.com/dotnet/consolidation/pull/30

ViktorHofer commented 4 years ago

Thanks for summarizing the status. Removing my assignment as work is already in progress here.

safern commented 4 years ago

I've updated the description to summarize the latest plan, which we discussed this morning.

Please feel free to update and comment with thoughts and opinions if you agree or disagree with something on that plan.

trylek commented 4 years ago

Sounds great to me. As to the nearest sequencing, my expectations are the following:

1) I'll rename the live-live pipeline from eng/pipelines/coreclr/pr.yml to a new name (my current plan is eng/pipelines/root.yml, but it's certainly up for further discussion) and I'll put back the pre-existing CoreCLR pr.yml. This is also important to verify that my changes haven't broken the "legacy" runtime-coreclr pipeline.
2) I'll do my best to address the rest of the PR feedback and merge the change ASAP, ideally before EOD today.
3) Once the change is in, @dagood can start working on wiring up the installer logic.
4) In parallel, we can work on finalizing the extent of test coverage and other details regarding the switch-over of innerloop testing in the individual subrepos, and of an overall outerloop testing of the entire runtime repo, to the live-live logic.

ViktorHofer commented 4 years ago

and an overall outerloop testing of the entire runtime repo

Hearing of that for the first time. Do we actually want that?

jashook commented 4 years ago

I think I missed this as well. There are interesting benefits to doing this. We will de-duplicate the builds on the outerloop jobs in the same way as the PR. It would also remove the possibility that a change to one pipeline silently breaks another pipeline's outerloop testing.

It does have drawbacks though. It would be harder for the CI council to pin down specific problems in outerloop testing. In addition, metrics for test failures and other pipeline statistics, which are heavily used currently, would become noisier.

Lastly, we will most likely keep runtime.yml as close to internal.yml as possible. I think this naturally expands to wanting to share as much logic with outerloop testing as possible.

I personally think that machine utilization is important enough to deal with noisier analytics. As for tracking to a 95% success rate for the pipeline, I think I am fine with tracking all four in the same pipeline. Realistically, we all have the same goal, and it makes sense to have some sense of shared responsibility.

ViktorHofer commented 4 years ago

I wanted to summarize what we just discussed in the infra standup:

We should have four layers of builds:

dagood commented 4 years ago

I have a PR open for signed/official builds: https://github.com/dotnet/runtime/pull/1016. Still doing a little more validation on publish.

dagood commented 4 years ago

https://github.com/dotnet/runtime/pull/1016 is merged, and the first build is running here: https://dev.azure.com/dnceng/internal/_build/results?buildId=461841&view=results. 🎉

Now monitoring for publish results, and we'll be working on pushing them downstream.

dagood commented 4 years ago

Succeeded, but didn't publish to dotnetcli, only the blob feed (https://dotnetfeed.blob.core.windows.net/dotnet-core/assets/core-setup/Runtime/5.0.0-alpha.1.19618.2/dotnet-runtime-5.0.0-alpha.1.19618.2-win-x64.exe). Looks like https://github.com/dotnet/core-setup/pull/8426 never got ported to master. At least it's clear what to do from that PR.

~On the other hand, even though publishing didn't include all the endpoints, I would have expected auto-update PRs anyway... will look into why I'm not seeing any tomorrow. (I haven't reviewed the list of subscriptions yet or confirmed the build actually got on the right channel in BAR--not at my work machine.)~ We have auto-PRs going, just need to make publishing complete.

/cc @mmitche

dagood commented 4 years ago

It also looks like CoreCLR packages weren't published, seeing https://github.com/dotnet/winforms/pull/2533.

dagood commented 4 years ago

Fixes for dotnetcli publish (https://github.com/dotnet/runtime/pull/1092) and CoreCLR publish (https://github.com/dotnet/runtime/pull/1090) are now both merged. https://dev.azure.com/dnceng/internal/_build/results?buildId=463477&view=results is running with the fixes. I'm monitoring and will post an update once it's done. ~3-4 hours seems to be the norm.

dagood commented 4 years ago

Looks like it worked, for example this got published:

https://dotnetcli.blob.core.windows.net/dotnet/Runtime/5.0.0-alpha.1.19620.3/dotnet-runtime-5.0.0-alpha.1.19620.3-win-x64.exe

and Microsoft.NET.Sdk.IL@5.0.0-alpha.1.19620.3 was pushed to the dotnet-core feed.

But auto-update PRs weren't updated with this new build, not sure why. Will look into that when I have a chance.

dagood commented 4 years ago

Auto-update PRs were updated, but are in a bad state.

Filed https://github.com/dotnet/runtime/issues/1129 for aspnet/extensions and dotnet/toolset. Some native crypto issue.

dotnet/winforms is waiting for an update from repo maintainers. https://github.com/dotnet/winforms/pull/2533

RussKie commented 4 years ago

dotnet/winforms is waiting for an update from repo maintainers. dotnet/winforms#2533

The PR is still failing to build:

##[error]src\Accessibility\src\Accessibility.ilproj(0,0): error NU1102: Unable to find package runtime.win-x64.microsoft.netcore.ilasm with version (>= 5.0.0)
  - Found 1164 version(s) in dotnet-core [ Nearest version: 5.0.0-alpha.1.19627.5 ]
  - Found 792 version(s) in dotnet-coreclr [ Nearest version: 5.0.0-alpha1.19564.1 ]
  - Found 9 version(s) in nuget.org [ Nearest version: 2.0.8 ]
  - Found 0 version(s) in arcade
  - Found 0 version(s) in dotnet-eng

dagood commented 4 years ago

See https://github.com/dotnet/winforms/pull/2533#issuecomment-570955502 about the WinForms issue.

dagood commented 4 years ago

Status:

safern commented 4 years ago

Done by: https://github.com/dotnet/runtime/pull/1473