GreyCat commented 7 years ago

I have yet another huge, but pretty helpful proposal. The proposal is to get ourselves a better CI.

Problems with current CI

There is one main huge problem: it's monolithic:

1. Check out everything
2. Build compiler
3. Build test .ksy files => target languages code
4. For every target language:
   4.1. (If it's compilable language) Build:
        4.1.1. Compiled target languages code from tests
        4.1.2. KS runtime
        4.1.3. Actual test specs (i.e. stuff with "assert equals") + test runner (sometimes)
   4.2. Run tests (doing assertions), generate some sort of report
5. Aggregate all the reports, generate CI report page
6. Upload and update CI report page on our website

This leads us to:

Steadily increasing build times. All stuff builds sequentially, every new language means more time to build. Originally, we had 3.5 minutes per CI iteration, now we're steadily approaching 8-9 minutes per build.
We're actually abusing Travis infrastructure a lot for that one. Generally, in Travis, a project is supposed to use just one programming language (they supply heavily modified build environments with all the stuff pre-installed), thus we do lots of hacks and intricate installs (which also take precious time) to do such stuff.
We're unable to do multiple checks properly. For example:
- Testing C++ code is a huge deal actually. To do it properly, we need to run at least against 3-4 major compilers / OS combinations (i.e. gcc / clang / MSVC + Linux / Windows / OS X). This is possible to do with Travis's environment matrix + AppVeyor's Windows builds, but it really needs to be modular for that.
- Testing ksc properly: we're currently not testing that Linux vs Windows builds of compiler produce the same results, not to mention that we don't actually test JVM vs JS builds of compiler. We don't test OS X at all. That, again, would need some interaction between Linux (Travis) vs Windows (AppVeyor) builds.
A change in tests requires us to rebuild everything starting from the compiler, although in practice it would be much faster to just get a pre-built compiler from the previous iteration and re-use it.

etc, etc. So, bottom line: monolithic = bad, modular = good.
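To make the build-time argument concrete, here is a minimal sketch comparing the two layouts. The stage names and durations are made-up illustrative assumptions, not measurements from the actual CI.

```python
# Hypothetical per-stage durations in minutes; these are NOT measured values.
COMPILER_MIN = 2.0
TRANSLATE_MIN = 1.0
LANGUAGE_MIN = {"cpp_stl": 1.5, "java": 1.0, "python": 0.5, "ruby": 0.5}


def monolithic_minutes(languages):
    """Everything runs sequentially, so per-language times add up."""
    return COMPILER_MIN + TRANSLATE_MIN + sum(languages.values())


def modular_minutes(languages):
    """Language jobs run in parallel, so only the slowest one counts."""
    return COMPILER_MIN + TRANSLATE_MIN + max(languages.values())


print(monolithic_minutes(LANGUAGE_MIN))
print(modular_minutes(LANGUAGE_MIN))
```

With these assumed numbers the sequential layout grows with every added language, while the parallel one grows only when a new language is slower than the current slowest job.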

KOLANICH commented 7 years ago

#63

GreyCat commented 7 years ago

@KOLANICH Sorry, I don't understand you. Right now I've just described current state of things, there are no "build packages" right now. Besides, GitHub "Releases" stuff is not based upon uploads anyway — they're generated automatically from repo tags and are source-only.

LogicAndTrick commented 7 years ago

FYI you can attach binaries to a release by editing a tag in the releases page.

KOLANICH commented 7 years ago

@KOLANICH Sorry, I don't understand you. Right now I've just described current state of things, there are no "build packages" right now.

I was a bit wrong. I have created another issue, #63, but it is closely related to this one, because you can build the modules separately (with every module having its own Travis build script), and then fetch the results from the Releases pages and reuse them. There are some problems with module dependencies, but they can be solved by putting the dependency description into a separate repo:

1. The Travis build script fetches the dependencies repo.
2. The Travis build script builds and tests its targets.
3. If there were no errors in the previous step, it builds the packages and uploads them.
4. It makes a dummy push into every repo that depends on the built repo, so that they are rebuilt and retested by Travis.

FYI you can attach binaries to a release by editing a tag in the releases page.

I propose to do it automatically on every successful Travis build.

GreyCat commented 7 years ago

because you can build the modules separately (with every module having own travis build script), and then fetch the results from Releases pages and reuse them.

Sorry, I don't quite understand almost everything you're mentioning in this paragraph. What are the "modules" and "dependencies" you're talking about? Why is that a problem in the first place?

FYI you can attach binaries to a release by editing a tag in the releases page. I propose to do it automatically on every successful Travis build.

This is pretty much pointless. Travis does mostly unstable builds and releases are for stable (tagged) builds. While it is possible to attach only "tagged" build files to releases, it is probably pointless anyway, as lots of release artifacts (.deb repo files, Ruby .gem, Python packages, etc), must be published in designated places, and we already do all that.

KOLANICH commented 7 years ago

Travis does mostly unstable builds and releases are for stable

GH Releases are for whatever the repo owner wants.

Why is that a problem in the first place?

The problem is stated in the first post of this issue. The solution is to separate the ks compiler from the ks runtimes, put them into separate repos, and build and test them separately.

What are the "modules" and "dependencies" you're talking about?

So a module is a separate git repo with a standalone part of ks, and dependencies describe what depends on what. A runtime library depends on the compiler: if the compiler changes its interface, the runtime library also needs to be changed. So every compiler change requires running the tests for every runtime library with the updated version of the compiler. We don't want to store this data in the compiler repo, so we should create a separate repo for the dependency description. When a runtime is updated, you only need to recheck that runtime. In this case you can take a prebuilt and tested compiler binary and use the runtime with it without retesting the compiler.
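The "what depends on what" description could be sketched roughly as follows. The repo names and the dictionary format are hypothetical, chosen only to illustrate the idea of deciding which repos need a retest after a change.

```python
from collections import deque

# Hypothetical dependency description: repo -> repos it depends on.
DEPENDS_ON = {
    "compiler": [],
    "runtime_java": ["compiler"],
    "runtime_python": ["compiler"],
}


def repos_to_retest(changed, depends_on):
    """Return the changed repo plus everything that depends on it, directly or not."""
    # Invert the mapping: repo -> repos that depend on it.
    dependents = {repo: [] for repo in depends_on}
    for repo, deps in depends_on.items():
        for dep in deps:
            dependents[dep].append(repo)

    # Breadth-first walk over the inverted graph.
    seen = {changed}
    queue = deque([changed])
    while queue:
        for nxt in dependents[queue.popleft()]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return sorted(seen)
```

Under these assumptions, a change to `compiler` selects all three repos, while a change to `runtime_java` selects only that runtime.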

ghost commented 7 years ago

I have found an interesting example - .travis.yml for the ANTLR project. Like Kaitai Struct, the ANTLR project has runtime libraries for different languages (C#, C++, Go, Java, JavaScript, Python 2 and 3, Swift). The ANTLR tool generates parsers and lexers, and then the generated parsers and lexers use the runtime libraries (the same principle as in Kaitai Struct). The .travis.yml calls scripts from the .travis directory. Maybe it will help somehow.

GreyCat commented 7 years ago

The time has (kind of) come: given that we'll need a pretty sophisticated system to test writes (for #27), I've decided to take a few first steps.

Initially I had this idea of the workflow:

CI flow graph

I've started from running the actual tests:

Our main process now publishes compiled target language tests in a distinct repo for that purpose
As soon as any commit lands there, Travis launches several jobs in parallel according to its .travis.yml. It looks like this.
So far, I've implemented support for 4 languages: C++, Java, Python, Ruby.

To add new languages, the following is needed:

Add ./prepare-$TARGET script that will, at very least, download language-specific KS runtime (probably to runtime/$TARGET), and, probably, install some dependencies
Add one or more of relevant environments to matrix section of .travis.yml
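A small consistency check for these two steps might look like the sketch below. It only assumes the `prepare-$TARGET` naming from the list above; the function name and the idea of passing the matrix targets in as a plain list are illustrative assumptions.

```python
import os


def missing_prepare_scripts(targets, repo_dir="."):
    """Return targets from the matrix that have no prepare-$TARGET script in repo_dir."""
    return [
        target
        for target in targets
        if not os.path.isfile(os.path.join(repo_dir, f"prepare-{target}"))
    ]
```

Running it with the targets listed in the matrix would flag any language that was added to .travis.yml without its prepare script.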

The output is not saved anywhere so far. The next step is, obviously, publishing test artifacts, i.e. JUnit XML-style or whatever reports they can provide.

Even this PoC Travis run uncovered a few problems with our current build:

Our C++ tests are not compatible with (older, probably) clang, due to usage of 0.25d for double literal
We broke Ruby 1.8 compatibility with way too liberal debug mode usage of 1.9+ features
We broke Java JDK6 compatibility when we switched to using java.nio for input

GreyCat commented 7 years ago

Tried to get Appveyor to build C++ using MSVC compiler: https://ci.appveyor.com/project/GreyCat/ci-targets

Wow, how naive I am. Right now it fails to run due to Boost (and Boost.Test) + zlib being unavailable. Is there a simple way to install boost / zlib on Windows?

@LogicAndTrick Probably running C# on several .NET platforms on Windows would be possible now — wanna take a look? I can add you to Appveyor account.

LogicAndTrick commented 7 years ago

Sure, I'll see if I can get something running when I have time.

GreyCat commented 7 years ago

@LogicAndTrick I've tried to add you by e-mail. Hopefully you'll receive some invitation or something?..

LogicAndTrick commented 7 years ago

Looks like a few versions of Boost are installed in the AppVeyor image: https://www.appveyor.com/docs/build-environment/#boost You might need to set up an environment variable to point to one of those paths. I don't know about zlib though.

koczkatamas commented 7 years ago

Judging from appveyor config files on the internet ( eg. https://github.com/libgd/libgd/blob/master/appveyor.yml ) we may have to install zlib manually.

(It's a bit weird though: a lot of projects use zlib and it does not change that much, so you don't have to keep N versions. Maybe it's worth asking the AppVeyor guys to put it into the base image?)

GreyCat commented 7 years ago

I guess zlib is not that big of an issue (it's very small anyway), and, besides, we might want to test it on Windows with zlib disabled.

LogicAndTrick commented 7 years ago

Alright I've been experimenting with this and my scripts are not very good but they kind of work. Is this enough to start you off or do you need more info? I'm not really confident with this stuff so I'm probably doing some things wrong:

appveyor.yml file - This seems like a better way to manage the AV script, similar to Travis
- The environment matrix seems very similar to Travis, it creates an isolated job for each language/platform
- I was too lazy to mess with the tests repo so it's missing a cd tests at the end of the install script
run-csharp-dotnet-framework - uses Microsoft's msbuild and csc tools
- Will need to be moved into the tests repo once you're happy with it
- Seems that MSYS running in AV doesn't have the msbuild executable on the path, so it's hard-coded to the install location. I think if it was a powershell or cmd script, it would know about those variables. Could always add the directory onto the path, I'm not confident enough with bash to know how to do that properly...
run-csharp-mono - uses Mono's xbuild and mcs tools
- Same problem with the mono tools referred to by path. Could be a problem if Mono is updated on AV, but that hasn't happened in 2 years so it doesn't seem to change very often
Example build results
Test results are generated in the same place, but in platform subfolders (test_out/csharp_mono/TestResult.xml and test_out/csharp_dotnetframework/TestResult.xml)
- I assume you will want to add something to publish these results (AppVeyor artifact maybe?)
AppVeyor has built-in support for NUnit but I haven't tried it. Not sure if you can get the AV test report and get the xml file without having to run the tests twice (once using AV and once manually to get xml)
C++ will run too (but doesn't work right now because of the missing cd tests in the install script)
I can't seem to find a standalone csc type tool with dotnet core, so I didn't include it. Surely there's a way to do partial compilation, I will need to investigate more.
Currently the mono script will not work on linux because of the hard-coded path names. Should be fixed if the mono tools are put on the path
Mono script reports an error on the truncate command but I think this is because mcs reports both absolute and relative paths - the truncate works on the absolute path and then the error comes from the relative path. (Not sure if this is a big deal or not)

GreyCat commented 7 years ago

@LogicAndTrick Thanks for all that investigation, it will certainly help!

appveyor.yml file - This seems like a better way to manage the AV script, similar to Travis

run-csharp-dotnet-framework - uses Microsoft's msbuild and csc tools run-csharp-mono - uses Mono's xbuild and mcs tools

Cool :) The only thing probably worth moving to prepare-* scripts is nuget restore ... stuff, as it is technically an initialization, not test run.

My idea is that run-* scripts should be perfectly usable on normal developers' boxes, not only on CI servers. If it needs any per-installation configuration, we can always do it in something like a config file. "Normal" (i.e. usable by a developer) installation will use one config and CI run will just use another one (for example, to reference specific paths in AV images).

C++ will run too (but doesn't work right now because of the missing cd tests in the install script)

I actually doubt that. Your version doesn't differ much from what I've launched, and it fails, being unable to find Boost and Boost.Test in CMake setup.

I assume you will want to add something to publish these results (AppVeyor artifact maybe?)

Yeah, it's the common next step for all CIs (both Travis and AppVeyor). I was thinking of two obvious choices:

publish it into yet another GitHub repo
publish it somewhere at BinTray

Then yet another Travis job should trigger, pick up these artifacts and aggregate them to update the CI page. Both these choices are actually pretty messy :( Registering yet another dozen repositories just for the sake of storing test results feels a lot like abuse of GitHub (and it's tons of work too). BinTray uses an extremely complex API, both to publish and retrieve files, which is a major turnoff for me.

Any other ideas?

LogicAndTrick commented 7 years ago

Could you use one repo with a branch for each target? A little messy, but it means you don't have to have a separate repo for each language. As for the reference paths, maybe some environment variables? e.g. MONO_INSTALL_DIRECTORY or something?

GreyCat commented 7 years ago

I've tried to do Bintray upload, and, after some experimenting, I'm tempted to say that it's mostly useless for these purposes: https://travis-ci.org/kaitai-io/ci_targets/jobs/220439314

It's very slow when uploading many files. Every file upload literally takes several seconds (and probably takes a heavy toll on API usage). Uploading multi-file testing artifacts, like Java's, for example, takes forever.
It is very inconvenient to download them all back for further usage. Again, it's not really good at handling multiple files, etc.
It has problems with files with spaces and that kind of minor stuff.

Could you use one repo with a branch for each target?

Yeah, I think that should work! I'll try it next.

As for the reference paths, maybe some environment variables?

Yeah, exactly :) Basically, that's what these config files are doing.

LogicAndTrick commented 7 years ago

I guess in this case these are variables that could change depending on the user's setup. Is that still okay to put in the config file, and expect the user to modify it if they need to? Right now the config variables are well-known (relative) paths, so they don't ever need to be changed.

I was thinking something like this (pseudocode):

# User modifies these if they want
MONO_INSTALL_LOCATION="/c/Program Files (x86)/Mono/bin"
MSBUILD_INSTALL_LOCATION="/c/Program Files (x86)/MSBuild/14.0/Bin"

# Append both directories to the search path (entries are colon-separated)
export PATH="$PATH:$MONO_INSTALL_LOCATION:$MSBUILD_INSTALL_LOCATION"
# Scripts reference xbuild/msbuild/etc from the path
# If they are on the path already they'll "just work" even if the install locations are different from above
It feels a bit flimsy, but is there another way to do it? I don't think AppVeyor environment variables (Windows) will flow through to the MSYS environment so I'm not sure if there's a way to do it via the AV config.
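The "use the path first, fall back to known install locations" idea can also be sketched outside of bash. This is only an illustration: the fallback directories are the ones quoted above, and the function name and `env` parameter are hypothetical.

```python
import os
import shutil

# Fallback locations taken from the paths quoted above; adjust per machine.
DEFAULT_TOOL_DIRS = [
    "/c/Program Files (x86)/Mono/bin",
    "/c/Program Files (x86)/MSBuild/14.0/Bin",
]


def find_tool(name, extra_dirs=DEFAULT_TOOL_DIRS, env=None):
    """Locate a tool on PATH first, then in the fallback directories."""
    env = os.environ if env is None else env
    # Drop empty PATH entries so the current directory is not searched by accident.
    path_dirs = [d for d in env.get("PATH", "").split(os.pathsep) if d]
    search_path = os.pathsep.join(path_dirs + list(extra_dirs))
    # shutil.which returns the full path of the executable, or None if not found.
    return shutil.which(name, path=search_path)
```

A tool that is already on the path is found there, so the hard-coded locations only matter on machines (such as the AV image) where it is not.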

GreyCat commented 6 years ago

I've got to do another approach on this issue, and I found out that, actually, there's a whole world of different CIs out there which support modular workflows/pipelines.

We have about a dozen or so repositories, and they all should be built, tested and deployed in a complex manner. This implies an interesting difference: it would be highly beneficial for us to have a CI configuration that is not stored in a repository (along with the code), akin to a .travis.yml file, but is instead set up externally.

When orchestrating a complex flow/pipeline, there are a couple of key questions:

Is it possible to do flow with multiple repositories?
Is it possible to do complex flow with steps like a -> {b c d} -> e (usually CI guys call it "fanning in" / "fanning out")?
Is it possible to run flow partially, triggering relevant parts of it from a commit into one of the affected repositories (i.e. fix C# runtime → rerun C# tests only → update CI summary)
How a flow step passes the data to other steps
- build artifacts
- and some signal to trigger further steps
Is it possible to host artifacts inside the CI system, or should we publish it to some external service (i.e. GitHub, npm, etc)
Would it be possible to process pull requests / do test builds somehow? Can we isolate deployment secrets enough?
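The fan-out / fan-in shape a -> {b c d} -> e mentioned above can be sketched with Python's standard `graphlib` module (Python 3.9+). The step names mirror the example; the `stages` helper is a hypothetical illustration, not a feature of any of the CI systems discussed.

```python
from graphlib import TopologicalSorter

# step -> set of steps it must wait for; mirrors a -> {b c d} -> e
FLOW = {"a": set(), "b": {"a"}, "c": {"a"}, "d": {"a"}, "e": {"b", "c", "d"}}


def stages(flow):
    """Group steps into waves; steps within one wave can run in parallel."""
    sorter = TopologicalSorter(flow)
    sorter.prepare()
    result = []
    while sorter.is_active():
        # Everything whose prerequisites are finished is ready at once.
        ready = sorted(sorter.get_ready())
        result.append(ready)
        sorter.done(*ready)
    return result


print(stages(FLOW))
```

Here `b`, `c` and `d` land in the same wave (fanning out), and `e` runs only after all three are done (fanning in).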

I'm currently checking out:

Self-hosted:

Things I've checked out that probably do not satisfy the criteria outlined above:

CircleCI — looks promising, but seems to be centered on "one repository" model :(
CodeFresh — one config per repo, very basic flows
SemaphoreCI — one config per repo, very basic flows (parallelism of tests)
GitLab CI — again, repo-centered, and offers basic flow features
Buildkite — repo-centered, novel "bring your own worker node" idea, but probably not the best fit for us

Ideally, I'd still like to stick to hosted infrastructure that someone else would support. But, if all else fails, I'm probably ok with hosting our own CI at some sort of generic server(s).

arekbulski commented 6 years ago

There is a drawback: you need to build the compiler and example schemas on each CI server instead of once. So each build gets shorter, but they add up to more in total.

GreyCat commented 6 years ago

Um, you've commented on some sort of earlier plans?

GreyCat commented 6 years ago

Ok, returning back to this, this time trying to complete it.

What's done already

First step(s) that build the compiler and use that to compile ksy → target languages
Second step is ci_targets repo, which gets new set of sources in target languages and kicks off many different languages compilation and validation in parallel, which, on completion, report their results to ci_artifacts.
Third step is ci_artifacts repo, which has many different branches (so they can be updated in parallel, independently of each other).

What's left to be done

Modify each of the test runners in second step to generate the same uniform format — "ci.json".
Start a new CI page, written in HTML+JS, which would:
- Download JSON results from raw github URLs like https://raw.githubusercontent.com/kaitai-io/ci_artifacts/java/oraclejdk7/test_out/java/ci.json
- Aggregate them in JS
- Format a nice table akin to http://kaitai.io/ci/
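The aggregation step could be sketched as below (in Python for brevity, although the page itself would be JS). The URL layout is generalized from the single example URL above, and the shape of ci.json used here is purely an assumption, since the format is not spelled out in this thread.

```python
RAW_BASE = "https://raw.githubusercontent.com/kaitai-io/ci_artifacts"


def ci_json_url(target, env):
    """Build the raw URL for one target/environment branch, e.g. java/oraclejdk7.

    Assumes every branch follows the layout of the one example URL.
    """
    return f"{RAW_BASE}/{target}/{env}/test_out/{target}/ci.json"


def aggregate(results_by_column):
    """Merge per-column results into rows keyed by test name.

    Assumed (hypothetical) ci.json shape: {"test_name": {"status": "passed"}, ...}
    """
    rows = {}
    for column, results in results_by_column.items():
        for test_name, info in results.items():
            rows.setdefault(test_name, {})[column] = info["status"]
    return rows
```

Each row of the resulting dictionary then corresponds to one line of the table, with one cell per target/environment column; tests missing from a column simply have no entry there.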

Anyone can lend a help with HTML+JS (Vue, JQuery, whatever you prefer) here?

GreyCat commented 5 years ago

Ok, a very rough new CI page, aggregating everything is implemented as http://kaitai.io/ci/ci.html — and we already support quite a few new & old combinations. Please take a look and tell me what you think of it.

Obviously, missing stuff is:

proper success percentage calculation
ability to filter columns, as it is quite obvious that there are more columns than a typical screen would allow us to fit — for now, one can open a JS Console (F12) and enter something like that to select columns:

app.__vue__.gridColumns = ["name", "cpp_stl_11/gcc4.8_linux", "cpp_stl_11/clang7.3_osx", "cpp_stl_11/clang3.5_linux", "ruby/2.3"]
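Both missing pieces are simple to express. The sketch below (Python for brevity) assumes each table row maps a column name to a status string and that "passed" marks a success; both are hypothetical assumptions about the data, not the page's actual model.

```python
def success_percentage(statuses):
    """Share of passed tests among the given statuses, in percent."""
    if not statuses:
        return 0.0
    passed = sum(1 for status in statuses if status == "passed")
    return 100.0 * passed / len(statuses)


def filter_columns(rows, wanted):
    """Keep only the wanted columns in every row, preserving the wanted order."""
    return {
        test_name: {col: cells[col] for col in wanted if col in cells}
        for test_name, cells in rows.items()
    }
```

`filter_columns` does the same job as assigning `gridColumns` in the console, and `success_percentage` would be applied once per column.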

If anyone proficient in (or willing to learn) Vue wants to help, I'd be most grateful ;)

GreyCat commented 5 years ago

Language conversion to new CI checklist for me to track:

[ ] construct2
[ ] construct3
[x] csharp
[x] cpp_stl
[x] go
[x] java
[x] javascript
[x] lua
[x] perl
[x] php
[x] python2
[x] python3
[x] ruby
[x] rust

GreyCat commented 5 years ago

csharp/mono5.18.0 and lua/5.3 were ported to the new CI system this morning. This also paves the way to do more well-rounded testing for C# with other systems (i.e. on Windows, .NET core, .NET standard, regular .NET, etc).

Unfortunately, we'll be most likely dropping Construct support eventually, as the project itself seems to be abandoned :(

Need to double-check what's going on with go, and, phew, this looks like it's almost done.

GreyCat commented 5 years ago

Ok, go has successfully joined the company. While cosmetically it's still clumsy, I guess we can consider this task done.

kaitai-io / kaitai_struct

Better CI: modular build system #62
