loot / loot.github.io

The website and meta issue tracker.
https://loot.github.io
GNU General Public License v3.0

Improve masterlist validation #43

Closed: Ortham closed this issue 9 years ago

Ortham commented 9 years ago

The automated masterlist validation system that's implemented using Travis CI is pretty useful, but it does suffer from some shortcomings:

  1. It's not very precise. Due to limitations in the validator it uses (Kwalify), it can't test:
    • the validity of structures which can be scalars or maps,
    • condition syntax,
    • other restrictions, e.g. no dirty info in regex entries, and only one exact matching entry per plugin.
  2. It's not versioned, so doesn't follow metadata format changes.
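
To illustrate the "scalar or map" case from the list above, here is a minimal Python sketch of a check that accepts either form. It assumes a file entry may be written as a bare filename string or as a map with a `name` key plus optional `display` and `condition` keys. That key set and the helper name `file_entry_errors` are assumptions made for the example, and this is not LOOT's own parsing code.

```python
# Hypothetical key set for the map form of a file entry (an assumption).
ALLOWED_FILE_KEYS = {"name", "display", "condition"}


def file_entry_errors(entry):
    """Return a list of problems for one file entry (empty if it looks valid)."""
    # Scalar form: a bare, non-empty filename string.
    if isinstance(entry, str):
        return [] if entry else ["empty filename"]

    # Map form: must carry a non-empty 'name' and no unexpected keys.
    if isinstance(entry, dict):
        errors = []
        name = entry.get("name")
        if not isinstance(name, str) or not name:
            errors.append("map form needs a non-empty 'name' string")
        unknown = sorted(str(key) for key in set(entry) - ALLOWED_FILE_KEYS)
        if unknown:
            errors.append("unknown keys: " + ", ".join(unknown))
        return errors

    # Anything else (number, list, None, ...) is neither accepted form.
    return ["expected a string or a map, got " + type(entry).__name__]
```

A schema-based validator has to express "either a string or this map" in its schema language, whereas a hand-written check like this one simply branches on the type.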

The result is that a commit can pass the validation, but then fail when used in LOOT itself. I'd like to bring the validation into line with LOOT (and have it complain about things LOOT won't, like multiple exact entries for the same plugin - LOOT just ignores all but the first).
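
The two stricter checks mentioned above (no dirty info in regex entries, and only one exact entry per plugin) could look roughly like the following Python sketch. It works on already-parsed plugin entries, represented here as plain dicts with `name` and optional `dirty` keys. The rule used to recognise a regex entry (the name contains any of `:\*?|`) and the case-insensitive name comparison are assumptions for illustration, as is the helper name `find_problems`.

```python
# Characters assumed to mark a plugin name as a regex entry (an assumption).
REGEX_CHARS = set(":\\*?|")


def is_regex_name(name):
    """Treat a name as a regex entry if it contains any marker character."""
    return any(ch in REGEX_CHARS for ch in name)


def find_problems(plugins):
    """Report dirty info in regex entries and duplicate exact entries."""
    problems = []
    first_seen = {}  # lowercased exact name -> index of its first entry

    for index, plugin in enumerate(plugins):
        name = plugin["name"]

        if is_regex_name(name):
            if "dirty" in plugin:
                problems.append(
                    "entry %d (%s): dirty info in a regex entry" % (index, name)
                )
            continue

        # Exact entries: compare names case-insensitively (assumed behaviour).
        key = name.lower()
        if key in first_seen:
            problems.append(
                "entry %d (%s): duplicates exact entry %d"
                % (index, name, first_seen[key])
            )
        else:
            first_seen[key] = index

    return problems
```

Reporting every later duplicate, rather than silently keeping the first entry, is what would make the validator stricter than LOOT in the way described.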

The obvious solution to the first point is to use LOOT's own metadata parsing to validate the masterlists, so that the validator acts exactly as LOOT does. While there may be existing validators other than Kwalify that can handle all the YAML syntax permutations, the only way to test condition syntax and so get a complete validation would be to roll our own.

A validator would be trivial to write, though versioning it would make it a bit complex to deploy. When a new version of LOOT is released, the corresponding validator executable could be hosted as one of the GitHub release files, and the masterlist Travis configs could be updated to download that executable. This would save having to rebuild unchanging code whenever the masterlists get updated.

For WIP versions of the metadata format, the corresponding masterlist branches could have their Travis configs updated to build the latest validator code, then switch to the release binary once it's available. For example, right now there are v0.8 branches because although v0.8 hasn't been released yet, the latest LOOT code now uses v0.8 of the masterlists. So I'd edit the branches' Travis configs to point to the latest validator code for now, then change them again when v0.8 is actually released.

I need to look into how best to manage building the latest validator code though, because I don't really want to be copy/pasting the build process into multiple repositories. Probably the best way to do this is to move the install section lines into a separate shell script, then download and call that script in the masterlist repositories. Everything else either must be in the Travis config file or is simple and unchanging enough for copy/pasting.

@Freso: You did the current validation implementation. Does the above sound sensible to you, or am I missing anything?

Ortham commented 9 years ago

I wrote the validator, and tested it with a valid masterlist (4 min 11 sec) and an invalid masterlist (4 min 45 sec).

I've still got to tidy things up and split the .travis.yml, but once I've done that and merged the changes into the dev branch, I'll give it a real test on the Skyrim masterlist repository.

Ortham commented 9 years ago

The Skyrim masterlist's v0.8 branch is now using the new validator. I'll probably switch the other masterlists' v0.8 branches over to the new validator tomorrow.

I'm having a bit of trouble coming up with a good way to get a metadata-validator binary online for Travis to use. Attaching it to a GitHub Release still seems to be the best way, but I'm building Windows binaries on Windows, while Travis needs a Linux binary. Travis does have the ability to deploy to GitHub Releases on tagged builds, but I'm not sure what form that deployment takes, so I'll have to do some experimenting. Otherwise, I'll have to build the binary in a virtual machine, which probably involves jumping through more hoops, and I'd prefer an automated procedure.

Ortham commented 9 years ago

After some testing of Travis' GitHub Releases deployment feature, I've found it fits the bill.

I've updated the loot/loot repository Travis config so that it will deploy a Linux metadata-validator binary archive to a GitHub release in that repository whenever a tagged build passes its tests (which a tagged build always should, being a release). This means that a new validator will automatically become available for each LOOT release, and the masterlist repositories just need to be updated to use it.

I set up a temporary repository as a fake copy of loot/loot, and committed to a new branch in the Skyrim masterlist repository, changing its Travis config to download the deployed binary instead of building its own. This worked, and the build ran in under a minute. It's still about 6x slower than the Kwalify-based validation, but I'm happy with that, and there's no way to speed it up further without massively slowing down loot/loot builds (the Boost libraries could be statically linked, but only if they were built from source).

I don't want to merge in the Skyrim masterlist repository change, since it depends on a temporary repository, so here's the Travis config ready for copy/pasting into the masterlist repositories:

sudo: false
language: ruby

# Need Boost 1.55, which isn't in the 12.04 repositories - install from a PPA.
addons:
  apt:
    sources:
      - boost-latest
      - ubuntu-toolchain-r-test
    packages:
      - libboost-log1.55-dev
      - libboost-date-time1.55-dev
      - libboost-thread1.55-dev
      - libboost-filesystem1.55-dev
      - libboost-locale1.55-dev
      - libboost-iostreams1.55-dev
      - g++-5

install: wget https://github.com/loot/loot/releases/download/GIT_TAG_NAME/metadata-validator.tar.gz -O - | tar -xz

script: ./metadata-validator $TRAVIS_BUILD_DIR/masterlist.yaml

notifications:
  irc:
    channels:
      - "chat.freenode.net#loot"
    use_notice: true
    skip_join: true

The GIT_TAG_NAME should be replaced by the actual Git tag name.
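
For anyone wanting to reproduce the install step outside Travis, here is a minimal Python sketch of the same substitution and download. The URL pattern is the one from the config above. The helper names `validator_url` and `fetch_validator` are made up for this example, and the tag `v0.8.0` in the usage note is hypothetical, not a confirmed release tag.

```python
import io
import tarfile
import urllib.request

# URL pattern taken from the install line of the Travis config.
RELEASE_URL = (
    "https://github.com/loot/loot/releases/download/"
    "{tag}/metadata-validator.tar.gz"
)


def validator_url(tag):
    """Build the download URL for a given Git tag name."""
    if not tag or tag == "GIT_TAG_NAME":
        # Guard against forgetting to substitute the placeholder.
        raise ValueError("replace GIT_TAG_NAME with an actual Git tag name")
    return RELEASE_URL.format(tag=tag)


def fetch_validator(tag, dest="."):
    """Download the archive for `tag` and unpack it into `dest`.

    Rough equivalent of `wget ... -O - | tar -xz`. Requires network access.
    """
    with urllib.request.urlopen(validator_url(tag)) as response:
        data = io.BytesIO(response.read())
    with tarfile.open(fileobj=data, mode="r:gz") as archive:
        archive.extractall(dest)
```

Calling `fetch_validator("v0.8.0")` would unpack the archive into the current directory, after which the validator can be run against `masterlist.yaml` as in the `script` line.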

Ortham commented 9 years ago

All masterlist repositories are now using the new validator on their v0.8 branch, so closing this.