mrdoob / three.js

JavaScript 3D Library.
https://threejs.org/
MIT License

Automatic regression testing with CI #16941

Closed NicolasRannou closed 4 years ago

NicolasRannou commented 5 years ago
Description of the problem

Right now, it seems three.js is not testing the actual shader code in any automated way; it just relies on manually checking the examples and hoping that works out.

With the growing number of examples, that is quickly becoming too much.

Is there a plan to solidify this part of the testing pipeline, such as is done by https://github.com/GoogleWebComponents/model-viewer (who use threejs) or https://github.com/Kitware/vtk-js-datasets/tree/master/data/vtkjs ?

Thanks for the great library!

Mugen87 commented 5 years ago

Related #13017

At the end of the PR, the idea was raised by @mrdoob to build something based on Puppeteer.

gkjohnson commented 5 years ago

This would be fantastic -- I was hoping that #13017 would get merged but it looks like it got held up.

I've been looking into writing something for my own work to support CI and using Mapbox's pixelmatch package to perform the image diffs. If there could be any sort of plan or vision around this I'd be happy to help move it forward.
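For context on what the image diff involves, here is a toy version of the core idea behind pixelmatch, in plain JavaScript with no dependencies. It only does a per-channel tolerance check; the real library additionally uses a perceptual YIQ color metric and anti-aliasing detection, and the function name and tolerance here are illustrative:

```javascript
// Toy per-pixel diff in the spirit of pixelmatch: count pixels whose
// RGBA channels differ by more than a tolerance. (Real pixelmatch uses
// a perceptual color difference and anti-aliasing detection.)
function countDiffPixels(imgA, imgB, width, height, tolerance) {
  let diff = 0;
  for (let i = 0; i < width * height; i++) {
    const o = i * 4; // RGBA stride
    let maxDelta = 0;
    for (let c = 0; c < 4; c++) {
      maxDelta = Math.max(maxDelta, Math.abs(imgA[o + c] - imgB[o + c]));
    }
    if (maxDelta > tolerance) diff++;
  }
  return diff;
}

// Two 2x1 "images": the second pixel differs in the red channel.
const a = Uint8Array.from([255, 0, 0, 255,   0, 0, 0, 255]);
const b = Uint8Array.from([255, 0, 0, 255, 200, 0, 0, 255]);
console.log(countDiffPixels(a, b, 2, 1, 16)); // → 1
```

A CI harness would then fail the test when the diff count exceeds some small budget, to tolerate driver-level rendering differences.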

Mugen87 commented 5 years ago

If somebody wants to start on this topic, it would be great if the solution could automatically verify the official examples against an expected visual output. As a prerequisite, the approach should not require modifying the example files.

fernandojsg commented 5 years ago

I have pretty much all the pieces to start working on this, using a framework I have been building over the last few months for exactly this type of task: https://github.com/MozillaReality/webgfx-tests. It will help both with rendering regression testing (comparing against reference images) and with performance regression testing. Adding it to three.js is on my to-do list; hopefully I can get unblocked soon and start working on it :)

mikialex commented 5 years ago

I did this last year: Puppeteer plus pixel comparison for regression testing in a WebGL framework. But it went beyond just that.

Each example is an async function that declares which HTML template it uses. The function receives a bridge object: you can request a mount element from it, do preparation work, set up the renderer, and render some frames. Then you call the bridge's frame-compare assertion method, which asks Puppeteer to run a visual regression test in the CI environment, or you make ordinary assertions and log results. Afterwards a decent HTML report is generated, so everything is clear and readable. You can also attach a render loop and a config object to the bridge before the test function returns. A separate build script compiles everything into an example website; in that environment the bridge's assertion functions are silenced, and since a render loop was set up, users can view the interactive example on the example page. On top of that, a specially formatted code comment can be extracted by another script to generate an example-guide website for documentation purposes. CI regression tests, an interactive example site, and commented code that doubles as example documentation: three things from one codebase~

Back to Puppeteer: the hardest part is the asynchronous handoff between the Node context and the headless-browser context. You have to make sure the test is finished before resolving back to the Node context. I solved this by injecting functions with random UUID-based names onto window as the Promise's resolve/reject functions. When the test-script executor awaits the script, the browser side calls the window resolve function once it is done. The tricky part is that this requires building and evaluating JS code.

Another thing I did was collect coverage, although it wasn't especially useful. There is a config option that enables V8 native coverage collection, and puppeteer-to-istanbul produces a decent report.

Hope this helps.

gkjohnson commented 5 years ago

Are there any thoughts on how to handle regressions for examples that are time-dependent or rely on randomness? The PCSS example generates random balls, for example, and any of the examples that rely on animations will depend on run time of the application before a screenshot is taken.

Mugen87 commented 5 years ago

Good point! Utilizing the official examples was actually the first idea that came up, in order to avoid developing new test cases used exclusively for regression testing. A simple solution is to just exclude certain examples from the test suite and only use demos that behave deterministically. However, we might end up with poor test coverage that way...

fernandojsg commented 5 years ago

The framework I was working on takes care of that. It basically hooks every API it can, for example Math.random, performance.now() and Date.now(), always returning the same sequence based on a seed number that can be defined per example. The same goes for the canvas size: most examples set it to the full window size, so depending on how big your browser window or monitor is, the example renders at different sizes. I hook the canvas and the WebGL context to make sure each test always runs at the same size, no matter your device. I also added VR support, so you can set a static pose for the camera, or even replay a recorded movement. The same goes for keyboard and mouse input: you can play the example in record mode, it will record every input, and then replay each one on the exact frame where you introduced it.

Basically I tried to ensure that every run is deterministic; otherwise it would be very hard to compare the numbers. I'm planning to do a video/post about the framework so you can better understand how it works before integrating it here. When I was designing it I had three.js in mind, too, for the kinds of regressions we have run into in the past. Apart from stats on CPU time/idle, FPS, and so on, it also records the WebGL draw calls, textures, and programs, so problems such as the time frustum culling was broken would be quite easy to spot with it.
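One common way to make Math.random deterministic, in the spirit of the hooking described above, is to replace it with a seeded PRNG. This sketch uses mulberry32 as one well-known choice; webgfx-tests' actual generator and seeding API may differ:

```javascript
// Replace Math.random with a seeded PRNG (mulberry32) so every test
// run sees the same sequence. The seed could be configured per example.
function seedRandom(seed) {
  let s = seed >>> 0;
  Math.random = function () {
    s = (s + 0x6D2B79F5) >>> 0;
    let t = s;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

seedRandom(42);
const first = [Math.random(), Math.random()];
seedRandom(42); // re-seed: the sequence repeats exactly
const second = [Math.random(), Math.random()];
console.log(first[0] === second[0] && first[1] === second[1]); // → true
```

Any example that scatters objects with Math.random (like the PCSS demo mentioned earlier) then produces the same scene on every run.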

WestLangley commented 5 years ago

#16655

fernandojsg commented 5 years ago

@WestLangley exactly, pretty much something like that for Math.random. It's also very important to hook the timers, as they are often used to animate things (Math.sin(t), or pos.x += delta), so I hook them all and return a constant increment, no matter how long your frame actually took to render.
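Hooking the timers as described might look like the sketch below. The step size and function names are illustrative; a real harness would advance the clock per frame rather than per call, and would patch performance.now the same way inside the browser:

```javascript
// Fixed-step virtual clock: each call to now() advances time by a
// constant, so animation code driven by elapsed time (Math.sin(t),
// pos.x += delta) behaves the same regardless of real frame duration.
function makeFixedClock(stepMs) {
  let virtualTime = 0;
  return function now() {
    const t = virtualTime;
    virtualTime += stepMs;
    return t;
  };
}

// Hook one of the timers the examples rely on. In the browser you
// would also overwrite performance.now with the same function.
const now = makeFixedClock(16);
Date.now = now;

console.log(Date.now()); // → 0
console.log(Date.now()); // → 16
console.log(Date.now()); // → 32
```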

takahirox commented 5 years ago

Regarding randomness: as I posted in https://github.com/MozillaReality/webgfx-tests/issues/46, asynchronous functions can cause a problem, and I haven't come up with a solution yet.

For example, the following code (I see this style in Three.js examples) asynchronously loads an asset and then starts an animation.

var scene, camera, renderer, mixer;
var clock = new THREE.Clock();

init();
render();

function init() {
    scene = new THREE.Scene();
    camera = new THREE.PerspectiveCamera(...);
    renderer = new THREE.WebGLRenderer();
    var loader = new THREE.SomethingLoader();
    loader.load(url, function (asset) {
        scene.add(asset.model);
        mixer = new THREE.AnimationMixer(asset.model);
        mixer.clipAction(asset.animation).play();
    });
}

function render() {
    requestAnimationFrame(render);
    if (mixer) mixer.update(clock.getDelta());
    renderer.render(scene, camera);
}

Imagine that the testing framework compares the rendered pixels of a certain frame (for example, 500 frames after the application starts up) against the reference image. Those pixels can depend on the timing of asset load completion.

Any ideas on how to resolve this problem? Since most of the asynchronous functions in question are downloading files and images, perhaps we could monitor them and only start counting frames once all downloads have completed?

fernandojsg commented 5 years ago

Yep, I was working on an implementation based on something the Mozilla Games team was doing in their testing: basically what you said. You hook the fetch/XHR requests and wait until they are done, and you can configure time thresholds for when to start and how long to wait for these requests. The test can also generate config files from a previous execution, so it knows how many requests the example will make and can wait for them up front without having to "guess". The RAF loop simply waits until all the requests are ready, and it starts as soon as they are done.
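The fetch-hooking idea above can be sketched like this. It is a toy stand-in, not the webgfx-tests implementation: the real framework also hooks XHR and supports time thresholds and pre-recorded request counts, and the demo stubs fetch so no network is needed:

```javascript
// Wrap fetch so the harness can await a moment when no requests are in
// flight, and only then start the render loop / frame counting.
function instrumentFetch() {
  let inFlight = 0;
  const waiters = [];
  const originalFetch = globalThis.fetch;
  globalThis.fetch = function (...args) {
    inFlight++;
    return originalFetch(...args).finally(() => {
      inFlight--;
      if (inFlight === 0) waiters.splice(0).forEach(w => w());
    });
  };
  return function whenIdle() {
    return inFlight === 0
      ? Promise.resolve()
      : new Promise(resolve => waiters.push(resolve));
  };
}

// Demo with a stubbed fetch (no network): two fake requests, then idle.
globalThis.fetch = () => new Promise(r => setTimeout(() => r('ok'), 10));
const whenIdle = instrumentFetch();
fetch('a.json');
fetch('b.png');
whenIdle().then(() => console.log('all requests settled'));
```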

NicolasRannou commented 5 years ago

@fernandojsg is the framework you are working on publicly available or can we test it somehow? :) Looking forward to it!

jsulli commented 4 years ago

Any update on this @fernandojsg ? Would love to help get this online if there's any work that needs doing on it.

munrocket commented 4 years ago

@jsulli I am working on automatic testing here, and I also have an example with CircleCI and Travis configs in another repo. I hope to solve the exact-timing issue this weekend. If you have an example of parallel testing, that would be great, because we have around 150 examples.

munrocket commented 4 years ago

So we need to fix the time, fix the delta between two frames, and invoke exactly the same number of requestAnimationFrame calls.
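Those three fixes together can be sketched as a deterministic frame driver. This is illustrative only: a real harness would patch window.requestAnimationFrame before the example script runs, instead of passing a callback into a helper:

```javascript
// Deterministic requestAnimationFrame: a fixed timestamp delta and an
// exact frame count, so the Nth frame is identical on every run.
function runFixedFrames(frameCount, stepMs, render) {
  const queue = [];
  const requestAnimationFrame = cb => queue.push(cb);
  let time = 0;
  let frames = 0;
  // Seed the loop the way the examples do: tick() re-registers itself.
  const tick = () => {
    requestAnimationFrame(tick);
    render(time, frames);
  };
  requestAnimationFrame(tick);
  while (frames < frameCount) {
    queue.splice(0).forEach(cb => cb(time)); // run one synthetic frame
    time += stepMs;
    frames++;
  }
  return { frames, time };
}

const timestamps = [];
const result = runFixedFrames(3, 16, t => timestamps.push(t));
console.log(timestamps);    // → [ 0, 16, 32 ]
console.log(result.frames); // → 3
```

Because the loop never consults a real clock, a screenshot taken after frame N sees the exact same animation state on every machine.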

moraxy commented 4 years ago

@munrocket since you're using Puppeteer, have you tried HeadlessExperimental.beginFrame? It gives you more or less frame-exact rendering with user-controlled timings. I played with it in the past, but something Linux+GPU-related happened (I think) and I gave up. Maybe it's better supported now.

munrocket commented 4 years ago

The hard part is solved. Funny, but we also needed to reimplement Math.random in order to get identical screenshots. Right now I'm just waiting N milliseconds for all the resources on the page to load.