google-code-export / los-cocos

Automatically exported from code.google.com/p/los-cocos

More automated tests #175

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
1. automatically collect screen captures, tracebacks and other info for all 
scripts in trunk/test

2. a script to compare the info obtained in different sessions of step 1, say using 
the same tests but a different cocos library revision.
This could help to catch unintended side effects early.

3. if a pass / fail status is assigned by human inspection to each test at a 
specific revision, say r1173, and the collected info for that revision is 
kept, then for later revisions it can be automatically reported
   + which scripts still pass (because the results match the previously okayed ones)
   + which scripts to watch (because they were ok before but the results now differ)

For this stage it is understood that comparisons should be made between sessions 
on the same software-hardware testbed; otherwise different OpenGL drivers, pyglet 
changes, etc. would report spurious, unimportant differences.
At a later stage, better image comparison algorithms can be tried.

Original issue reported on code.google.com by ccanepacc@gmail.com on 30 Mar 2012 at 9:34

GoogleCodeExporter commented 9 years ago
Preliminary work:

1. <collector>.py runs a loop where it calls <driver.py> <test target>.py

2. <driver.py> does a special launch of the <test target> payload and collects info 
that it sends back to <collector> over the stderr pipe (a sketch of this 
launch-and-collect loop appears after this list)

3. the scene is ticked and a few snapshots are taken using director.AutotestClock, a 
derivative of the custom director.ScreenReaderClock used to record cocos 
scenes at a steady framerate
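
A minimal sketch of that collector-side loop, assuming the driver is a file named 
driver.py and the tests live under test/ (both names are placeholders, not 
necessarily the actual file names):

    import glob
    import subprocess
    import sys

    def run_one(test_script):
        # Launch the driver on a single test script; the driver sends back
        # tracebacks, snapshot info, etc. over its stderr pipe.
        proc = subprocess.Popen([sys.executable, "driver.py", test_script],
                                stderr=subprocess.PIPE)
        _, info = proc.communicate()
        return proc.returncode, info

    if __name__ == "__main__":
        for script in sorted(glob.glob("test/test_*.py")):
            returncode, info = run_one(script)
            print("%s exited with %d" % (script, returncode))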

status:

simple exact image comparison script done
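
A minimal sketch of what such an exact comparison can look like, assuming Pillow 
is installed (file names and structure are illustrative; the actual script may 
differ):

    import sys
    from PIL import Image

    def images_equal(path_a, path_b):
        # Exact comparison: same size, same mode, identical pixel data.
        a = Image.open(path_a)
        b = Image.open(path_b)
        return (a.size == b.size and a.mode == b.mode
                and a.tobytes() == b.tobytes())

    if __name__ == "__main__":
        same = images_equal(sys.argv[1], sys.argv[2])
        print("equal" if same else "different")
        sys.exit(0 if same else 1)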

A single test-like scene with a hardcoded capture mode done; captures are 
consistent between sessions. Automatic capture is much faster than visual 
inspection, although I don't expect TDD-like speed.

director.AutotestClock done; could be improved

the autotest-related parts in the scene script suggest the API between 
<driver> and the test script. 

sketchy collector - driver interaction: launch and communication in place; no work done on storage yet.

Further Work
  + A module to characterize the testbed will be necessary. I will open another issue for this.

  + A first trivial refactor of all test\test_*.py is needed for the automation (a minimal skeleton follows this list):

      * the scripts should have a main()
      * the normal run will be done by
        if __name__ == "__main__":
            main()
      * (the <driver> will run the script by importing the script and calling main)
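
A hypothetical skeleton of a refactored test script under those conventions (the 
placeholder scene is illustrative only; real scripts keep their original scene 
setup inside main()):

    from cocos.director import director
    from cocos.layer import ColorLayer
    from cocos.scene import Scene

    def main():
        director.init()
        # build the scene under test here; a blank ColorLayer is just a placeholder
        director.run(Scene(ColorLayer(0, 0, 0, 255)))

    if __name__ == "__main__":
        main()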

Original comment by ccanepacc@gmail.com on 30 Mar 2012 at 10:13


GoogleCodeExporter commented 9 years ago
Task:
"""
scripts test_*.py should have a main() and normal (user) run going through
if __name__ == '__main__':
    main()
"""
completed in rev 1173-1176

Original comment by ccanepacc@gmail.com on 31 Mar 2012 at 9:59

GoogleCodeExporter commented 9 years ago
r1177 - 1186:

More info, and code, at (cocos_checkout)/tools/autotest

Status:

    + clock subclasses that allow driving the scripts through precise timestamps, done.
    + clock variants for both pyglet 1.1.4release and 1.2dev, done.
    + a variant of subprocess.Popen to run scripts with a timeout, done (a sketch of the timeout idea follows this list).
    + specification, parsing and validation to describe which states are relevant to snapshot, mostly done (needs an upgrade to handle interactive scripts)
    + a snapshot runner that will exercise a number of scripts, following the desired snapshot plan and collecting snapshots, tracebacks and other failures, done.
    + define and write extractors for the info we need to store about each script, mostly done: scan, change detection, snapshot info and diagnostics are covered; human-generated info remains to be done
    + information handling support (low level) adequate to the tasks, mostly done; it will probably need some additional features
    + a high level, small API to select meaningful subsets of scripts, perform tasks over them and produce reports; partially done
    + all scripts in test got an initial refactor to cooperate with the snapshot_taker proxy
    + add testinfo (the plan to take snapshots) to each test script: 95 / 193 done, snapshots taken
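
A sketch of the timeout idea mentioned above, under the assumption that killing 
the child process and reporting a flag is enough (the actual helper in 
tools/autotest may differ):

    import subprocess
    import sys
    import time

    def run_with_timeout(cmd, timeout):
        # Run cmd; if it does not finish within `timeout` seconds, kill it.
        # Returns (returncode, timed_out).
        proc = subprocess.Popen(cmd)
        deadline = time.time() + timeout
        while proc.poll() is None:
            if time.time() > deadline:
                proc.kill()
                proc.wait()
                return proc.returncode, True
            time.sleep(0.1)
        return proc.returncode, False

    if __name__ == "__main__":
        rc, timed_out = run_with_timeout([sys.executable, sys.argv[1]], 10.0)
        print("timed out" if timed_out else "exit code %d" % rc)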

Working now (toward milestone 1+2)

    > adding testinfo to the remaining tests
    > expand the testinfo spec and support interactive tests using the keyboard and other interactions.
    > explore how to handle human assessment

Milestone 2
-----------

Capture complete reference info, including human pass-fail collection, for most 
test scripts (80% ?)

Original comment by ccanepacc@gmail.com on 20 May 2012 at 4:01

GoogleCodeExporter commented 9 years ago
milestone 2 reached:

At this moment (r1204) we have 199 test scripts, of which

   +   3 don't have testinfo and shouldn't; they are not appropriate for autotest
   + 196 have testinfo (i.e. commands to the test runner)
   + 188 can be marked as 'pass'
   +   1 can be marked as 'fail'
   +   7 can be marked as 'error'

The 'error' ones are related to some interaction between the pyglet main loop, 
the clock, snapshot taking and the stepper provided for autotest.
Those will be left to fix in the future; I want to move forward now.

The 'fail' is test_text_movement.py, which superficially seems to pass but 
really is a cheater: it moves by changing a component directly, when it should 
move by changing .position, as befits a CocosNode.
A new issue will be created for this; the fix can use autotest to ensure other 
scripts using text don't regress (menus, for example)
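
An illustrative contrast of the two ways of moving, not the actual test code (the 
exact attribute the script touches is an assumption):

    from cocos.director import director
    from cocos.text import Label

    director.init()   # a GL context is needed before creating labels
    label = Label("hello", position=(0, 0))

    # What the failing test apparently does: nudge the underlying text element
    # directly, bypassing the CocosNode transform:
    label.element.x += 10

    # What a CocosNode should do instead: move through .position
    x, y = label.position
    label.position = (x + 10, y)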

While working toward this milestone some artifacts were noticed:
    + sometimes an unexpected solid black snapshot
    + sometimes a snapshot will produce an improper .png file (IrfanView says it is not a graphic file at all)
    + sometimes a test will hang and hit the timeout, missing some snapshots.
    + at one time test_batch2.py produced a different output, maybe due to drawing groups in a different order

These anomalies were seen very rarely; anyhow, I wrote repeteability.py to get 
some stats about them (will post results in a follow-up; a sketch of the idea is below)
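
repeteability.py itself is in the repository; the following is only an 
illustration of the idea, with a hypothetical command line and snapshot path:

    import hashlib
    import subprocess
    import sys

    def digest(path):
        with open(path, "rb") as f:
            return hashlib.md5(f.read()).hexdigest()

    def repeat_and_compare(cmd, snapshot_path, runs=20):
        # Run the same snapshot session several times and count how many runs
        # produce a snapshot that differs from the first one.
        subprocess.call(cmd)
        reference = digest(snapshot_path)
        mismatches = 0
        for _ in range(runs - 1):
            subprocess.call(cmd)
            if digest(snapshot_path) != reference:
                mismatches += 1
        return mismatches

    if __name__ == "__main__":
        n = repeat_and_compare([sys.executable, sys.argv[1]], sys.argv[2])
        print("%d mismatching runs" % n)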

Original comment by ccanepacc@gmail.com on 30 Jun 2012 at 11:01

GoogleCodeExporter commented 9 years ago
The code in tools/autotest was a great help for ensuring behavior does not 
change between Python 2.x and 3.x.

It was also used in other changes.

Each use makes the software better, but it needs some reorganization and 
refactoring for clarity.

Original comment by ccanepacc@gmail.com on 10 Apr 2014 at 2:31