svanoort / pyresttest

Python Rest Testing
Apache License 2.0

Refactor test lifecycles to support parallel operations #31

Open svanoort opened 9 years ago

svanoort commented 9 years ago

As a pyresttest user I'd like to be able to parallelize the test execution (parallel HTTP calls).

TL;DR Summary of Analysis

  1. Worry about parallelizing network I/O first, then the rest, since it accounts for ~95% of runtime in most cases.
    • The remaining overhead is dominated by JSON parsing on extract/validate and curl object creation (caching plus curl.reset() will solve the latter)
  2. Resttest framework methods need to be refactored to isolate parts
    • (Re)Configure curl: Function to (re)generate Curl objects for given test (reusing existing if possible)
    • Execute curl: curl.perform -- multiplexed by CurlMulti or wrapper on same - gotcha: reading body/header.
    • Analyze curl: gather stats, return appropriate result type
    • Reduce results: Summarize benchmarks, add to pass/fail summaries, etc
    • Control flow: Break from loop if needed.
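The refactoring above can be sketched as four isolated functions plus a driver loop. This is a hypothetical sketch, not pyresttest's actual API: the names (`configure_curl`, `execute_curl`, `analyze_result`, `reduce_results`, `run_testset`) are illustrative, and the network call is stubbed out so only the control flow is shown.

```python
from dataclasses import dataclass

@dataclass
class TestResult:
    name: str
    passed: bool

def configure_curl(test, handle=None):
    """(Re)configure a curl handle for a test, reusing one if provided."""
    handle = handle if handle is not None else {}  # stand-in for pycurl.Curl()
    handle["configured_for"] = test["name"]
    return handle

def execute_curl(handle):
    """Stand-in for curl.perform(); returns (status_code, body)."""
    return 200, "ok"

def analyze_result(test, status, body):
    """Gather stats / run validators and build a result object."""
    return TestResult(test["name"], passed=(status == test.get("expected_status", 200)))

def reduce_results(results):
    """Summarize pass/fail counts across all results."""
    passed = sum(1 for r in results if r.passed)
    return {"total": len(results), "passed": passed, "failed": len(results) - passed}

def run_testset(tests, fail_fast=False):
    results, handle = [], None
    for test in tests:
        handle = configure_curl(test, handle)   # reuse the handle across tests
        status, body = execute_curl(handle)
        results.append(analyze_result(test, status, body))
        if fail_fast and not results[-1].passed:  # control flow: break if needed
            break
    return reduce_results(results)
```

Once the stages are isolated like this, the execute step can be swapped for a CurlMulti batch without touching setup or analysis.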

Need to start working out code for the above.

Precursor: using curl reset when reusing curl handles.

Look at using CurlMulti, see example: https://github.com/Lispython/pycurl/blob/master/examples/retriever-multi.py
See also: https://github.com/tornadoweb/tornado/blob/master/tornado/curl_httpclient.py

PyCurl Multi Docs: http://pycurl.sourceforge.net/doc/curlmultiobject.html#curlmultiobject
LibCurl: http://curl.haxx.se/libcurl/c/libcurl-multi.html

Using multiprocessing pools for process-parallel execution: http://stackoverflow.com/questions/3842237/parallel-processing-in-python.

Concurrency should be managed at the testset level; reasoning below.

Config syntax:


---
- config:
    concurrency: all   # maximum: one thread per test run
    concurrency: 1     # single thread, always serial
    concurrency: none  # another way to ensure serial execution
    concurrency: -1    # also serial, as is anything <= 1
    concurrency: 4     # up to 4 requests at once
    concurrency: 16    # up to 16 requests at once, if that many tests exist
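Normalizing those values could look like the following. This is a minimal sketch; the helper name `effective_concurrency` is hypothetical, but the semantics follow the proposed syntax above.

```python
def effective_concurrency(setting, num_tests):
    """Map a proposed 'concurrency' config value to a worker count.

    'all' -> one worker per test; None/'none' or any integer <= 1 -> serial (1);
    an integer N > 1 -> min(N, num_tests).
    """
    if setting == "all":
        return max(1, num_tests)
    if setting is None or setting == "none":
        return 1
    n = int(setting)
    return 1 if n <= 1 else min(n, num_tests)
```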

Implementation: all initial parsing runs first, then we decide how to execute (serial or concurrent). For concurrent execution, I see 4 levels of concurrency, each offering greater resource use and performance at the cost of greater complexity:

  1. Serial test setup/analysis, parallel network requests
    • Generate tests, then execute batches in parallel with CurlMulti and analyze results serially before next batch.
    • Execution is done using map(...) calls on functions, very clean.
    • Pros:
      • Fairly easy to do (?) with CurlMulti
      • Provides fixed batch execution methods
      • Avoids Process management
      • No worries about synchronization issues with tests themselves
    • Cons:
      • Network I/O idles while each batch is analyzed serially (batch barrier)
  2. Parallel execution, process does setup/execute/analyze and returns result
    • Each process does a full test/benchmark execution (setup, network call, return)
    • Basically do results = pool.map(run_test, tests)
    • Multiprocessing makes this easy, minimal code changes vs. current
    • Pros:
      • Easy, uses existing methods most effectively
      • Gives a more consistent concurrent load for load testing
      • Fully uses multiple cores
    • Cons:
      • Synchronization issues with generators, etc
      • Error handling & logging become a bit broken
      • Requires ability to gather all results at once before processing
      • Process management and similar headaches.
      • May not use networking as efficiently as CurlMulti does
      • Bottlenecked by serial processing to some extent
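A level 2 sketch, where each worker runs the full setup/execute/analyze cycle and returns a finished result. To keep the example runnable anywhere, it uses the thread-backed `multiprocessing.dummy.Pool` (same `map` API); a real implementation would likely use `multiprocessing.Pool` for true process parallelism, with the pickling and synchronization issues noted above. The `run_test` body is a stub.

```python
from multiprocessing.dummy import Pool  # thread-backed Pool, same API as multiprocessing.Pool

def run_test(test):
    # setup -> network call -> analyze, all inside the worker (stubbed here)
    status = 200
    return {"name": test["name"], "passed": status == test.get("expected_status", 200)}

def run_parallel(tests, workers=4):
    with Pool(workers) as pool:
        results = pool.map(run_test, tests)   # gather all results at once
    return results
```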
  3. Controller process, in parallel with a concurrent network I/O process
    • The controller process generates tests and feeds them to a concurrent network-request process, which executes them continuously and returns results asynchronously; the main thread analyzes the results as they arrive.
    • Network I/O uses CurlMulti, single thread does processing
    • Pros:
      • Gives a more consistent concurrent load for load testing
      • Network side fully decoupled from test overheads
    • Cons:
      • More complex than above two (combines them)
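The level 3 producer/consumer shape can be sketched with queues. Here `threading` and `queue` stand in for separate processes (and the worker body stands in for a CurlMulti loop) so the sketch is self-contained; the structure is the point, not the primitives.

```python
import queue
import threading

def network_worker(inbox, outbox):
    """Dedicated network loop: pull tests, execute, push results (stubbed I/O)."""
    while True:
        test = inbox.get()
        if test is None:                              # sentinel: shut down
            break
        outbox.put({"name": test, "status": 200})     # stand-in for CurlMulti execution

def controller(tests):
    inbox, outbox = queue.Queue(), queue.Queue()
    worker = threading.Thread(target=network_worker, args=(inbox, outbox))
    worker.start()
    for t in tests:                                   # generate and feed tests
        inbox.put(t)
    inbox.put(None)
    results = [outbox.get() for _ in tests]           # analyze results as they return
    worker.join()
    return results
```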
  4. Controller process, parallel create/analyze processes, parallel network I/O process
    • One controller thread for orchestration which mostly does setup/cleanup
    • Tests are generated and analyzed by process pool
    • A network I/O execution pool receives curl objects to execute and runs callbacks when they complete so they can be processed.
    • Pros:
      • Very efficient
      • Maximum resource use
      • Allows tuning network and CPU bound concurrency separately
      • Very amenable to networked execution, just talk to controller
    • Cons:
      • Very complex
      • Needs to be able to continuously feed in work to analysis process pool (orchestrated by controller)
      • Needs

Analysis:

Test overhead:

Decision Point:

svanoort commented 8 years ago

Simple code example: https://fragmentsofcode.wordpress.com/2011/01/22/pycurl-curlmulti-example/

No need to worry overmuch about true full parallelism. A single executor thread with CurlMulti (concurrent networking) + batch analysis --> final result.

Use a work queue and worker processes if needed to handle/interpret responses (combined again by main thread into results).

AndrewFarley commented 6 years ago

I know this issue is now years old, but I'd like to throw a +1 into the ring. With the current trend in APIs being microservices, there's a growing need to parallelize testing of those APIs. Currently, a traditional "setup, do test, tear down" approach to testing larger microservice APIs can take a (relatively) long time. PyRestTest helps in that regard by allowing objects/artifacts/values to pass from one test to the next, which simplifies shared dependencies, but it (currently) runs tests non-concurrently. With a bit of effort, each test could be pre-loaded and evaluated for its dependencies to build a dependency graph, then the run could be optimized by performing the tests with the most dependents first, so that dependent tests fire as soon as their prerequisites complete. This would allow full parallelization of the tests, and in the microservice world could allow a full barrage of tests to complete in seconds, not minutes or hours.
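One way to sketch that dependency-graph idea: pre-load each test's dependencies, then repeatedly collect every test whose dependencies have completed into a "wave" that could run in parallel. The helper name `schedule_waves` is hypothetical, and a real implementation would execute each wave concurrently rather than just grouping names.

```python
def schedule_waves(deps):
    """deps maps test name -> set of test names it depends on.

    Returns tests grouped into waves; every test in a wave could run in
    parallel because all of its dependencies finished in earlier waves.
    """
    done, waves = set(), []
    remaining = dict(deps)
    while remaining:
        ready = [t for t, d in remaining.items() if d <= done]
        if not ready:
            raise ValueError("dependency cycle among: %s" % sorted(remaining))
        waves.append(sorted(ready))
        done.update(ready)
        for t in ready:
            del remaining[t]
    return waves
```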

Imagine the velocity gained by this approach. I've spent a few days googling for an API testing framework (or a testing framework in general) that can do this, and I haven't found any. But this framework comes awfully close, by allowing artifacts/values from previous steps to cascade into subsequent tests, and by supporting saving outputs from steps so that we can do a cleanup afterwards based on those outputs.

I guess I don't see a ton of movement on this project lately, but if it's still active and if anyone's interested, this might be something I would approach doing a PoC of with this framework since I really see the "future" in microservices will require test parallelization.