Add 🚿 life cycle leak test

samccone commented 8 years ago

What:

Add a drool memory test for paper button :scroll: :radio_button:

Do I have leaks¿

No! :tada: this component is all good

node delta	heap delta	event listener delta
0	0 bytes	0

Why

It is quite easy to introduce life cycle memory leaks into components, and is very hard to automatically detect :mag: these kind of regressions. Drool makes it super duper easy to test for these kind of issues in the real world (only 19 LOC in this case) without being @paulirish

Who is even using this?

Lots of people... TodoMVC, Material Design Lite, and others... I promise

Running locally

npm run drool

Does this even work?

Example of a leaking version

screen shot 2015-10-23 at 8 04 04 pm

node delta	heap delta	event listener delta
5	0 bytes	0

Goals

if we land this... we can go ahead and add this code to all of the components.. and roll it into our CI process :tada: .. which means CI will go red if we actually have leaks :fountain: ... It is a magical thing when it works

:bug:

:crocodile:

cdata commented 8 years ago

Thanks for the contribution! Taking a look.

cdata commented 8 years ago

@samccone Thanks for the contribution. drool looks really cool. I love the idea of failing builds when memory leaks are automatically detected.

Do you think it would be possible to express the same sort of detection as a Polymer element? It seems like there is an awesome opportunity here to make a memory leak testing element (even if it only activates in Chrome).

If we did this, we could tie the outcome of the memory leak detection into our existing test runner, WCT. There would be no need to bring in an additional selenium-driving node process to our test suite (yay faster tests!) and we could avoid adding many new npm modules to our dependency graph.

@azakus has suggested to me that we could enable necessary chrome-driver flags for this in WCT quite easily.

WDYT? Shall we make a drool-element?

samccone commented 8 years ago

memory leak testing element

I am not sure exactly how this would work given the tight coupling with selenium and chrome to expose the necessary tracing flags and others https://github.com/samccone/drool/blob/master/lib/index.js#L80

As of 2015 there is no way to get this kind of tracing data from within chrome, so the instrumentation level has to be external ATM.

If we did this, we could tie the outcome of the memory leak detection into our existing test runner

This should be trivial to do ideally... the test runner could just run this node process and check the exit status, that is the way we are currently doing it on other projects that take advantage of this tool.

as far as npm dependencies, drool and polymer-drool are about as lightweight as they come

https://github.com/samccone/drool/blob/master/package.json#L12-L14 https://github.com/samccone/polymer-drool/blob/master/package.json

The idea here (of polymer-drool) being that anyone that has created a polymer element can simply write 5 lines of code and have a working memory regression suite, regardless of whatever testing solution they have in place.

cdata commented 8 years ago

I am not sure exactly how this would work given the tight coupling with selenium and chrome to expose the necessary tracing flags and others

WCT can set any configuration that drool can set. @azakus sounded interested in updating WCT to do whatever configuration would be needed. Am I missing something from the equation?

As of 2015 there is no way to get this kind of tracing data from within chrome, so the instrumentation level has to be external ATM.

This is not entirely true. Chrome's debugging protocol does not need to be accessed by an external process (which is what I was trying to get at above).

This is still pretty rough, but I put together an example of how to make the same measurements as drool, except as an element. Check it out here: https://github.com/cdata/chrome-debugger

I'm not an expert on the debugging protocol. Is there something particular about accessing this information from an external process that makes it more accurate?

The idea here (of polymer-drool) being that anyone that has created a polymer element can simply write 5 lines of code and have a working memory regression suite

Yeah. As cool as that is, I propose that it would be even cooler if they could do it by dropping an element into their test suite. Writing elements in an HTML document is relatively more idiomatic to the WCT workflow than writing external auxiliary JS files.

samccone commented 8 years ago

You are correct If we go through the chrome-debugger protocol we can make requests to get the tracing data out. There are some downsides to using this approach as well as some upsides, selenium gives us a ton of wins .. and when paired with the flags that are currently exposed we get a really nice system (I agree we can get similar wins with the debugger protocol though).

The downside of creating an element to test other elements is that with that approach, we are building upon the system we are testing (a polymer element to test polymer elements), so if there is a leak in polymer core our results would be swayed by our testing mechanism.

When it comes to testing memory leaks and GC thrashing I have found it super important to limit complexity on the test site and trim the fat per say on the test process. By doing this trimming, it makes it really easy to isolate leaks and noise in the system, Adding more layers of abstraction / complexity inside of the testing environment (the browser frame) pushes the pendulum in the other direction from what Drool is currently doing, which is consume logs from chrome and analyze them in an isolated manner from the our testing area. By keeping the browser frame "dumb" we can then blindly kill and restart the browser between tests to ensure we have consistent reproducible numbers --- Something that is critically important to actually arriving at meaning from the results.

There is a slightly higher level concern I have when thinking about building on top of WCT, the result of which would be adding to a monolith that is not portable outside of the existing Polymer ecosystem, as compared to the current PR approach which leverages community libraries and best practices to enable anyone regardless of their CI setup / test setup / project structure to drop in these tests.

cdata commented 8 years ago

When it comes to testing memory leaks and GC thrashing I have found it super important to limit complexity on the test site and trim the fat per say on the test process. By doing this trimming, it makes it really easy to isolate leaks and noise in the system

It's true, I have generally found this to be really difficult to do. I'm actually surprised that drool is able to measure these values reliably without occasional false positives and / or negatives due to this kind of noise. What has your experience with this problem been like so far?

we are building upon the system we are testing

Some might consider this a virtuous cycle.

I think it's reasonable to propose that the way we solve this problem for Polymer core does not need to be the same as how we solve it for elements. Polymer core, for many reasons, warrants special considerations. Elements are a degree removed from Polymer core, so I don't see why we can't use Polymer to build the tools that let us test the elements (so long as we test Polymer to sufficiently remove it from the realm of the element's concern).

By keeping the browser frame "dumb" we can then blindly kill and restart the browser between tests to ensure we have consistent reproducible numbers

Yeah. We need to structure this so that the tests are done in as stripped-down an environment as possible, one that can be setup and torn down with repeatable outcomes. However, I don't necessarily agree that this has to be done from a Web Driver script.

There is a slightly higher level concern I have when thinking about building on top of WCT, the result of which would be adding to a monolith that is not portable outside of the existing Polymer ecosystem, as compared to the current PR approach which leverages community libraries and best practices to enable anyone regardless of their CI setup / test setup / project structure to drop in these tests.

This reads as a cynical dismissal of our testing tool. You are entitled to your opinion about what is right and what is wrong about any tool or process. However, WCT is the purpose-built tool that we use. It's not constructive to write it off as not worth improving. I encourage you to consider how we can improve it, in the context of how it is intended to be used: as a zero-configuration, domain-specific tool that enables robust and easy testing of custom elements.

I think we have a common goal here: make it really easy and delightful for our users (and ourselves) to be able to add these kinds of tests to a test suite. Let's focus on that goal and figure out how to make it happen!

samccone commented 8 years ago

Not intended to read as a dismissal of the tool at all, WCT is a beast and does tons of really useful things, I view the work that this leak detection is doing pretty removed scope wise from the goals of WCT. I think it is possible to integrate this work within WCT, there is just a question of the amount of work needed to get a patch that enables what we need, but that requires more investigation. I will close this PR.

paulirish commented 8 years ago

I'm not sure that doing memory profiling from inside the browser is a great idea. Two tabs can very easily share a process, which is definitely what you don't want for your target and profiler.

Now, one could solve this by using two separate Chrome instances, different profiles and such. But I'm also not sure you'd want to incur the overhead of running a test from clientside javascript, which could then be causing CPU contention against your target.

That plus I'm not sure it's going to be fun to author/port a webdriver library to the clientside. Though it could be done!

cdata commented 8 years ago

Thanks @paulirish, those are all very good points.

Fundamentally, we have a tool that spins up Selenium and drives it from Node for the purposes of testing our web components. This tool is called WCT.

To add to that, we have a basic testing medium: the HTML document. Users are able to fixture DOM in its native habitat and apply TDD/BDD and a11y testing to those fixtures with minimal effort.

Even if driving the profiler from another tab is perilous, it is still possible to configure such profiler tests from the client with elements (or some other WCT-friendly syntax) and have WCT do the dirty work of launching another Chrome instance to run the test. This would yield the same net positive outcome for testers and perhaps save us from the apparent peril of other approaches.

As far as CPU contention is concerned, ultimately these tests are run in Sauce Labs for usage in CI, so isolation of browser instances is theoretically not a problem.

I am skeptical of the following things:

That there is a good reason the discard WCT as a candidate tool that facilitates this type of testing. WCT already composes many community libraries, and in theory Drool could be yet another one that is incorporated.
That our users should need to know how to drive Selenium directly. Such tests are beyond the component-focused scope of WCT (for now).
That the HTML documents we currently use as a test medium should be supplanted by or augmented with external JS tests. HTML is not obviously suited to test configuration, but in my short time exploring this landscape I have already been delightfully surprised and I think others have been as well.

I'm not sure why we would need to author a client-side WebDriver library. Could you clarify what you mean by that?

I am confident that we can come up with something cool together if we put a little creative energy and imagination into this feature. That said, I'm discouraged to find that this issue is now closed.

PolymerElements / paper-button