segment-boneyard / nightmare

A high-level browser automation library.
https://open.segment.com
19.54k stars 1.08k forks source link

npm scripts instead of Makefile #502

Closed fritx closed 7 years ago

fritx commented 8 years ago

https://github.com/segmentio/nightmare/blob/master/Makefile so that we can just run the test on windows seamlessly makefile seems to be unfriendly to windows

"scripts": {
  "test": "mocha --harmony"
}
rosshinkley commented 8 years ago

The makefile buys automatic installation of dependencies if the package has been touched since the last make. With the inclusion of #561, the makefile will also allow headless systems to run the unit test suite (specifically, CircleCI) without internal problems to Electron/Chromium. Mimicking these two behaviors with simple NPM scripts would be difficult to do in a cross-platform friendly way (although not impossible).

I'm in favor of modifying the build process to make it friendlier, and would like to get some feedback on what we should do. Off the cuff, I'd suggest something like Gulp, but that may be a bit heavy-handed for what we ultimately want to accomplish.

Has anyone had trouble building this on Windows or any other OS where the makefile won't work? If so, have you been able to work around it? What does your make process look like? Do you have suggestions for build tools?

Mr0grog commented 8 years ago

This is a little tangential, but #571 made me wonder if the scripts added in #565 should really be part of Nightmare itself instead of the tests. That is, using Nightmare as a testing tool on a CI server somewhere seems like a major use-case. As it stands now, every user of Nightmare will have to know that they need to replicate Nightmare’s testing setup. If Nightmare just did all of that on its own before launching Electron, that would not only solve the deadlock for Nightmare’s own tests but also for other tests that leverage Nightmare.

Doing so would also simplify the test script to the point where we might not need to worry about adding Gulp or some other solution to get things working on Windows.

(Note I have no idea whether this is the problem being encountered in #571; that issue just made me think of this.)

rosshinkley commented 8 years ago

If Nightmare just did all of that on its own before launching Electron, that would not only solve the deadlock for Nightmare’s own tests but also for other tests that leverage Nightmare.

I'm going to purposefully totally ignore the NPM automagic make gives and focus on the framebuffer issues for the moment. I'm tentative on if we should automagically include the "run with xvfb-runner" magic. My main dev box is running both X and Xvfb - what if I want to see the Electron instance? What should Electron run under?

Implementing your proposed changes has other setup complications, I'd think: I would want to preemptively check for an available framebuffer, but I don't know off the top of my head how to do that, especially in a cross-platform-friendly way. I'd love to hear your thoughts on that. Maybe I'm making it way more complicated than it needs to be.

Doing so would also simplify the test script to the point where we might not need to worry about adding Gulp or some other solution to get things working on Windows.

Gulp is nice insofar as it skirts cross-platform issues. The runner task could also be exposed/registered. Introducing the complexity of managing the framebuffer ourselves is unpleasant at first blush, but does also mean avoiding another (large) dependency, and possibly fixing #224 . Which devil is worse? :P

Mr0grog commented 8 years ago

My main dev box is running both X and Xvfb - what if I want to see the Electron instance? What should Electron run under? Implementing your proposed changes has other setup complications, I'd think: I would want to preemptively check for an available framebuffer, but I don't know off the top of my head how to do that.

It might be enough to just be dumber. Maybe an env var like NIGHTMARE_USE_XVFB opts into the automagic. It’s easy enough to set that on your local box or on any CI platform.

rosshinkley commented 8 years ago

@Mr0grog Yeah, this may be a case of me trying to do an awful lot of work to save someone (me, if I can be totally selfish) ten seconds of setup grief.

As time permits, I'll throw together a proposal on running Xvfb internally to Nightmare. Sound good?

Mr0grog commented 8 years ago

Works for me. Rewriting those scripts in JS makes sense, regardless, since it would still have to be done to get rid of the makefile.

rosshinkley commented 8 years ago

I have this (at least partially) working: instead of spinning up an Electron process directly, if the NIGHTMARE_USE_XVFB environment variable is set, Nightmare will spawn xvfb-run, dbus, and finally Electron, akin to how the Makefile runs the test suite now.

The main problem was that sending child.kill() would kill the xvfb instance and leave the Electron process intact. Having an action to make the Electron process exit internally causes the process chain to end. Unfortunately, this breaks the "should kill Electron" test as the child.kill() kills the wrong process. I'm not exactly sure how to make that a valid test. I'm certainly open to suggestions.

Mr0grog commented 8 years ago

Is that https://github.com/rosshinkley/nightmare/tree/502 ?

  1. Maybe start xvfb separately and, once it’s running, then start Electron? This might be better anyway because we need to handle multiple instances of Electron starting/ending/overlapping at different times. I’m not sure if xvfb-run takes care of all that for you or not.
  2. Can you get the PID for Electron somehow? I don’t see anything obvious on xvfb-run’s manpage, but my knowledge of it is pretty rudimentary.
  3. Directly killing Electron is sort of a simple expedient; you could re-write it as an IPC call that asks the Electron process to quit.
rosshinkley commented 8 years ago

@Mr0grog Yes, at least for the moment. I should probably go ahead and create a PR for it, but I know there are still some problematic tests. Anyway!

Maybe start xvfb separately

Well, it is. Kind of. Right now, each new Nightmare instance creates an Xvfb instance as well as an Electron instance (so Xvfb, Electron and Nightmare instances area all 1:1:1). They are created together, and in my opinion, should end together.

Electron starting/ending/overlapping at different times. I’m not sure if xvfb-run takes care of all that for you or not.

I don't know that handling a single Xvfb instance is going to work very well, at least not without the single-instance stuff getting done. I also don't think starting an instance and leaving it running is particularly clean. I'd prefer Nightmare to clean up after itself when it's finished.

I'm going to lump these together:

Can you get the PID for Electron somehow? Directly killing Electron is sort of a simple expedient; you could re-write it as an IPC call that asks the Electron process to quit.

Yes - from within Nightmare - but not without a bit of patching to how .action() works. (It doesn't have an ambient reference to process.) With that, you can bubble the Electron PID up to the parent process. It's ugly.

I don't know you can get the PID from the process tree, either. Maybe? I had problems with killing the process tree (that was my first attempt), but this might be worth a second look.

The current incarnation adds another action to have Electron exit internally rather than killing the process from the parent. This works, but again, breaks the kill test.

Mr0grog commented 8 years ago

Well, it is. Kind of.

I guess I mainly meant: as a separate process. Start the xvfb server, then start Electron so you have two process handles instead of doing it together with xvfb-run. I think (?) you could still maintain 1 nightmare instance : 1 xvfb : 1 electon, even if I wasn’t directly proposing that. This is where my knowledge of what xvfb is doing isn’t deep enough. I don’t know what happens if you try to run two xvfb server instances at once, but was assuming (maybe wrongly) that you couldn’t.

I'd prefer Nightmare to clean up after itself when it's finished.

I wasn’t suggesting that it shouldn’t! Obviously this whole method requires more bookkeeping: whenever an Electron process ends, you’d have to check and see whether any others are running in order to know whether to stop xvfb. For that purpose, though, it could be as simple as a counter.

Mr0grog commented 8 years ago

Side note: I suspect the xvfb-run method will cause problems for dealing with the exit code of the Electron process (https://github.com/rosshinkley/nightmare/blob/502/lib/nightmare.js#L128), since xvfb-run’s manpage makes it sound like the exit code is not preserved.

Mr0grog commented 8 years ago

The current incarnation adds another action to have Electron exit internally rather than killing the process from the parent. This works, but again, breaks the kill test.

If I’m reading right, think that’s just because you’re changing the API. You’d have to update test/files/nightmare-unended.js to pass back the right PID. (nightmare.proc is no longer the Electron process with your changes.)

rosshinkley commented 8 years ago

I don’t know what happens if you try to run two xvfb server instances at once, but was assuming (maybe wrongly) that you couldn’t.

This is ripe for science. I, too am not certain how xvfb (or more to the point, xvfb-run) works under the covers when there are multiple instances. From my very rudimentary testing, it looks like you can, but it also looks like Electron (or any process that requests a framebuffer) picks up the first available display instead of the xvfb instance it is a child process of. This can cause problems like the Electron instance being mid-execution when the buffer suddenly quits. This is where I stopped as I knew that how it worked in my mind was not how it was working in reality, and am going to try to circle back.

Obviously this whole method requires more bookkeeping: whenever an Electron process ends, you’d have to check and see whether any others are running in order to know whether to stop xvfb.

It's not just xvfb but also dbus that would need to be tracked. I also wonder if running them independently but started in series if that's going to have any ill effect? At any rate, I understand what you're getting at. It's worth a look.

... since xvfb-run’s manpage makes it sound like the exit code is not preserved.

I figured as much. It's being wrapped twice. The exit code is an almost guaranteed lie. :P

If I’m reading right, think that’s just because you’re changing the API. You’d have to update test/files/nightmare-unended.js to pass back the right PID. (nightmare.proc is no longer the Electron process with your changes.)

Yes, and I did that locally, but at that moment, I thought it was not in the same spirit without modifying how Nightmare handles interrupts to call the newly-minted Electron self-desctruction mechanism. With a little distance, having the Electron PID bubbled up and having the signal handler call ending, and then using the exit message to catch the xvfb-run instance being destroyed might work and still be in the spirit of the original test.

That said, it won't catch the other processes being orphaned (if they even can be), which gets at your point of running them independently.


With all of that in mind, I think my next steps are to:

  1. experiment with one running instance of xvfb and dbus managed by the Nightmare constructor.
  2. write a test to have two overlapping Nightmare instances where the first ends before the second completes, making sure either shared instances are okay or instances are completely independent
  3. re-run original Xvfb-churn test for independently managed processes approach
  4. make sure that whatever approach is settled on doesn't break the popular suggestions/fixes in #224
rosshinkley commented 8 years ago

Update (aka "So what'd we learn?"):

experiment with one running instance of xvfb and dbus managed by the Nightmare constructor.

You can run the processes independently, but it would appear that Electron picks whatever display is first available when not running as a child process. Unfortunately, this leads back to the same locking issue. I can kill off all of the X servers for a build up front, but that feels incorrect. Also, once in a while - with X started internally to Nightmare - Electron won't "pick up" the framebuffer. I need to spend some time to see if it can be remedied with using xdpyinfo or something to wait for the display to become available. So far, using plain ol' dumb setTimeouts have not been effective, leading me to believe that approach is probably flawed and I've run aground of a different problem.

Again, I need to spend more time with this - it's possible I'm not using Xvfb properly. Open to suggestions.

Also worth noting: this necessitated moving the Electron process start to the Nightmare queue instead of doing it directly in the constructor. The constructor starting Electron has been a long-standing bugbear for me, and I may split that part out into it's own PR. Thoughts?

write a test to have two overlapping Nightmare instances where the first ends before the second completes, making sure either shared instances are okay or instances are completely independent

Works great, provided the right X instance is picked up. The Electron instances share the running X instance just fine.

re-run original Xvfb-churn test for independently managed processes approach

Also works with the same provisions as above.

make sure that whatever approach is settled on doesn't break the popular suggestions/fixes in #224

This would not break anything as it requires an environment variable to be set to be used. Currrently, I have it set up to pick up if Nightmare is running under Circle, but I think I may remove that. I don't want to add new behavior to existing builds that may already account for Nightmare/Electron's special needs.


Another thought crossed my mind as I was playing with moving the Electron start to the Nightmare queue: what about a retry? In the case of a lock, I believe the browser-initialize event does not respond. What if a timeout was propped up around that (say, 5s?) that if hit, would kill the Electron process and restart it? I think that would sidestep the issue as the hang only seldom happens (1 in ~300 starts on my box) and (almost) never happens back to back (again, on my box). This would allow for the npm test to use xvfb-maybe (as suggested here) to run mocha.

I thought I'd at least float the idea for feedback.

Mr0grog commented 8 years ago

You can run the processes independently, but it would appear that Electron picks whatever display is first available when not running as a child process.

Did you try setting the display via app.commandLine.appendSwitch('display', 'displaynum')? (http://peter.sh/experiments/chromium-command-line-switches/#display)

Alternatively, do the various XVFB displays show up in electron.screen? (http://electron.atom.io/docs/api/screen/) If so, we could possibly use that to make sure the window goes to the right display.

The constructor starting Electron has been a long-standing bugbear for me, and I may split that part out into it's own PR. Thoughts?

Sounds good to me.

What if a timeout [for browser-initialize] was propped up around that (say, 5s?) that if hit, would kill the Electron process and restart it?

Sounds not-so-lovely, but it probably would be a reliable end-run around the problem. Sometimes the ugly fix is the one that works best, though. If it seems like the automatically running XVFB stuff is working, I might want to avoid it, personally. But if this is all proving too crazy and problematic, than letting it lie and taking the timeout/retry approach might be the right thing to do.

rosshinkley commented 8 years ago

Did you try setting the display via app.commandLine.appendSwitch('display', 'displaynum')?

I (embarrassingly) had forgotten about the Chromium command line switches. No, I have not, and will carve out some time to give that a whirl.

Alternatively, do the various XVFB displays show up in electron.screen?

I wasn't trusting anything to come out of Electron. I didn't know how reliable the information that fell out would be. I can give this another peek and report back.

But if this is all proving too crazy and problematic, than letting it lie and taking the timeout/retry approach might be the right thing to do.

I'll do one more round of experiments to see if I can nudge this into a working state. The Chromium arguments seem promising, though. Thanks for that!

casesandberg commented 7 years ago

It doesn't look like there has been any progress on this issue lately. I am going to go ahead and close it, but feel free to open it back up if anyone wants to take a stab at a PR.