segment-boneyard / nightmare

A high-level browser automation library.
https://open.segment.com

Use headless Chrome as execution engine instead of Electron #1092

Status: Closed (closed by schickling 6 years ago)

schickling commented 7 years ago

Update: We've released Chromeless which is based on headless Chrome and can run in parallel on AWS Lambda.

Since it's now possible to run Chrome in headless mode I wanted to start a conversation around migrating NightmareJS to Chrome instead of Electron.

This should make it almost trivial to run Nightmare in a serverless context like AWS Lambda and seems to be quite a lot faster based on my tests.

Any thoughts?

schnerd commented 7 years ago

Seems like this is worth investigating.

One comment about AWS Lambda though–deployments are currently limited to 50mb, and Chrome 58 debian package clocks in at 53mb compressed (Electron is 133mb), which might be a blocker. Edit: Supposedly it works, see below

schickling commented 7 years ago

I'm already running headless chrome on Lambda and it's not a problem. Maybe they've increased the limit? (Definitely annoying to upload ~50mb for every deploy though...)

sandstrom commented 7 years ago

This nightmare-issue is related: https://github.com/segmentio/nightmare/issues/224

Do you want nightmare to run headlessly, or do you want to get rid of Electron?

I think time is better spent helping Electron to support Chromium's headless mode, and then use that for Nightmare. Here is the headless-mode issue in Electron: https://github.com/electron/electron/issues/228

nesk commented 7 years ago

I would rather keep Electron for its great API. I agree with @sandstrom, we should instead help the Electron team to support the headless flag.

avimar commented 7 years ago

I sometimes use the Xvfb setup with x11vnc to watch and manipulate the nightmare instance.

Will that option go away with headless? Or will show:false activate the headless goodness, including not needing an Xvfb screen installed and configured?

DavidRichard2016 commented 7 years ago

watch

cappslock commented 7 years ago

@DavidRichard2016 heads up, there's a button you can click on the right side bar that will let you follow threads without sending other people on the thread notification emails:

[image]

aendra-rininsland commented 7 years ago

As per my comment here, one interesting idea could be to make the DOM manipulation layer pluggable, which would mean Nightmare itself might become a unifying API for interaction with headless browsers — something I think is really quite possible given the quality of the API and underlying code.

This, in addition to facilitating the creation of a driver for headless Chrome, could also mean the creation of drivers for other browsers like headless Firefox (see: #1141). It might also really push forward Nightmare as a solution for cross-browser unit testing — imagine being able to run tests on a few major browsers, all without ever having to deal with Java dependencies? Nightmare could ultimately become the Karma of browser automation.
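
To make the pluggable-driver idea above concrete, here is a minimal sketch of how a chainable API could sit on top of interchangeable drivers. Everything in it is hypothetical: `FakeDriver`, `Automation`, and the `goto` / `click` / `type` driver methods are names invented for illustration and are not Nightmare's actual internals. A real driver for Electron, headless Chrome, or Firefox would implement the same small set of methods against its own browser.

```javascript
// Hypothetical sketch: none of these names are part of Nightmare's real API.

// A driver only knows how to perform low-level actions. This fake one just
// records what it was asked to do, standing in for a real browser backend.
class FakeDriver {
  constructor() {
    this.log = [];
  }
  async goto(url) {
    this.log.push(`goto ${url}`);
  }
  async click(selector) {
    this.log.push(`click ${selector}`);
  }
  async type(selector, text) {
    this.log.push(`type ${selector} ${text}`);
  }
}

// The user-facing chainable API queues actions and replays them on
// whichever driver it was constructed with.
class Automation {
  constructor(driver) {
    this.driver = driver;
    this.queue = [];
  }
  goto(url) {
    this.queue.push(['goto', url]);
    return this;
  }
  click(selector) {
    this.queue.push(['click', selector]);
    return this;
  }
  type(selector, text) {
    this.queue.push(['type', selector, text]);
    return this;
  }
  // Run the queued actions in order, then clear the queue.
  async run() {
    for (const [method, ...args] of this.queue) {
      await this.driver[method](...args);
    }
    this.queue = [];
    return this.driver;
  }
}

// Usage: the same script could run unchanged against a different driver.
const driver = new FakeDriver();
new Automation(driver)
  .goto('https://example.com')
  .type('#q', 'nightmare')
  .click('#go')
  .run()
  .then(() => console.log(driver.log));
```

The point of the design is that user scripts depend only on the chainable layer, so swapping browsers would mean swapping the driver object rather than rewriting scripts.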

codepunkt commented 7 years ago

@sandstrom doesn't seem to be happening. Electron's focus is on desktop apps, which don't need headless. it's a shame 😢

joelgriffith commented 7 years ago

I'm personally focused on building something from the ground-up here: https://github.com/joelgriffith/navalia. It has a very similar focus as far as API design goes, but some other cool stuff like a GraphQL front-end and coverage collection. Would love feedback and contributions if anyone is interested!

schickling commented 7 years ago

Hey everyone! We've just open-sourced Chromeless. It's based on headless Chrome and works both locally and on AWS Lambda. The API is pretty similar to NightmareJS.

kensoh commented 7 years ago

Just wrote a post with the list of new entrants to browser automation using headless / visible Chrome - https://medium.com/@kensoh/chromeless-chrominator-chromy-navalia-lambdium-ghostjs-autogcd-ef34bcd26907

casesandberg commented 7 years ago

Go ahead and try out the headless Chrome implementation in v3! https://github.com/segmentio/nightmare/tree/v3.

shellscape commented 7 years ago

I dropped a comment on the PR for this feature but wanted to follow up here as well. If nightmare ditches Electron it will lose not only a fantastic base with an excellent API, but also the differentiating factor (and maybe any relevancy?) against puppeteer, which is supported directly by the Chrome dev team. Not to mention that they've briefly touched on speaking with the folks behind Chromeless about merging efforts.

Consolidation of tools for the same platform can be a good thing, but there seems to be little evidence offered as to how this benefits Nightmare as a project beyond a serverless environment. If that's the only benefit, let the folks at puppeteer and chromeless take that crown while Nightmare maintains the benefits of running electron under the hood. If speed is the concern, let them have the speed edge.

Personally I want the rock-solid documentation and feature set that electron provides - two things the Chrome Devtools Protocol isn't quite there on yet, not to mention that both are ever-evolving with each release. Headless Chrome is cool, it's got the wow factor, it's the new hotness, I use it myself for mocha-chrome - but being available doesn't mean it's right for this project.

casesandberg commented 7 years ago

We're still thinking through the tradeoffs and considering how nightmare could work or complement puppeteer and other new, exciting projects.

shellscape commented 7 years ago

@casesandberg I appreciate the quick followup and comment - but there's not much substance there. Would love to see some direct addressing of the bits I outlined in both comments.

keithkml commented 7 years ago

@shellscape you're being rude. The Nightmare devs don't owe you anything. Please check your attitude if you're going to continue commenting here, or go fork and start a competing project.

shellscape commented 7 years ago

@keithkml I'm sorry but I have to disagree. I thanked @casesandberg for his reply and for the speed at which he replied, and that was sincere. Stating that there isn't substance in a reply isn't rude, it's merely a comment on the perception of the content of the reply. I didn't demand anything so there's no implication of any expectation that anything is owed - I very plainly said that I would love to see a more detailed reply - which means I would be stoked. And I'm not trying to bikeshed here either. I'm merely an engaged and interested user. I humbly suggest that there exists a slight oversensitivity to active engagement if that's considered rude.

kensoh commented 7 years ago

The way @shellscape delivered his message may be questionable to some. Nevertheless, I really think he brought up good points and that @casesandberg / core team is probably already weighing some of these considerations before that.

It is not an easy decision though. NightmareJS did the right thing by tearing out PhantomJS and replacing it with Electron. We've seen what has happened to PhantomJS by now. At that time, it offered users another viable (if not superior) choice besides PhantomJS's Qt WebKit engine.

However, now that the Chrome dev team is making their own browser automation tool, there are really situations which only they can make possible, such as quickly implementing a patch upstream to Chromium to resolve some downstream issue faced by Puppeteer. They are also the folks authoring DevTools Protocol. This level of tight integration is unprecedented in the web automation space.

The Puppeteer team has launched the tool and is happy with either outcome: that it becomes a tool users use directly, or a tool which other tools build on to add higher-level APIs or functionality specific to their domains. That sounds like a great opportunity to build on top of Puppeteer, eventually: let Puppeteer handle the iterations within DevTools Protocol / Chromium / interaction API, while other tools build higher-level features on top of it. Win-win-win.

Also, as DevTools Protocol was originally designed for debugging the browser, there is still much to be iterated on for the purpose of automating a web browser. For an extensive API such as NightmareJS's, it might take pretty long / a lot of work before most existing NightmareJS scripts can run directly in v3. What's gained is that existing assets developed on NightmareJS can be executed on headless Chrome and without Xvfb. What's lost is the unimaginable amount of time to make that happen, as well as the world losing a browser automation tool based on Electron. (I'm making a reasonable assumption that the core team will not consider supporting both; that sounds like a nightmare to maintain.)

I don't use NightmareJS but actively use CasperJS. As a user, I would rather see the market consolidating to Puppeteer handling Chrome, NightmareJS handling Electron, WebdriverIO handling whatever else. In any case, what Rory @roryrjb pulled off is nothing short of amazing!!

kensoh commented 7 years ago

Another point I would like to bring up. Has anyone noticed that, increasingly, people who look for open-source projects / tools are only planning to use them (take) but not contribute (give)? Most users come, use something, switch to something else when needed, but most have no intention of contributing PRs, let alone eventually maintaining the tool.

A few years ago the forks-to-stars ratios were much higher, compared to what we see in newer projects. I don't know whether it is because there are many more open-source projects now, or because newer folks who join do not have the time or the technical skills to contribute to / maintain code. I don't know what has changed but this trend looks obvious / worrying to me.

It will encourage open-source projects eventually being created / maintained by a few large commercial entities, versus a much more diversified ecosystem. In any case, we can't change big trends that the world is heading to anyway, we'll only know the reason and the real benefits later.

shellscape commented 7 years ago

@kensoh insightful, interesting, and much appreciated comments on the subject. solid read, and I'm looking forward to additional comments on the thread.

regarding open source contributions; I really think that depends on the space, stability of the project, and accessibility to people that actively encourage contribution. I'd offer up the webpack community as an example (of which I'm a contributor on webpack-dev-server and webpack-dev-middleware, so I've seen this from all sides in that community). When I started using webpack I had zero interest in contributing. the thing is a beast, it's massive, it's intimidating. but as I used it and followed influencers to help educate me about its use, I ran into folks doing evangelism for contribution that were truly passionate about it, and it drew me in. coupling that with my desire to see it work better with koa and I jumped in. I think it's probably worth considering the communities around projects (or lack thereof, not speaking specifically) and how contribution is encouraged. curiously enough, it's worth noting that the entire webpack org and core development on that is supported by both individual and large corporate donations. I don't think we're in any danger of the worst case scenario just yet, but you definitely make an interesting observation.

avimar commented 7 years ago

@kensoh Due to availability bias, my recent memories of attempting to contribute to open source projects have left a bad taste in my mouth.

3 examples:

p.s. this really doesn't seem like the place for this conversation...

kensoh commented 7 years ago

To NightmareJS moderators and guys, I'm sorry, I realised this isn't the place to discuss open-source in general and my question above distracts from the discussion here, please ignore my post above. I think I mentioned that part when I thought of the maintainability challenges should NightmareJS switch to headless Chrome.

(Thanks @shellscape and @avimar for your replies! Wow, it is really heartening to see dedicated contributors from diverse backgrounds contributing in different ways. I must have been pigeon-holed in some projects that were slower-moving. Yes, I saw many occasions where a merge could have easily happened but didn't, due to a failed automated test or difficulty in creating the test set matching the PR. In theory it would be really great for maintainers to be the ones coming up with the test set, as ultimately they are accountable for maintaining the code. Having to learn to write test cases in the form required for a project, just to submit a simple PR to improve the tool, can be overkill for many potential contributors. But I'm aware that in practice, maintainers' time is already thinly spread across various pressing needs of their projects, so that is understandably hard to achieve.)

matthewmueller commented 6 years ago

Hey folks, thanks for voicing your thoughts and concerns.

Under the hood, there's not a lot of difference between these two options. Both electron and headless shell are implementations of Chromium's APIs. Headless chrome is a bit less resource-intensive and doesn't rely on a windowing server, while electron is battle-tested and has more features.

Since there are some really great chrome headless options already available now like puppeteer and chromeless, nightmare is going to focus on being a really solid electron driver.

We may revisit this decision in the future, but for now we're happy with electron.