electron / electron

Build cross-platform desktop apps with JavaScript, HTML, and CSS
https://electronjs.org
MIT License

Full Crawler Configuration #9350

Closed aight8 closed 7 years ago

aight8 commented 7 years ago

I have already tried sandbox and contextIsolation. My current solution is based only on disabling nodeIntegration. That setup has the benefit that I can disable all the unwanted functions which could harm the crawler's runtime, for example: window.open, window.close, window.print, window.resize..., preventing onbeforeunload, window.alert/confirm/prompt, etc.

What is the cleanest way to disable everything like this, or how should all these cases be handled? Currently, even with sandbox/contextIsolation enabled, these functions behave as normal, and I can't even disable them in preload.js.

Furthermore there are several tricks to get back the original method, so a central solution would be great.
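One way to picture the preload-side approach described above is the sketch below. It is not an Electron API: `neutralizeWindowMethods` and the `BLOCKED_METHODS` list are hypothetical names chosen for illustration, and the list is not claimed to be complete. It also assumes the preload script shares the page's JavaScript world. With contextIsolation enabled, the preload script runs in a separate world, so overrides made there would not reach page scripts, which would be consistent with the behavior reported above.

```javascript
// Hypothetical helper: the name and the method list are illustrative only.
const BLOCKED_METHODS = [
  'open', 'close', 'print', 'alert', 'confirm', 'prompt',
  'resizeTo', 'resizeBy', 'moveTo', 'moveBy',
];

// Replace each named method on `win` with a no-op and lock the property
// so that page code cannot simply assign or redefine it afterwards.
function neutralizeWindowMethods(win, names = BLOCKED_METHODS) {
  const blocked = [];
  for (const name of names) {
    try {
      Object.defineProperty(win, name, {
        value: () => undefined,
        writable: false,
        configurable: false,
        enumerable: true,
      });
      blocked.push(name);
    } catch (err) {
      // The property was already locked by something else; skip it.
    }
  }
  return blocked;
}

// In a preload script (assumption: contextIsolation is off, so `window`
// here is the same object the page sees).
if (typeof window !== 'undefined') {
  neutralizeWindowMethods(window);
}
```

Locking the property stops plain reassignment, but it does not cover every route back to the original method. A newly created iframe, for example, carries its own copies of these functions, so this is better treated as one layer than as the central solution asked for.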

By the way, do you know where I can find a complete list of functions like this?

MarshallOfSound commented 7 years ago

Hey, I don't think anyone can help without some context to this question (not even sure what the question is here) 😆

Full Crawler Configuration

What do you mean by "Full Crawler"? Is it an app, a module, a site?

Need some more context here to understand exactly what you want to know / what the issue is 👍

aight8 commented 7 years ago

Okay, let me describe it in more detail. Electron exposes functions to the renderer process which perform several Electron app actions: managing windows and dialogs, or printing. Those are the ones I found. However, if you use Electron as a crawler and open arbitrary pages running arbitrary JS code, all of the functions I listed (and maybe more) should be prevented from being used entirely. The resize methods even work on the main browser window. Another one I remember (the list is at home) is window.abort(), or speechSynthesis, which can be disabled by a Chromium flag.

To go a step further, I am also thinking about disabling any scrolling triggered by window methods or anchor clicks.

I haven't figured out how to disable navigation caused by a form submit; all other page navigations can be canceled on the Electron side.
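For the form-submit case, one page-side option is sketched below. `blockFormSubmission` is a hypothetical helper, not part of Electron, and it assumes it runs in the page's own JavaScript world before any page script does. It uses two standard DOM behaviors: user-driven submits fire a cancelable `submit` event, while a scripted `form.submit()` call fires no event and so has to be replaced directly.

```javascript
// Hypothetical helper; `win` is the page's window object
// (or any stand-in with the same shape).
function blockFormSubmission(win) {
  // Button clicks, the Enter key and requestSubmit() fire a cancelable
  // 'submit' event. Listening in the capture phase on the window lets this
  // handler run before handlers attached further down the tree.
  win.addEventListener('submit', (event) => event.preventDefault(), true);

  // form.submit() called from script fires no 'submit' event,
  // so the method itself is replaced with a locked no-op.
  Object.defineProperty(win.HTMLFormElement.prototype, 'submit', {
    value: () => undefined,
    writable: false,
    configurable: false,
  });
}
```

As with any in-page override, code that obtains a fresh `HTMLFormElement` prototype from another frame could bypass the second part.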

I want to discuss how to control the browser as much as possible, to prevent any unintended actions caused by arbitrary code.
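Some of this control can be applied from the main process instead of inside the page, where page scripts cannot undo it. The sketch below is one possible arrangement and not a recommendation from this thread. `lockDownWebContents` and its `allowNavigation` option are hypothetical names. It assumes an Electron version recent enough to provide `webContents.setWindowOpenHandler`, and it uses the `will-navigate` and `will-prevent-unload` events of `webContents`.

```javascript
// Hypothetical helper; `contents` is expected to be an Electron webContents
// (for example `browserWindow.webContents`).
function lockDownWebContents(contents, { allowNavigation = () => false } = {}) {
  // Deny window.open() and target="_blank" popups.
  contents.setWindowOpenHandler(() => ({ action: 'deny' }));

  // Cancel navigations started by the page unless explicitly allowed.
  contents.on('will-navigate', (event, url) => {
    if (!allowNavigation(url)) event.preventDefault();
  });

  // Calling preventDefault() here ignores the page's beforeunload handler,
  // so the page cannot block being unloaded.
  contents.on('will-prevent-unload', (event) => event.preventDefault());
}
```

Navigations started programmatically from the main process, such as `webContents.loadURL`, do not emit `will-navigate`, so the crawler's own page loads would be unaffected by the second handler.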

Another thing I tried to tackle is how to speed up page loading by skipping unnecessary things.

So I got the ad blocker working in Electron.
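Skipping unnecessary resources can be sketched with Electron's `session.webRequest.onBeforeRequest`, which lets the main process cancel a request before it is sent. The helper name `blockResourceTypes` and the default list of types are assumptions made for illustration. Which resource types are safe to drop depends on what the crawler needs from each page.

```javascript
// Hypothetical helper; `webRequest` is expected to be an Electron
// `session.webRequest` object (for example `session.defaultSession.webRequest`).
function blockResourceTypes(webRequest, blockedTypes = ['image', 'media', 'font']) {
  const blocked = new Set(blockedTypes);
  webRequest.onBeforeRequest((details, callback) => {
    // `details.resourceType` is a string such as 'mainFrame', 'script',
    // 'stylesheet', 'image', 'font' or 'media'.
    callback({ cancel: blocked.has(details.resourceType) });
  });
}
```

Electron keeps only the most recently attached `onBeforeRequest` listener per session, so an ad blocker that hooks the same event would probably need to be merged into this one listener instead of being registered separately.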

Other ideas were:

So at the end of the day, the goal of all these tweaks is a secure and optimized web crawler which is less expensive.

MarshallOfSound commented 7 years ago

GitHub issues are for feature requests and bug reports, questions about using Electron should be directed to the community or to the Slack Channel.

That sounds like an interesting investigation and project, but I'm not seeing a feature / bug that would stop me posting the above and closing out the issue. If you're looking for help with things, I would suggest the communities linked above.

aight8 commented 7 years ago

Okay, thanks, then I will try my best there. Maybe this could result in some Electron feature requests. But I don't know exactly if the maybe implemented "headless" chromium mode solves some things here. Let's see.

kevinsawicki commented 7 years ago

But I don't know exactly if the maybe implemented "headless" chromium mode solves some things here.

I'd recommend using headless Chrome instead of Electron for a general-purpose web crawler. It seems designed for that purpose, while Electron is designed for building desktop applications and being able to use Node modules in your app.

Closing this out.