Steveorevo / node-red-contrib-nbrowser

node-red-contrib-nbrowser

Provides a virtual web browser (a.k.a. "headless browser") appearing as a node. The web browser is based on the open source electron.atom.io and nightmarejs.org projects. The node edit panel provides the ability to easily navigate and automate most web browser operations as well as display an interactive window for easy debugging. In headless mode the browser omits downloading images and is highly optimized for speed and performance. This makes testing and automation incredibly fast and versatile. Extended features include the ability to inspect DOM elements, upload & download files, and answer common dialogs.

By default, a node will use or create the web browser "instance" in the msg.nbrowser property and will have a pre-added goto method that navigates to the URL from an incoming msg.payload. After the methods have been applied, the resulting HTML source is output in msg.payload. Developers have the option of analyzing content in their flow before navigating or taking additional actions on a given page by simply dropping additional nbrowser nodes into their flow diagram. When used with the node-red-contrib-string node and the switch node, nbrowser adds an unprecedented level of versatility and functionality to the already powerful Node-RED set of capabilities.
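
As a minimal sketch of the default behavior described above, a function node placed ahead of an nbrowser node could set the URL that the pre-added goto method navigates to. The name setTargetUrl and the example.com address are illustrative assumptions and are not part of nbrowser; inside an actual function node only the body would be needed, since Node-RED supplies msg.

```javascript
// Hypothetical helper. In a Node-RED function node only the body is needed,
// because `msg` is already in scope and the node ends with `return msg;`.
function setTargetUrl(msg, url) {
  // The pre-added goto method reads the URL from msg.payload.
  msg.payload = url;
  // msg.nbrowser is left untouched so a downstream nbrowser node can
  // reuse a browser instance that is already attached to the message.
  return msg;
}

// Placeholder URL, used only for illustration.
const prepared = setTargetUrl({ topic: "login" }, "https://example.com/wp-login.php");
console.log(prepared.payload);
```

Leaving msg.nbrowser alone matters when several nbrowser nodes are chained: each node either reuses the instance carried on the message or creates a new one.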

Example

In this example, we can see a flow using nbrowser in conjunction with the credentials node to log in to a WordPress admin dashboard. The settings for the node show the methods for gotoURL, type, click, and wait. The default result is the HTML source code of the current web page, which is passed on to a Node-RED html node. The html node isolates the .update-count element's content and reveals it in the debug output.

// WARNING: nbrowser is NOT sandboxed & could allow code injection by a
// malicious website owner. DO NOT use nbrowser with untrusted websites.

This might be useful for:

Please don't use this for:

Installation

Run the following command in your Node-RED user directory (typically ~/.node-red):

npm install node-red-contrib-nbrowser

The nbrowser node will appear in the palette under the advanced group.

Note: You can install nbrowser using Node-RED's Manage Palette -> Install option; however, the current version requires that you restart the Node-RED server.

Options

Show browser window instance?

Displays the Electron browser instance to aid development. Use Command (Mac) / Ctrl (Windows) + Shift + I to display developer tools & the DOM inspector. Note: image downloading is suppressed in headless mode but is turned on for convenience when this option is enabled.

Close instance after methods?

IMPORTANT! Use this option to destroy the browser instance after all methods have been processed, or place an nbrowser node with this option enabled at the end of your flow to close an existing instance window. This is especially important when running in headless mode, where there is no visible window that could otherwise be closed by hand.

Ignore SSL certificate errors?

This option can be used to continue loading a web page when the given certificate is untrusted. It is commonly used to allow self-signed certificates during development.

Methods

The following methods can be used to navigate or analyze a given web page. Many of the methods below require a valid CSS selector to take effect. Unlike NightmareJS or PhantomJS, nbrowser will wait for the given CSS selector to appear, eliminating the need to insert additional wait methods. When present, a selector parameter may come from a literal string or from any property of the global, flow, or msg contexts. Selectors that fail to appear will generate a catchable timeout error.
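
Because a selector that never appears raises a catchable error, a flow can route failures through a Node-RED catch node. The sketch below shows one way a function node placed after the catch node might classify such errors. It relies on the msg.error.message and msg.error.source fields that the catch node attaches. The assumption that the error text contains "timeout" or "timed out" is a guess, since the exact message is not quoted here, and classifyError is a hypothetical name.

```javascript
// Hypothetical helper for a function node wired after a catch node.
function classifyError(msg) {
  const err = msg.error || {};
  const text = String(err.message || "");
  const source = err.source || {};

  // Assumption: selector timeouts mention "timeout" or "timed out".
  // Adjust the pattern once the real message text is known.
  const isTimeout = /timed?\s?out/i.test(text);

  msg.payload = {
    kind: isTimeout ? "selector-timeout" : "other",
    nodeType: source.type || "unknown",
    message: text
  };
  return msg;
}

// Example input shaped like a message delivered by a catch node.
const sample = {
  error: {
    message: "Timeout waiting for selector .update-count",
    source: { id: "abc123", type: "nbrowser", name: "" }
  }
};
console.log(classifyError(sample).payload.kind);
```

A switch node on msg.payload.kind could then retry the navigation for timeouts and report everything else.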

Additional parameters may include a flow output or a context property to store an existing value. If a flow output is selected, the resulting value is often delivered in the msg.payload context property.


check

Used to check or uncheck a checkbox.

Utility Functions

removeTargetAttr - A utility function that is callable from the evalJavaScript method. It prevents a new window from spawning when an anchor link is clicked; the existing browser window is used for navigation instead.
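
How removeTargetAttr is implemented, and which arguments it accepts, is not shown here. Purely as an illustration of the idea, the sketch below defines a stand-alone function, stripTargetAttributes (a hypothetical name), that removes the target attribute from every anchor in a document so that links open in the current window. The stub document exists only so the example runs outside a browser.

```javascript
// Hypothetical equivalent of the idea behind removeTargetAttr; this is
// not nbrowser's own code.
function stripTargetAttributes(doc) {
  // Select only anchors that carry a target attribute.
  const anchors = doc.querySelectorAll("a[target]");
  let removed = 0;
  for (const anchor of anchors) {
    anchor.removeAttribute("target");
    removed += 1;
  }
  return removed;
}

// Minimal stand-in for a DOM so the sketch can run under Node.js.
// It ignores the selector string and simply returns links with a target.
function makeStubDocument(links) {
  return {
    querySelectorAll() {
      return links.filter((link) => "target" in link.attrs);
    }
  };
}

function makeStubLink(attrs) {
  return {
    attrs: attrs,
    removeAttribute(name) {
      delete this.attrs[name];
    }
  };
}

const links = [
  makeStubLink({ href: "/a", target: "_blank" }),
  makeStubLink({ href: "/b" })
];
console.log(stripTargetAttributes(makeStubDocument(links)));
```

In a real page the function would be passed the global document object.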

msg.nbrowser_delay - The delay between nbrowser nodes within the same flow. The default is 1500ms, but it can be changed via the input to nbrowser. Increase this value to avoid navigation error -3 and ERR_ABORTED.
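
A minimal sketch of raising the delay from a function node placed before an nbrowser node follows. The helper name setNbrowserDelay, the 3000ms value, and the input check are illustrative assumptions; only the msg.nbrowser_delay property and its 1500ms default are documented.

```javascript
// Hypothetical helper. In a function node only the body would be used.
function setNbrowserDelay(msg, delayMs) {
  // Guard against accidental non-numeric or negative input. Whether
  // nbrowser itself validates this value is not stated.
  if (!Number.isFinite(delayMs) || delayMs < 0) {
    throw new RangeError("delayMs must be a non-negative number");
  }
  // Delay in milliseconds between nbrowser nodes in the same flow.
  msg.nbrowser_delay = delayMs;
  return msg;
}

// Double the documented 1500ms default; 3000 is an arbitrary example.
const slowed = setNbrowserDelay({ payload: "https://example.com/" }, 3000);
console.log(slowed.nbrowser_delay);
```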

References

The method operations are derived from the open source NightmareJS project. The official node-red-contrib-nbrowser project is hosted on GitHub.