vgalin / html2image

A package acting as a wrapper around the headless mode of existing web browsers to generate images from URLs and from HTML+CSS strings or files.
MIT License
344 stars 44 forks source link
chrome chromium chromium-browser css headless-browser html html2image python python3

html2image logo

![PyPI](https://img.shields.io/pypi/v/html2image.svg) ![PyPI](https://img.shields.io/pypi/pyversions/html2image.svg) ![PyPI](https://img.shields.io/github/license/vgalin/html2image.svg) ![GitHub](https://img.shields.io/github/v/release/vgalin/html2image?include_prereleases) ![GitHub](https://img.shields.io/github/languages/code-size/vgalin/html2image) |[PyPI Package](https://pypi.org/project/html2image/)|[GitHub Repository](https://github.com/vgalin/html2image)| |-|-| **A lightweight Python package acting as wrapper around the headless mode of existing web browsers, allowing image generation from HTML/CSS strings, files and URLs.**

 

This package has been tested on Windows, Ubuntu (desktop and server) and MacOS. If you encounter any problems or difficulties while using it, feel free to open an issue on the GitHub page of this project. Feedback is also welcome!

⚠️ Disclaimer: Use this package with trusted content only. Processing untrusted or unsanitized input can lead to malicious code execution. Always ensure content safety.

Principle

Most web browsers have a Headless Mode, which is a way to run them without displaying any graphical interface. Headless mode is mainly used for automated testing but also comes in handy if you want to take screenshots of web pages that are exact replicas of what you would see on your screen if you were using the browser yourself.

However, for the sake of taking screenshots, headless mode is not very convenient to use. HTML2Image aims to hide the inconveniences of the browsers' headless modes while adding useful features, such as allowing the creation of images from simple strings.

For more information about headless modes :

Installation

HTML2Image is published on PyPI and can be installed through pip:

pip install --upgrade html2image

In addition to this package, at least one of the following browsers must be installed on your machine :

Usage

First, import the package and instantiate it

from html2image import Html2Image
hti = Html2Image()

Multiple arguments can be passed to the constructor:

Example:

hti = Html2Image(size=(500, 200))

You can also change these values later:

hti.size = (500, 200)

Then take a screenshot

The screenshot method is the basis of this package. Most of the time, you won't need to use anything else. It can take screenshots of various things:

And you can also (optional):

N.B.: The screenshot method returns a list containing the path(s) of the screenshot(s) it took.

A few examples

hti.screenshot(html_str=html, css_str=css, save_as='red_page.png')


- **HTML & CSS files to image**
```python
hti.screenshot(
    html_file='blue_page.html', css_file='blue_background.css',
    save_as='blue_page.png'
)

Click to show all the images generated with all the code above sample_url_to_img.png sample_strings_to_img sample_files_to_img sample_other_to_img sample_other_50_50

N.B. : the output path will be changed for all future screenshots.


Use lists in place of any parameters while using the screenshot method

create three files from from different filenames

hti.screenshot(html_str=['A', 'B', 'C'], save_as=['A.png', 'B.png', 'C.png'])

outputs A.png, B.png, C.png

- Take multiple screenshots with the same size
```python
# take four screenshots with a resolution of 100*50
hti.screenshot(
    html_str=['A', 'B', 'C', 'D'],
    size=(100, 50)
)

- Apply CSS string(s) to multiple HTML string(s)
```python
# screenshot two html strings and apply css strings on both
hti.screenshot(
    html_str=['A', 'B'],
    css_str='body {background: red;}'
)

# screenshot two html strings and apply multiple css strings on both
hti.screenshot(
    html_str=['A', 'B'],
    css_str=['body {background: red;}', 'body {font-size: 50px;}']
)

# screenshot one html string and apply multiple css strings on it
hti.screenshot(
    html_str='A',
    css_str=['body {background: red;}', 'body {font-size: 50px;}']
)

paths = hti.screenshot(
    html_str=['A', 'B', 'C'],
    save_as="letters.png",
)

print(paths)
# >>> ['D:\\myFiles\\letters_0.png', 'D:\\myFiles\\letters_1.png', 'D:\\myFiles\\letters_2.png']

Change browser flags

In some cases, you may need to change the flags that are used to run the headless mode of a browser.

Flags can be used to:

You can find the full list of Chrome / Chromium flags here.

There are two ways to specify custom flags:

# At the object instanciation
hti = Html2image(custom_flags=['--my_flag', '--my_other_flag=value'])

# Afterwards
hti.browser.flags = ['--my_flag', '--my_other_flag=value']

With Chrome / Chromium, screenshots are fired directly after there is no more "pending network fetches", but you may sometimes want to add a delay before taking a screenshot, to wait for animations to end for example. There is a flag for this purpose, --virtual-time-budget=VALUE_IN_MILLISECONDS. You can use it like so:

hti = Html2Image(
    custom_flags=['--virtual-time-budget=10000', '--hide-scrollbars']
)

hti.screenshot(url='http://example.org')

For ease of use, some flags are set by default. However default flags are not used if you decide to specify custom_flags or change the value of browser.flags:

# Taking a look at the default flags
>>> hti = Html2Image()
>>> hti.browser.flags
['--default-background-color=000000', '--hide-scrollbars']

# Changing the value of browser.flags gets rid of the default flags.
>>> hti.browser.flags = ['--1', '--2']
>>> hti.browser.flags
['--1', '--2'] 

# Using the custom_flags parameter gets rid of the default flags.
>>> hti = Html2Image(custom_flags=['--a', '--b'])
>>> hti.browser.flags
['--a', '--b']

Using the CLI

HTML2image comes with a Command Line Interface which you can use to generate screenshots from files and URLs on the go.

The CLI is a work in progress and may undergo changes. You can call it by typing hti or html2image into a terminal.

argument description example
-h, --help Shows the help message hti -h
-U, --urls Screenshots a list of URLs hti -U https://www.python.org
-H, --html Screenshots a list of HTML files hti -H file.html
-C, --css Attaches a CSS files to the HTML ones hti -H file.html -C style.css
-O, --other Screenshots a list of files of type "other" hti -O star.svg
-S, --save-as A list of the screenshot filename(s) hti -O star.svg -S star.png
-s, --size A list of the screenshot size(s) hti -O star.svg -s 50,50
-o, --output_path Change the output path of the screenshots (default is current working directory) hti star.svg -o screenshot_dir
-q, --quiet Disable all CLI's outputs hti --quiet
-v, --verbose More details, can help debugging hti --verbose
--chrome_path Specify a different chrome path
--custom_flags Specify custom browser flags
--temp_path Specify a different temp path (where the files are loaded)


... now within a Docker container !

You can also test the package and the CLI without having to install everything on your local machine, via a Docker container.

Inside that container, the html2image package as well as chromium are installed.

You can load and execute a python script to use the package, or simply use the CLI.

On top of that, you can also use volumes to bind a container directory to your local machine directory, allowing you to retrieve the generated images, or even load some resources (HTML, CSS or Python files).

Testing

Only basic testing is available at the moment. To run tests, install the requirements (Pillow) and run PyTest at the root of the project:

pip install -r requirements-test.txt
python -m pytest

FAQ


If you see any typos or notice things that are oddly said, feel free to create an issue or a pull request.