adobe / helix-importer-ui

Apache License 2.0

AEM Importer - UI

A collection of tools to support AEM project imports.

PLEASE DO READ THE Importer Guidelines before starting any import.

Usage

First, check that the AEM CLI is installed.

At the root of the project, simply run:

aem import

The import command clones the helix-importer-ui repo for you.
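The two steps above can be sketched as a small shell snippet. This is only an illustration: it assumes the CLI is exposed as an `aem` executable on your PATH, and the helper names `have_cmd` and `start_import` are hypothetical, not part of the tool.

```shell
#!/bin/sh
# have_cmd: hypothetical helper; succeeds if the named executable is on PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

# start_import: hypothetical wrapper; runs the importer from the current
# directory (expected to be the project root) only if the AEM CLI is available.
# Any arguments are passed through to `aem import`.
start_import() {
  if have_cmd aem; then
    aem import "$@"
  else
    echo "aem CLI not found on PATH; install it before running the import" >&2
    return 1
  fi
}

# Usage (from the root of the project):
#   start_import
```

Wrapping the check in a function keeps the failure message explicit instead of relying on the shell's generic "command not found" error.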

Import

In the URL(s) field, enter the list of page URLs to import (e.g. {https://www.host_of_pages_to_be_imported.com/page_1.html}) and hit the import button. The page(s) are loaded in the central frame and the Markdown transformation happens in the right frame. The result of the transformation is saved as a Word document on your local file system (you are asked for a target folder, and the tool needs permission to write to it).

Options

Crawler

Lets you find URLs on a given host. 2 processes:

Eyedropper

Lets you extract the CSS styles (fonts, colors) of a given page. Those styles can be used with the https://github.com/adobe/helix-project-boilerplate project.

Cache

When aem import serves content, imported resources can be cached locally. After the first import, the files can then be served from the local file system. To enable the cache:

aem import --cache .cache/

In the .cache/ folder of the project, you will find all the files (HTML pages, JS, CSS, images...) requested from the remote host.
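To get a quick overview of what ended up in the cache, something like the following sketch may help. The helper name `cache_summary` is hypothetical, and it assumes cached files keep their original file extensions; it simply counts files per extension in the folder passed to `--cache`.

```shell
#!/bin/sh
# cache_summary: hypothetical helper that counts cached files per extension.
# Argument: the cache directory (defaults to .cache, as in the command above).
cache_summary() {
  dir="${1:-.cache}"
  if [ ! -d "$dir" ]; then
    echo "no cache directory at $dir" >&2
    return 1
  fi
  # Keep only files whose name contains a dot, reduce each path to its
  # extension, then print one "count extension" line, most frequent first.
  find "$dir" -type f -name '*.*' \
    | sed 's/.*\.//' \
    | sort \
    | uniq -c \
    | sort -rn
}

# Usage (after at least one import with the cache enabled):
#   cache_summary .cache
```

Files without an extension are ignored by this sketch, so the counts are a rough guide, not an exact inventory.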