mhaack / helix-importer

Foundation tools for importing website content so that it can be consumed in a Helix project.
Apache License 2.0

Enable building a node module to have direct access to `md2jcr` function #22

Open catalan-adobe opened 4 months ago

catalan-adobe commented 4 months ago

Is your feature request related to a problem? Please describe.

In the context of the ESaaS (Experience Success as a Service) initiatives, we (@karlpauls, @cziegeler and @catalan-adobe) are working on a demo tool that automates demo steps to help the field build EDs more easily and quickly. One of the features most requested early on was support for Crosswalk projects. Since one of the demo steps is importing the original content into an EDs project, for Crosswalk we need the JCR conversion features you are currently working on.

In the tool we have 2 types of import:

  1. Single page import: we orchestrate a modified version of the helix-importer-ui where users can extract content in a custom way without having to write a custom script.
  2. Bulk page import: we want to show customers how easily they can ingest their content into the EDs world. For this we use a CLI that runs a default import on a set of URLs (no helix-importer-ui involved).

Looking at https://github.com/mhaack/helix-importer-ui/tree/html2jcr and https://github.com/mhaack/helix-importer/tree/html2jcr, we concluded that the function we need to do a full conversion to JCR is md2jcr (https://github.com/mhaack/helix-importer/blob/html2jcr/src/importer/html2jcr/index.js#L35). (remark: on our side, we already know how to generate the Markdown content needed for the function)

The missing link is direct access to a Node-compatible version of md2jcr, which the project does not currently export.

Describe the solution you'd like

Add a mechanism to build a Node.js library that exports the function.
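To illustrate how we would consume such a library from Node, here is a minimal sketch. It is not the actual API: the helper names `outputPathFor` and `convertMarkdownFile` are hypothetical, the `.xml` output extension is only illustrative, and the assumed shape of `md2jcr` (a Markdown string plus an options object in, converted content as a string out) may differ from the real function. The converter is passed in as an argument so the sketch does not depend on any particular package name.

```javascript
const fs = require('node:fs/promises');
const path = require('node:path');

// Derive the output file path from the Markdown file name.
// ASSUMPTION: the ".xml" extension is illustrative only.
function outputPathFor(mdFile, outputFolder) {
  const name = path.basename(mdFile, path.extname(mdFile));
  return path.join(outputFolder, `${name}.xml`);
}

// Read a Markdown file, convert it, and write the result to outputFolder.
// ASSUMPTION: `md2jcr` takes a Markdown string plus an options object and
// returns (or resolves to) the converted content as a string. The real
// signature in helix-importer may differ.
async function convertMarkdownFile(md2jcr, mdFile, outputFolder, options = {}) {
  const markdown = await fs.readFile(mdFile, 'utf8');
  const converted = await md2jcr(markdown, options);
  await fs.mkdir(outputFolder, { recursive: true });
  const outFile = outputPathFor(mdFile, outputFolder);
  await fs.writeFile(outFile, converted, 'utf8');
  return outFile;
}

module.exports = { outputPathFor, convertMarkdownFile };
```

With a Node build in place, the CLI would pass in the function exported by whatever module the build produces, and no orchestrated browser would be needed for the conversion step.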

Describe alternatives you've considered

Use the existing helix-importer.js that is built by default in https://github.com/mhaack/helix-importer-ui/tree/html2jcr. This is not optimal, though, because:

  1. It builds a browser-compatible library, which would force us to run it in an orchestrated browser (Puppeteer).
  2. The library contains far more code than we need.

Implementation

This feature can be tested in https://github.com/catalan-adobe/aem-bulk-cli/tree/esaas-demo-tool-issue-73 (note the file vendors/helix-importer-md2jcr.js, which was built using the npm script proposed in the draft PR).

To try it out:

node index.js importer md2jcr --log-level silly --md-file <PATH_TO_A_MD_FILE> --components-path <PATH_TO_FOLDER_CONTAINING_COMPONENTS_FILE> --output-folder .