{#markdown} for Qute - Githubissues

ia3andy commented 2 weeks ago

This resolves to writing two extensions in this repo to allow using {#markdown}##hello{/markdown} and {#asciidoc}==hello{/asciidoc}

There are libraries in Java for conversion.

Before: https://github.com/quarkiverse/quarkus-qute-web/issues/91#issuecomment-2186377726

How could we best allow user to provide markdown (or possibly asciidoc) templates with possibly frontmatter?

@mkouba how could we make this fit in the Quarkus/Qute context?

cc @maxandersen @ebullient

I asked AI to explain what multi-stage processing is in this context:

Multi-Stage Processing in Static Site Generation

Multi-stage processing in the context of static site generation (SSG) generally refers to a structured approach where the generation of the final static site output is broken down into several distinct phases or stages. Each stage performs specific tasks on the content and templates, transforming them step-by-step until the final HTML files are produced. Here’s how these stages typically play out:

Content Loading and Parsing:
- Markdown Parsing: Raw markdown files (often containing frontmatter) are loaded and parsed. Frontmatter, usually in YAML or TOML format, is extracted to provide metadata such as titles, dates, tags, and other custom fields.
- Frontmatter Extraction: The frontmatter is separated from the main content and processed. This metadata guides subsequent stages, like template selection and content organization.
Template Rendering:
- Content Insertion: Parsed markdown content is inserted into predefined templates. These templates define the structure of the pages and often include placeholders for dynamic content.
- Template Engines: Use of template engines (like Handlebars, Jinja, or Liquid) to inject content into the templates. These engines allow for logic such as loops, conditionals, and partials (reusable template snippets).
Static Asset Handling:
- CSS and JavaScript: Compilation and minification of stylesheets and scripts. This might involve tools like Sass for CSS pre-processing or Webpack for bundling JavaScript.
- Image Optimization: Optimization of images to reduce file sizes and improve loading times, often using tools like ImageMagick or specific SSG plugins.
Content Transformation:
- Link Processing: Relative links within the markdown are converted to appropriate URLs. This includes handling internal links between pages.
- Content Enhancements: Additional processing like embedding code snippets, generating table of contents, or adding syntax highlighting for code blocks.
Output Generation:
- Static File Generation: The final HTML files are generated, combining the parsed markdown content with the templates and any additional transformations applied.
- Deployment Preparation: Organizing the output into a directory structure suitable for deployment to a static site hosting service (like Netlify, GitHub Pages, or Vercel).
Build Optimization (Optional):
- Caching: Implementing caching strategies to avoid reprocessing unchanged content.
- Incremental Builds: Building only the parts of the site that have changed since the last build to save time and resources.
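
As an illustration of the caching and incremental build idea above, here is a minimal sketch in plain Java. The `BuildCache` class and its method names are hypothetical and are not part of Quarkus or Qute; it only remembers a content hash per source path so that unchanged files can be skipped.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper (not a Quarkus or Qute API): keeps one content hash per
// source path so that unchanged files can be skipped on the next build.
final class BuildCache {
    private final Map<String, String> hashes = new HashMap<>();

    // Returns true when the content is new or has changed, and records the new hash.
    boolean needsRebuild(String path, String content) {
        String hash = sha256(content);
        String previous = hashes.put(path, hash);
        return !hash.equals(previous);
    }

    // Hex-encoded SHA-256 of the given text.
    static String sha256(String content) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest(content.getBytes(StandardCharsets.UTF_8))) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 must be supported by every Java platform implementation.
            throw new IllegalStateException(e);
        }
    }
}
```

A build loop would call `needsRebuild(path, content)` for each source file and only re-render the files for which it returns true.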

Example Workflow

Stage 1: Load and parse markdown files. Extract and process frontmatter.
Stage 2: Render content using a template engine. This might involve loading layouts and injecting content.
Stage 3: ~~Process static assets like CSS and JavaScript~~ => Web Bundler
Stage 4: Perform content transformations such as link rewriting and adding enhancements.
Stage 5: Generate final HTML output and prepare for deployment.

Each stage builds on the previous one, ensuring that by the end of the process, you have a fully-rendered, optimized static site ready for deployment.
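
The workflow above can be sketched as a list of stages applied in order. This is only an illustration in plain Java: `SitePage`, `StagePipeline` and the three toy stages are hypothetical names, and the stages are stand-ins that do not call Qute or any markdown library.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.UnaryOperator;

// Hypothetical page model: body text plus metadata shared between stages.
final class SitePage {
    final String body;
    final Map<String, String> meta;

    SitePage(String body, Map<String, String> meta) {
        this.body = body;
        this.meta = meta;
    }
}

final class StagePipeline {
    // Stage 1 stand-in: move a leading "title: ..." line into the metadata.
    static SitePage readTitle(SitePage page) {
        int newline = page.body.indexOf('\n');
        if (!page.body.startsWith("title: ") || newline < 0) {
            return page;
        }
        Map<String, String> meta = new HashMap<>(page.meta);
        meta.put("title", page.body.substring("title: ".length(), newline).trim());
        return new SitePage(page.body.substring(newline + 1), meta);
    }

    // Stage 2 stand-in: wrap the body in a fixed layout (a real build would use a template engine).
    static SitePage applyLayout(SitePage page) {
        String title = page.meta.getOrDefault("title", "Untitled");
        return new SitePage("<h1>" + title + "</h1>\n<main>" + page.body + "</main>", page.meta);
    }

    // Stage 4 stand-in: rewrite links to .md sources so they point at generated .html files.
    static SitePage rewriteLinks(SitePage page) {
        return new SitePage(page.body.replace(".md\"", ".html\""), page.meta);
    }

    // Runs the stages in order; each stage receives the previous stage's output.
    static SitePage run(SitePage page, List<UnaryOperator<SitePage>> stages) {
        SitePage current = page;
        for (UnaryOperator<SitePage> stage : stages) {
            current = stage.apply(current);
        }
        return current;
    }
}
```

Because every stage has the same shape (page in, page out), adding, removing or reordering a stage only changes the list passed to `run`.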

Benefits of Multi-Stage Processing

Modularity: Each stage can be developed, tested, and maintained independently.
Scalability: It allows for more complex build processes as the site grows, accommodating more content and features without becoming unwieldy.
Efficiency: Incremental builds and caching can significantly speed up the generation process for large sites.
Customization: Different stages can be customized or extended based on specific project needs, such as adding new transformation steps or integrating additional tools.

By breaking down the site generation process into manageable stages, multi-stage processing provides a clear framework for building static sites efficiently and effectively.

ia3andy commented 2 weeks ago

We have most of the different pieces, it's about finding the right way to make them work together with the right amount of coupling with Qute (or not)

ia3andy commented 2 weeks ago

The outcome the way I see it would be to have:

a new directory src/main/resources/site containing html and markdown files => Serve them with a given layout from the Qute templates on a route configured through FM and/or Quarkus config

To do it, we need a way to have a processing stack on top of Qute which:

process the text, use/edit context (data, Qute resources, ...), return a new text with data

Then a final step which takes the output of the stack and creates a Route out of it (or just makes it available for @Inject).

It seems the only coupling point with any template engine would be: access to the available templates, some way to enable type-safety

ebullient commented 1 week ago

I have one project that allows external injection of templates (i.e. the user can define the output format). Quarkus does (already) have some ability to specify a directory to serve static files from. Given the transformation plan, it would be nice to be able to transform content without having to put it in a Maven structure at all, i.e. we're building a CLI that will run against content on a file system, not within Maven (unless you want to change the behavior of the CLI itself).

Hugo is a little too opinionated, but if you take a loose example of some combination of existing tools, they have a source directory (site) that usually has some sub-directories with loose conventions: layout or _include, and some way of defining static (copy-as-is) vs. content-to-process (content, posts, assets). I think all of those should be external to the CLI source. If someone wants to add additional processors, they can do that (build their own CLI using these extensions and add custom processors), but the content to be munged should be outside of this structure (i.e. you could use a completely separate project, if you wanted, and use a built artifact or a container to transform your content)

ia3andy commented 1 week ago

I guess we could use the Quinoa way where we define a relative directory which doesn't have to be in the resources. ./site by default?

ia3andy commented 1 week ago

Here is the plan.

We need this:

Create qute-markdown and qute-asciidoc extensions (this issue); they would add new sections to Qute: {#markdown} and {#asciidoc}
Add the ability in Qute to generate templates (https://github.com/quarkusio/quarkus/issues/41386)

When https://github.com/quarkusio/quarkus/issues/41386 is done, we can start a new extension (in statiq repo) which:

scan for templates in site root (directories are declared in the config)
parse frontmatter and create a context from it
create a new generated qute template, which {#include the layout from frontmatter and add the relevant {#markdown} if necessary
serve the template instance at the URL specified by the context data and use the context data for rendering
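
The "parse frontmatter and create a context from it" step could look roughly like the following. This is a minimal sketch in plain Java, assuming the frontmatter is a block delimited by `---` lines at the top of the file; `ParsedSource` is a hypothetical name, and only flat `key: value` lines are handled (real YAML would need a proper parser).

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: splits a source file into frontmatter data and body.
// Only flat "key: value" lines are supported; this is not a YAML parser.
final class ParsedSource {
    final Map<String, String> data;
    final String body;

    private ParsedSource(Map<String, String> data, String body) {
        this.data = data;
        this.body = body;
    }

    static ParsedSource parse(String source) {
        String[] lines = source.split("\n", -1);
        Map<String, String> data = new LinkedHashMap<>();
        if (!lines[0].trim().equals("---")) {
            return new ParsedSource(data, source); // no frontmatter block
        }
        int i = 1;
        while (i < lines.length && !lines[i].trim().equals("---")) {
            int colon = lines[i].indexOf(':');
            if (colon > 0) {
                data.put(lines[i].substring(0, colon).trim(), lines[i].substring(colon + 1).trim());
            }
            i++;
        }
        if (i == lines.length) {
            // The block is never closed: treat the whole file as body.
            return new ParsedSource(new LinkedHashMap<>(), source);
        }
        String body = String.join("\n", Arrays.copyOfRange(lines, i + 1, lines.length));
        return new ParsedSource(data, body);
    }
}
```

The resulting `data` map would hold entries such as the layout name and the route, which the later steps could use to pick the layout and to register the URL.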

maxandersen commented 1 week ago

why the need for markdown and asciidoc sections?

I get it's useful, but normally SSGs are driven by the file extensions to know which renderer gets applied to the output of a certain template. I.e. I wouldn't even expect Qute to necessarily have to know about it, as at that time the template rendering is not even involved?

ia3andy commented 1 week ago

@maxandersen yes it's just internal logic to make it easier, here the frontmatter layout and the md extension would just be a shortcut to a normal qute template:

my-post.md:

layout: foo.html
bar: baz
---

## Hello {bar}

==


{#include foo.html}

{#markdown}
## Hello {bar}
{/markdown}
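
The shortcut above amounts to a small text transformation. Here is a sketch in plain Java that produces the generated template text in exactly the shape shown in this comment; `GeneratedTemplate` and `fromMarkdown` are hypothetical names, and the sketch makes no claim about how Qute itself parses the result.

```java
// Hypothetical generator mirroring the example above: it wraps a markdown body in
// the {#markdown} section from this thread and includes the layout named in the
// frontmatter. It only builds text and does not call Qute.
final class GeneratedTemplate {
    static String fromMarkdown(String layout, String markdownBody) {
        StringBuilder template = new StringBuilder();
        if (layout != null && !layout.isEmpty()) {
            template.append("{#include ").append(layout).append("}\n\n");
        }
        template.append("{#markdown}\n")
                .append(markdownBody.trim()).append('\n')
                .append("{/markdown}\n");
        return template.toString();
    }
}
```

Calling `GeneratedTemplate.fromMarkdown("foo.html", "## Hello {bar}")` yields the second listing above, with `{bar}` left in place for the template engine to resolve.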

ia3andy commented 1 week ago

it makes it very easy to deal with and typesafety and all is handled by Qute like any other template..

ia3andy commented 1 week ago

@mkouba in the new builditem, we might need a way to add some data or at least declare it if that's possible (for the frontmatter data)?

mcruzdev commented 1 week ago

I can help with this one

ia3andy commented 1 week ago

@maxandersen we could indeed decouple Qute from markdown, but that would create more coupling between the new tool and Qute, because it would mean accessing the templates (for the layout lookup), having some way to type check, then doing a partial rendering of the converted markdown and, in the end, somehow calling the layout with the content.

ebullient commented 1 week ago

I think you may want to have a look at what Lume does in terms of processing content through one or more engines:

They do this, as an example (replace vto with qute):

---
title: My post
templateEngine: [vto, md]
---

I would rather do something like this than embed markdown or asciidoc syntax in my templates. You can still have asciidoc/markdown rendering engines as build items, but you don't have to put those markers in your content (or your templates)

maxandersen commented 1 week ago

it makes it very easy to deal with and typesafety and all is handled by Qute like any other template..

thanks for the explanation. but "any other template" excludes build-time checking, does it not?

or will build time checking still be applied even if the actual template being rendered is dynamically generated?

maxandersen commented 1 week ago

I would rather do something like this than embed markdown or asciidoc syntax in my templates. You can still have asciidoc/markdown rendering engines as build items, but you don't have to put those markers in your content (or your templates)

I was assuming the example adding #asciidoc/#markdown is generated rather than something user put anywhere.

ia3andy commented 1 week ago

I think you may want to have a look at what Lume does in terms of processing content through one or more engines:

...

Not sure if we can keep type-safety if we do that @mkouba wdyt?

maxandersen commented 1 week ago

Not sure if we can keep type-safety if we do that @mkouba wdyt?

how is templateEngine better/worse than using the suffix of the file to know which renderer to use?

mkouba commented 1 week ago

@mkouba in the new builditem, we might need a way to add some data or at least declare it if that's possible (for the frontmatter data)?

You mean the new build item that will be introduced in https://github.com/quarkusio/quarkus/issues/41386?

No, that's not possible and it does not fit the current API. You can add param declarations to the template directly but no data can be attached to a template.

ia3andy commented 1 week ago

@ebullient you won't have to put it in your content, from this:

layout: foo.html
bar: baz
---

## Hello {bar}

we generate this (but you don't have to do it yourself)

{#include foo.html}

{#markdown}
## Hello {bar}
{/markdown}

mkouba commented 1 week ago

I think you may want to have a look at what Lume does in terms of processing content through one or more engines:

...

Not sure if we can keep type-safety if we do that @mkouba wdyt?

Sorry I don't have bandwidth to discover what Lume does... but if you have a simple example I can try to help.

ia3andy commented 1 week ago

If we do the processing steps way (what @ebullient suggested), then it means we have to be able to do this:

at build time:

check the types with qute on the markdown file

at runtime run the processing queue (qute, md):

render with qute
render with md

then render the layout with the content.
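
To make the runtime processing queue concrete, here is a sketch in plain Java of named engines applied in a given order. Everything in it is hypothetical: `EngineQueue` is not an existing API, and the "qute" and "md" engines are toy stand-ins (placeholder replacement and `## ` headings only) that call neither Qute nor a markdown library.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiFunction;

// Hypothetical registry of named engines. The two engines registered here are toy
// stand-ins and do not call Qute or any markdown library.
final class EngineQueue {
    private final Map<String, BiFunction<String, Map<String, String>, String>> engines = new LinkedHashMap<>();

    EngineQueue() {
        // "qute" stand-in: replaces {key} placeholders with values from the data map.
        engines.put("qute", (text, data) -> {
            String out = text;
            for (Map.Entry<String, String> entry : data.entrySet()) {
                out = out.replace("{" + entry.getKey() + "}", entry.getValue());
            }
            return out;
        });
        // "md" stand-in: turns "## title" lines into <h2> elements and nothing else.
        engines.put("md", (text, data) -> text.replaceAll("(?m)^## (.+)$", "<h2>$1</h2>"));
    }

    // Applies the named engines in the given order, feeding each output to the next engine.
    String render(String text, Map<String, String> data, List<String> order) {
        String current = text;
        for (String name : order) {
            BiFunction<String, Map<String, String>, String> engine = engines.get(name);
            if (engine == null) {
                throw new IllegalArgumentException("Unknown engine: " + name);
            }
            current = engine.apply(current, data);
        }
        return current;
    }
}
```

With the order `qute, md`, the input `## Hello {bar}` first has its placeholder resolved and is then converted to HTML; the layout rendering would follow as a separate step.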

@mkouba 👆

ia3andy commented 1 week ago

@maxandersen I suppose the processing is defaulted by the file extension and the default template engine

ia3andy commented 1 week ago

The solution I suggested (generated Qute templates) is less flexible but easier to implement.

Less flexible because the template engine will have to be Qute. But you could also have some kind of processing stack, dealt with as a stack of Qute sections.

ia3andy commented 1 week ago

I think we can go with the generated template way, we can always change later if that is needed, the original blog post wouldn't change.

ebullient commented 1 week ago

I meant, no {#markdown} or {#asciidoc} block tags.

*.md.html could imply conversion steps (progressive resolution/rendering of content).

Template engine in Lume allows you to change the processing order: markdown first, then template resolution, then html conversion; or markdown to html ahead of qute. etc.
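
One possible reading of the `*.md.html` idea, sketched in plain Java: every suffix after the base name is taken as one processing step, read left to right. The `FileNameSteps` class and this convention are assumptions made for illustration; the thread does not settle the exact semantics or the direction of the steps.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical convention: every suffix after the base name is one processing step,
// read left to right, so "post.md.html" yields [md, html].
final class FileNameSteps {
    static List<String> stepsFor(String fileName) {
        String[] parts = fileName.split("\\.");
        List<String> steps = new ArrayList<>();
        for (int i = 1; i < parts.length; i++) {
            if (!parts[i].isEmpty()) {
                steps.add(parts[i]);
            }
        }
        return steps;
    }
}
```

An explicit `templateEngine` entry in the frontmatter could then override the list derived from the file name when a different order is needed.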

ia3andy commented 1 week ago

I meant, no {#markdown} or {#asciidoc} block tags.

*.md.html could imply conversion steps (progressive resolution/rendering of content).

Template engine in Lume allows you to change the processing order: markdown first, then template resolution, then html conversion; or markdown to html ahead of qute. etc.

I think I've got your point, we will probably go this way after the MVP, for now we have an easy way forward, the api would be the same in the user project, so we can move to staged processing in the second phase.

In phase 1 (mvp), only the default inferred by the file name (md, asciidoc or html) will be possible and with Qute as the engine.

maxandersen commented 1 week ago

@ebullient the suggestion is NOT requiring the user to use those tags - its an implementation detail in this being able to be done with Qute.

ebullient commented 1 week ago

@ebullient the suggestion is NOT requiring the user to use those tags - its an implementation detail in this being able to be done with Qute.

I guess that is the part I don't understand. Why is that necessary at all? I don't get the relationship or understand why it is necessary.

In the case that I'm the most used to, I do a lot of content generation with Qute, but I suppose I'm treating a lot as strings, and I'm not relying on type safety very much.

FroMage commented 1 week ago

I agree that a pipeline-way of defining what to do with the file is more flexible, though TBH I'm not sure about use-cases for more than just "what template system to use" and optionally "what content filter to use (md/asciidoc)".

I also agree that {#markdown}...{/markdown} and {#asciidoc}...{/asciidoc} tags are useful, and I'll add that they should also define a String.markdown() and String.asciidoc() method extensions, because those are equally as useful.

mcruzdev commented 6 days ago

We can divide this issue in two: markdown (this one) and asciidoc.

[ ] - Markdown https://github.com/quarkiverse/quarkus-qute-web/pull/92
[ ] - Asciidoc

@ia3andy, could you rename this issue?

quarkiverse / quarkus-qute-web

{#markdown} for Qute #91
