withastro / roadmap

Images are not collected in Markdown files when using HTML syntax #872

Closed ArmandPhilippot closed 5 months ago

ArmandPhilippot commented 5 months ago

Astro Info

Astro                    v4.5.9
Node                     v18.17.1
System                   Linux (x64)
Package Manager          npm
Output                   static
Adapter                  none
Integrations             none

If this issue only occurs in one browser, which browser is a problem?

No response

Describe the Bug

The issue

Astro seems to support image co-location in content collections only when Markdown syntax is used. When using HTML syntax (i.e. an img tag), the image is not processed and the request for it results in a 404.

I think the issue comes from those lines in packages/markdown/remark/src/remark-collect-images.ts:

visit(tree, ['image', 'imageReference'], (node: Image | ImageReference) => {
    if (node.type === 'image') {
        if (shouldOptimizeImage(node.url)) imagePaths.add(node.url);
    }
    if (node.type === 'imageReference') {
        const imageDefinition = definition(node.identifier);
        if (imageDefinition) {
            if (shouldOptimizeImage(imageDefinition.url)) imagePaths.add(imageDefinition.url);
        }
    }
});

When using HTML syntax, the node type is html, so the visitor above never sees the image and Astro does not collect it. I guess we need to parse the HTML with a regex to collect the src attribute of every image found in node.value.
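To illustrate the regex idea, here is a minimal sketch of extracting src values from the raw string of an html node. The helper name collectHtmlImageSrcs is hypothetical and is not part of Astro. It does not filter remote URLs the way shouldOptimizeImage presumably does, and a regex will miss edge cases such as unquoted attribute values or a ">" inside an attribute.

```typescript
// Hypothetical helper (not Astro's implementation): returns the `src` value of
// every <img> tag found in a raw HTML string such as an mdast `html` node value.
function collectHtmlImageSrcs(html: string): string[] {
  // Require whitespace before `src` so attributes like `data-src` are skipped.
  // Handles double- and single-quoted values only.
  const imgSrcPattern = /<img\b[^>]*?\ssrc\s*=\s*(?:"([^"]*)"|'([^']*)')/gi;
  const sources: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = imgSrcPattern.exec(html)) !== null) {
    const src = match[1] ?? match[2];
    if (src) sources.push(src);
  }
  return sources;
}
```

Inside the remark plugin, the returned paths could then be passed through the same shouldOptimizeImage check and added to imagePaths, assuming the html node is visited alongside image and imageReference.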

Steps to reproduce

In addition to the Stackblitz link below, here are the steps to reproduce the bug:

  1. Create a new project: npm create astro@latest -- --template minimal (answer Yes to all and Strict to TypeScript)
  2. Create a folder to store our images and another for our posts in src/content: mkdir -p src/content/{assets,blog}
  3. Put an image in src/content/assets
  4. Create a Markdown file: touch src/content/blog/first-post.md
  5. Add the image to that file using both syntaxes (Markdown and HTML):

    ![a working image](../assets/image.png)

    <img src="../assets/image.png" alt="a failing image" />

  6. Edit `src/pages/index.astro` to load our post:
```astro
---
import { getEntry } from "astro:content";

const entry = await getEntry("blog", "first-post");
const { Content } = await entry.render();
---

<html lang="en">
  <head>
    <meta charset="utf-8" />
    <link rel="icon" type="image/svg+xml" href="/favicon.svg" />
    <meta name="viewport" content="width=device-width" />
    <meta name="generator" content={Astro.generator} />
    <title>Astro</title>
  </head>
  <body>
    <h1>{entry.data.title}</h1>
    <Content />
  </body>
</html>
```
  7. Launch the server: npm run dev
  8. Open the website in a browser and see the difference between the two images (you can also see a 404 in the terminal).

Additional context

I use a Git submodule to separate the content directory from the website code so I need to colocate the images with Markdown files. In addition, sometimes I want to wrap the images in a figure to add a figcaption so I need to write HTML inside the Markdown file.

What's the expected result?

I should be able to use both ![alt text](./path/to/an/img.jpg) and <img alt="alt text" src="./path/to/an/img.jpg" /> to render an image in a Markdown file.

Link to Minimal Reproducible Example

https://stackblitz.com/edit/astro-markdown-images?file=src%2Fcontent%2Fblog%2Ffirst-post.md

Princesseuh commented 5 months ago

At this time this is intentional. If you want to use a figure, you still can by putting newlines before and after:

<figure>

![]()

</figure>

This should work.

Moving this to the roadmap repo, since this is more of a feature request than a bug.