lumeland / lume

🔥 Static site generator for Deno 🦕
https://lume.land
MIT License

Include file slug value in `Page.src` object #278

Closed paulrobertlloyd closed 1 year ago

paulrobertlloyd commented 1 year ago

Lume does some calculations behind the scenes to determine information about the source of a generated page. For example, given a file saved at /articles/2022-08-29_porto.md, Lume provides the following data in Page.src:

{
  path: "/articles/2022-08-29_porto",
  lastModified: 2022-10-01T17:46:49.762Z,
  created: 2022-10-01T17:16:39.480Z,
  remote: undefined,
  ext: ".md"
}

Lume will also add a page.data.date value if the base file name begins with YYYY-MM-DD_ (so for the above this would be calculated to be 2022-08-29T00:00:00.000Z).

However, there is one other piece of information that may be useful to site authors: the file slug. In the case of the above, this value would be porto.

This value could be used in url() functions:

export function url(page) {
  return `./example/${page.src.fileSlug}/`;
}

It could also be used by page (pre)processors. This is my use case, where I am currently having to calculate this value myself to generate a new URL.

(Eleventy provides a similar fileSlug value).

oscarotero commented 1 year ago

You can get the file path without the date in page.dest.path.

To set a URL based on the slug value, you can do this:

import { basename } from "lume/deps/path.ts";

export function url(page) {
  const slug = basename(page.dest.path);
  return `./example/${slug}/`;
}

paulrobertlloyd commented 1 year ago

Hmmm, seems like this doesn’t work in a processor where page.updateDest is being used, as the value returned has already been transformed.

This is what I’m trying to do. I want to change the URL of posts to use the following format:

/{:year}/{:dayofYear}/{:type_prefix}{:post_index}[/{:slug}]

If a post has a title, the source file slug will appear at the end of the URL. So, for example:

Article  /2022/123/a1/my-great-post
Note     /2022/123/n1
         /2022/123/n2
         /2022/123/n3
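
As a rough illustration of the format above, here is a minimal sketch in plain JavaScript. The helper names `dayOfYearUtc` and `buildPostPath` and their parameters are hypothetical (they are not part of Lume), and dates are assumed to be interpreted in UTC.

```javascript
// Day of the year (1-366) for a Date, computed in UTC to avoid timezone drift.
function dayOfYearUtc(date) {
  const startOfYear = Date.UTC(date.getUTCFullYear(), 0, 1);
  const startOfDay = Date.UTC(
    date.getUTCFullYear(),
    date.getUTCMonth(),
    date.getUTCDate(),
  );
  return Math.floor((startOfDay - startOfYear) / 86400000) + 1;
}

// Hypothetical helper: builds /{year}/{dayOfYear}/{typePrefix}{postIndex}[/{slug}].
function buildPostPath({ date, typePrefix, postIndex, slug }) {
  const year = String(date.getUTCFullYear());
  const day = String(dayOfYearUtc(date)).padStart(3, "0");
  const base = `/${year}/${day}/${typePrefix}${postIndex}`;
  // The slug segment is optional: untitled posts (notes) omit it.
  return slug ? `${base}/${slug}` : base;
}

// 3 May 2022 is day 123 of the year.
const articlePath = buildPostPath({
  date: new Date("2022-05-03T12:00:00Z"),
  typePrefix: "a",
  postIndex: 1,
  slug: "my-great-post",
});
console.log(articlePath); // "/2022/123/a1/my-great-post"
```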

Because I need to calculate how many posts of the same type have been published on a given day of the year, I have a preprocessor that gets that number, then uses page.updateDest() to update the page destination with the desired URL format:

import { DateToSxg } from "npm:newbase60";
import { dayOfYear, format } from "https://deno.land/std/datetime/mod.ts";
import { posix } from "https://deno.land/std/path/mod.ts";

import { fileSlug, isSameDay, sortPages } from "./utils.js";

export async function uid(page, pages) {
  let type_index;
  let prev;

  pages.sort((p1, p2) => sortPages(p1, p2)).forEach((page) => {
    let { date, layout, title, type_prefix, type } = page.data;

    // Only update destination URL if page is a post
    if (layout === "index.tmpl.js" || layout === undefined || !type) {
      return;
    }

    // Get number of times this post type has been published on same day
    type_index = isSameDay(prev, date) ? type_index + 1 : 1;
    prev = date;

    // Calculate values for year and day of the year (YYYY-DDD)
    const year = format(date, "yyyy");
    const day = String(dayOfYear(date)).padStart(3, "0");

    // If post has a title, use provided slug or that derived from file name
    const slug = title ? `${fileSlug(page)}/index` : "index";

    // Update destination to use NewBase60 URL
    page.updateDest({
      path: `/${year}/${day}/${type_prefix}${type_index}/${slug}`,
      ext: ".html",
    });

    // Add post ID to page data
    page.data.id = `${type_prefix}${DateToSxg(date)}${type_index}`;
  });
}

Using the suggestion above, I am unable to get the original slug as it’s been updated by this loop. Therefore it’d be useful to have a src.fileSlug value that can’t be modified in this way.

(I’ve also tried to use a pre-processor to only generate the type_index value, then use this value in url() functions for page data, but for some reason this value isn’t passed through, possibly as it’s provided too late? Suggestions welcome!)
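
The counting step can be looked at in isolation from Lume. The sketch below is one possible approach, not the code from utils.js (which is not shown in this thread): `isSameUtcDay` and `indexPostsByDay` are hypothetical names, posts are plain objects with `type` and `date` fields, and days are compared in UTC. Unlike the loop above, which keeps a single running counter, this version keeps one counter per post type; that is an assumption about the intended behaviour.

```javascript
// True when both values are Dates falling on the same UTC calendar day.
function isSameUtcDay(a, b) {
  return a instanceof Date && b instanceof Date &&
    a.getUTCFullYear() === b.getUTCFullYear() &&
    a.getUTCMonth() === b.getUTCMonth() &&
    a.getUTCDate() === b.getUTCDate();
}

// Hypothetical helper: maps each post to its 1-based index among posts of
// the same type published on the same day, in chronological order.
function indexPostsByDay(posts) {
  // Copy before sorting so the caller's array is left untouched.
  const sorted = [...posts].sort((p1, p2) => p1.date - p2.date);
  const counters = new Map(); // type -> { day, count }
  const result = new Map(); // post -> index
  for (const post of sorted) {
    const counter = counters.get(post.type);
    const count = counter && isSameUtcDay(counter.day, post.date)
      ? counter.count + 1
      : 1;
    counters.set(post.type, { day: post.date, count });
    result.set(post, count);
  }
  return result;
}

// Example data (made up for illustration).
const note1 = { type: "note", date: new Date("2022-05-03T08:00:00Z") };
const article1 = { type: "article", date: new Date("2022-05-03T09:00:00Z") };
const note2 = { type: "note", date: new Date("2022-05-03T10:00:00Z") };
const note3 = { type: "note", date: new Date("2022-05-04T10:00:00Z") };
const indexes = indexPostsByDay([note2, note3, article1, note1]);
```

With this data, the two notes from 3 May get indexes 1 and 2, the article gets 1, and the note from 4 May starts again at 1.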

oscarotero commented 1 year ago

> (I’ve also tried to use a pre-processor to only generate the type_index value, then use this value in url() functions for page data, but for some reason this value isn’t passed through, possibly as it’s provided too late? Suggestions welcome!)

The url() functions are executed before the preprocessors.

A solution could be saving the slug property the first time it's calculated:

if (!page.data.slug) {
  page.data.slug = basename(page.dest.path);
}

If you don't want to expose this value to the layouts, there's a page._data property, used by some plugins to store private data in the page.

paulrobertlloyd commented 1 year ago

> The url() functions are executed before the preprocessors.

Ah, of course. So to be clear, the only way to build a URL that includes a post count is using a preprocessor and using page.updateDest(), as I described above?

> A solution could be saving the slug property the first time it's calculated

Where would you do this? Any attempt to derive a slug from the original source file falls foul of the issue described above; I can’t seem to get a value for the original file slug if I’m using page.updateDest().

oscarotero commented 1 year ago

> So to be clear, the only way to build a URL that includes a post count is using a preprocessor and using page.updateDest(), as I described above?

Yes. I'd like to have a more elegant solution (maybe editing the url parameter), but for now this is the best way. Any ideas about this are very welcome.

> Where would you do this?

This value could be calculated in a different preprocessor, run before your code:

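import { basename } from "lume/deps/path.ts";

// Register this before the preprocessor that calls page.updateDest()
site.preprocess([".html"], (page) => {
  // Only set the slug once, so later changes to page.dest.path don't affect it
  if (!page.data.slug) {
    page.data.slug = basename(page.dest.path);
  }
});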

This stores the slug in a variable that you can use in later preprocessors. It's calculated the first time, before page.dest.path changes, and stored in page.data.slug. The if statement avoids recalculating it (until the page changes, if you're using --watch or --serve).

paulrobertlloyd commented 1 year ago

I’m still not sure this helps, and I think I know why. Because the URL is calculated before any preprocessor runs, and because, with pretty URLs enabled, base names are considered to be index, the destination value is no longer useful in this regard. For example:

page.src.path  = /articles/2022-03-10_my_example_post.md
basename(page.src.path) = 2022-03-10_my_example_post

page.dest.path = /articles/my_example_post/index.html
basename(page.dest.path) = index

So there’s currently no easy way to get the original file slug without applying a regex to page.src.path to remove any dates. The same is also true for page generators, whose base name includes the page index (e.g. articles[2]).
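
For reference, the regex approach mentioned above might look something like the following sketch. The helper name `fileSlug` is hypothetical, it only handles a leading YYYY-MM-DD_ prefix, and it assumes that any trailing `.something` is a file extension; the page index added by generators (such as articles[2]) is not handled.

```javascript
// Hypothetical helper: derive a slug from a source path such as
// "/articles/2022-03-10_my_example_post.md".
function fileSlug(srcPath) {
  // Keep only the last path segment (the base name).
  const base = srcPath.slice(srcPath.lastIndexOf("/") + 1);
  return base
    .replace(/\.[^.]+$/, "") // drop a trailing extension, if present
    .replace(/^\d{4}-\d{2}-\d{2}_/, ""); // drop a leading YYYY-MM-DD_ date
}

console.log(fileSlug("/articles/2022-03-10_my_example_post.md")); // "my_example_post"
console.log(fileSlug("/articles/2022-08-29_porto")); // "porto"
```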

oscarotero commented 1 year ago

OK. I'll think of a way to include the slug in the Page instance (I'm not sure if it should be part of page.src or page.data). Meanwhile, you could do something like this:

import { basename, dirname } from "lume/deps/path.ts";

site.preprocess([".html"], (page) => {
  if (!page.data.slug) {
    let slug = basename(page.dest.path);

    if (slug === "index") {
      slug = basename(dirname(page.dest.path));
    }
    page.data.slug = slug;
  }
});
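
To see how the fallback behaves outside of Lume, here is a small standalone sketch. The `basename` and `dirname` functions are simplified stand-ins for the ones in a path module, `slugFromDest` is a hypothetical name, and destination paths are assumed to be "/"-separated with no file extension.

```javascript
// Simplified stand-ins for basename() and dirname() on "/"-separated paths.
function basename(path) {
  return path.slice(path.lastIndexOf("/") + 1);
}

function dirname(path) {
  const i = path.lastIndexOf("/");
  return i <= 0 ? "/" : path.slice(0, i);
}

// Slug from a destination path, falling back to the parent directory name
// when pretty URLs turn the base name into "index".
function slugFromDest(destPath) {
  const base = basename(destPath);
  return base === "index" ? basename(dirname(destPath)) : base;
}

console.log(slugFromDest("/articles/my_example_post/index")); // "my_example_post"
console.log(slugFromDest("/articles/porto")); // "porto"
```

Note that for the site root ("/index") this sketch returns an empty string, since there is no parent directory name to fall back to.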

paulrobertlloyd commented 1 year ago

Thanks for this, that works for now (bit better than my regex!)

I do think this value should be exposed somewhere; after all, it’s a user-defined piece of data, much like the date in the file name.

I wouldn’t mind this value being provided as a default value for page.data.slug, or even page.src.path, but you have a better idea of the overall architecture and project direction.

oscarotero commented 1 year ago

This will be available in Lume 1.13.0. Closing this.