cobalt-org / cobalt.rs

Static site generator written in Rust
cobalt-org.github.io/
Apache License 2.0

[RFC] make tags first citizen of frontmatter #550

Closed Geobert closed 5 years ago

Geobert commented 5 years ago

Tags definition

Current situation

At the moment, tags live under data, so they are entirely user-defined and no rules apply to them. To land https://github.com/cobalt-org/cobalt.rs/pull/548, we need to make tags first-class citizens of the frontmatter (like categories).

Long term, we want taxonomies (see #553) but we feel that this subset of taxonomies can be introduced without forcing breaking changes later.

Proposal

Declaration

Tags will be populated in the frontmatter as an array of strings:

---
tags: [tag1, tag2]
---

Liquid Variable Access

Tags can be used in the page object like page.tags.

If a page does not have tags, then tags will not be defined on page. This is in contrast to most of Cobalt's design, which tries to avoid making people check whether something exists.

Permalink

Unlike categories, tags is not an available variable in permalink. This is from the following assumptions

Basic Example Usage

---
tags: [tag1, tag2]
---
{% if page.tags %} 
  {% for tag in page.tags %}
    {{ tag }}
  {% endfor %}
{% endif %}

Pagination

We will support having a top-level page paginated over the list of tags. Each tag will then have a page that is paginated over the pages marked with that tag.
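The two levels described above (a listing of tags, then per-tag listings of pages) can be sketched as a grouping step followed by a chunking step. This is only an illustration under stated assumptions: `Page`, `group_by_tag`, and `paginate` are hypothetical names, not Cobalt's actual types or functions.

```rust
use std::collections::BTreeMap;

/// Hypothetical, simplified stand-in for a Cobalt page.
#[derive(Debug, Clone, PartialEq)]
pub struct Page {
    pub title: String,
    /// `None` when the frontmatter does not define `tags`.
    pub tags: Option<Vec<String>>,
}

/// Group page titles by tag. Pages without tags end up in no group.
/// A `BTreeMap` keeps the tags sorted, which gives a stable listing order.
pub fn group_by_tag(pages: &[Page]) -> BTreeMap<String, Vec<String>> {
    let mut groups: BTreeMap<String, Vec<String>> = BTreeMap::new();
    for page in pages {
        for tag in page.tags.iter().flatten() {
            groups.entry(tag.clone()).or_default().push(page.title.clone());
        }
    }
    groups
}

/// Split one tag's pages into pagination pages of at most `per_page` items.
pub fn paginate(titles: &[String], per_page: usize) -> Vec<Vec<String>> {
    // `chunks` panics on a size of zero, so clamp to at least one item.
    titles.chunks(per_page.max(1)).map(|c| c.to_vec()).collect()
}
```

The keys of the map would feed the top-level tag listing, and each value, once chunked, would feed that tag's own paginated pages.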

To do this, we are extending #395 in the following ways:

Activation

We are extending #395's frontmatter configuration to add a Tags option to include:

---
pagination: 
  include: Tags
---

Permalink

The pagination-specific permalink will gain the following variables

This will allow us to have

Open Questions:

Paginator

The paginator variable will be extended with:

# paginator to each tag
# - Array of ?? when listing tags
# - nil when listing pages
paginator.indexes
# the current tag 
paginator.index

Open Questions

Example page:

---
pagination:
  include: Tags
  per_page: 10
---
{% if paginator.indexes %}
<ul>
  {% for ptag in paginator.indexes %}
  <li><a href="{{ site.base_url }}/{{ ptag.index_permalink }}/">{{ ptag.index_title }} ({{ ptag.total_pages }})</a></li>
  {% endfor %}
</ul>
{% else %}
<ul>
  {% for page in paginator.pages %}
  <li><a href="{{ site.base_url }}/{{ page.permalink }}">{{ page.title }}</a></li>
  {% endfor %}
</ul>
<div class="pagination">
  {% if paginator.previous_index %}
  <span class="left arrow">
    <a href="{{ site.base_url }}/{{ paginator.previous_index_permalink }}">Previous page</a>
  </span>
  {% endif %}
  {% if paginator.next_index %}
  <span >
    <a href="{{ site.base_url }}/{{ paginator.next_index_permalink }}">Next page</a>
  </span>
  {% endif %}

  <div style="width: 100%; text-align: center;">{{ paginator.index }} / {{ paginator.total_indexes }}</div>
</div>
{% endif %}

Geobert commented 5 years ago

It's quite short, but I can't think of anything else to say about tags

epage commented 5 years ago

Some things to include

epage commented 5 years ago

btw thanks for doing this!

Geobert commented 5 years ago

How tags interact with the rest of the SSG for other ones

What's a SSG?

An explicit decision to not include tags in regular permalinks

Added precision

How do we handle untagged items?

Like today if data.tags is not defined. I've added the precision.

Possibly move any of the Index RFC things related to tags here. That'll help keep that RFC smaller since it is sort-of in at this point and we're looking at adding tags generally.

Good point, it's done.

epage commented 5 years ago

What's a SSG?

Static Site Generator

epage commented 5 years ago

Sorry, I edited it away before realizing it was different. I think you had something in the RFC about if a tag is not provided, it won't be present as a page.tags? Is there a reason for that rather than it being an empty array?

Geobert commented 5 years ago

Oh right, I forgot the pagination with no tag, you're right :) I'll fix this after dinner ^^

epage commented 5 years ago

I was referring to non-paginator stuff. I still need to pore over the paginator section.

Geobert commented 5 years ago

added

A post without a tag will be stored under a system defined tag "_none". // TODO: to be discussed

It's how the PR is coded today, but we need to discuss this point

EDIT: added the files layout on the drive

Geobert commented 5 years ago

I think you had something in the RFC about if a tag is not provided, it won't be present as a page.tags? Is there a reason for that rather than it being an empty array?

I wanted to keep the same test as today in liquid template:

{% if page.tags %}

I don't even know how to test emptiness of an array in liquid ^^'

epage commented 5 years ago

I wanted to keep the same test as today in liquid template:

{% if page.tags %}

I don't even know how to test emptiness of an array in liquid ^^'

You're right. I thought an empty array is true but it isn't. I keep thinking of python's truthiness rules rather than ruby's which feel much more intuitive to me.

It's annoying to check for an empty array right now, but I'm slowly working towards https://github.com/cobalt-org/liquid-rust/issues/136 which would allow

{% if page.tags.size != 0 %}

ps I must need more sleep. I got confused between these discussions and some liquid discussions and thought this belonged under liquid. Going to move it back.

epage commented 5 years ago

Besides how trivial the common case is to test for, another consideration is if you just want to iterate on the tags without bothering with an if. I don't know if nil or non-existent values should be iterable.

Geobert commented 5 years ago

I'll start the work this weekend to make tags first-class citizens; the pagination aspect is handled in https://github.com/cobalt-org/cobalt.rs/pull/548 anyway :)

epage commented 5 years ago

Should we consider taxonomy support?

In Hugo / gutenberg, they have taxonomies. It took me until this read-through to understand their role. They basically let you have multiple categories of tags. For a movie, you might tag it by what the movie is like, the genre, the producer, the leads, etc. Instead of having a single tags field for all of this, you can define your own.

Hugo has tags and categories as built in. At least in Cobalt / Hugo, categories are different than tags. Categories express hierarchy while tags are all on the same footing. We also have some built-in knowledge of categories (e.g. auto-assigned based on path)

  1. Should we have taxonomies?
  2. Should we only support tag-like taxonomies, or both tag and category taxonomies?
  3. Do we start with built-in tags and add taxonomies later or do it all now?
  4. If taxonomies can handle categories, do we convert to that now or later?

Geobert commented 5 years ago

I don't know if we should have taxonomies. Personally I won't use them. This RFC was just about moving tags out of data so I can finish the PR https://github.com/cobalt-org/cobalt.rs/pull/548 :)

So I would say, first, make tags first citizen, then if someone wants to add taxonomies, another RFC should be written for that :)

epage commented 5 years ago

So I would say, first, make tags first citizen, then if someone wants to add taxonomies, another RFC should be written for that :)

My concern is: if we add taxonomies later, will we break people? We can make the "tags" taxonomy on by default but what about the frontmatter format?

Hmm, while this will complicate the frontmatter code and require some more interesting documentation, maybe we should have "tags" always at the root. This is part of the related design in Cobalt, where by default it looks like a simple blog site but you can opt in to more advanced features (rather than forcing the more advanced features into people's faces).

Geobert commented 5 years ago

As I understand the taxonomy feature, it lets you define a taxonomy, say tags, in the configuration file, and then use it in the frontmatter of the post.

If we add this later, we may provide default taxonomies if none is defined for backward compatibility (tags and categories for the current state of our frontmatter).

epage commented 5 years ago

If we add this later, we may provide default taxonomies if none is defined for backward compatibility (tags and categories for the current state of our frontmatter).

My concern was with the frontmatter definition.

If we have a taxonomy, "authors", we probably would do it as

config:

taxonomies: ["authors"]

frontmatter

---
taxonomies:
  authors: ["Foo", "Bar"]
---
Page!

This is in contrast to "tags" and "categories" being in the root. I don't want general purpose taxonomies in the root because we could add a new field and conflict with the user, breaking their site.

But as I said, to scale the site up, it might be reasonable to either have "tags" always be in the root:

---
tags: ["yellow", "green"]
taxonomies:
  authors: ["Foo", "Bar"]
---
Page!

or that "tags" is a shortcut where

---
tags: ["yellow", "green"]
---
Page!

is equivalent to

---
taxonomies:
  tags: ["yellow", "green"]
---
Page!

We also have to deal with the liquid variables. Initially, we'd have page.tags but later page.taxonomies.authors. So in this later scheme, where do tags show up? Both? Do we break the site and move it?

Again, I like the idea of being simple on the surface and scaling to more complex needs, so I'd lean towards both.
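The "shortcut" reading above, where root-level tags is equivalent to taxonomies.tags, could look roughly like this. It is a sketch only: the `Frontmatter` struct and `normalize` function are hypothetical and do not reflect Cobalt's real frontmatter types.

```rust
use std::collections::BTreeMap;

/// Hypothetical frontmatter shape: `tags` at the root plus named taxonomies.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct Frontmatter {
    pub tags: Option<Vec<String>>,
    pub taxonomies: BTreeMap<String, Vec<String>>,
}

/// Treat root-level `tags` as a shortcut for `taxonomies.tags`.
/// Returns the terms keyed by taxonomy name.
pub fn normalize(front: &Frontmatter) -> BTreeMap<String, Vec<String>> {
    let mut all = front.taxonomies.clone();
    if let Some(tags) = &front.tags {
        // Merge rather than overwrite, in case both spellings are used.
        let entry = all.entry("tags".to_string()).or_default();
        for tag in tags {
            if !entry.contains(tag) {
                entry.push(tag.clone());
            }
        }
    }
    all
}
```

With this shape, both frontmatter spellings normalize to the same map, so templates could be served from one internal representation whichever variable names end up being exposed.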

Geobert commented 5 years ago

Oh stupid me, I was considering Zola's syntax with just a section [taxonomies] at the root, so I thought the user just needed to add that…

Another way would be to provide a tool to migrate (not necessarily in Cobalt's code, maybe a separate CLI project, or a subcommand as booyaa suggested on gitter)

epage commented 5 years ago

Wait, is this an old suggestion? I'm not seeing it?

For the last major breaking change, I did provide a command to migrate (and created a PR against all known repos; that was a lot of work). It's just a sticky thing figuring out how to migrate automatically. I was glad to jettison that code asap.

Geobert commented 5 years ago

Wait, is this an old suggestion? I'm not seeing it?

On gitter, booyaa talked about separate binaries to be called as subcommands: https://gitter.im/cobalt-org/cobalt.rs?at=5b87b6dfd8d36815e5ae27bf

And my comment was just an idea I just had :)

I was glad to jettison that code asap.

Yeah that's why any other migration code should be in a separate bin imo :)

Geobert commented 5 years ago

We forgot to discuss this:

A post without a tag will be stored under a system defined tag "_none". // TODO: to be discussed

I have a working implementation with this choice, but I don't know if it's the right choice to make, as you have concerns about it and you have a much better global view of Cobalt than I do :)

epage commented 5 years ago

Keeping in mind the decisions we made for tags and taxonomies, I believe we should ignore any page that has no tags. Now, the next question is what to do if tags: []. Let's start off ignoring these as well?

The workaround would be to set a default within the frontmatter. I guess tags: [] being ignored is a way to disable tags when they are set globally. Hmm, I guess tags: ~ would do that too.

I'd say document tags: [] as an open issue and ignore those pages for now.

Geobert commented 5 years ago

I don't know if ignoring is a good solution, or maybe at least output some warnings? I've asked about the matter here: https://github.com/cobalt-org/cobalt.rs/issues/395#issuecomment-433721009

And I got one answer from @berkus who prefers a meta-tag. As for me, I just want a way to know that some posts are without a tag (it may be a mistake to be fixed).

epage commented 5 years ago

When we were considering whether to make unspecified tags not exist, be nil, or [], the root of the decision was that when we support taxonomies, we might have items that use different taxonomies. Just because a taxonomy doesn't apply, we do not know whether it makes sense to make it not exist or be empty. Similarly, I feel, we do not know if it makes sense to have a catch-all or not. That policy lies with the user.

As I mentioned, there is a relatively easy workaround for the user. The user, in the config file, can set a default tag for all frontmatters.

Hmm, one option is if the user's global default frontmatter does not have a tag, we add a tag to it. This gets the best of both worlds.

My suggestion would be either ["Untagged"] or ["untagged"].

Hmm, this makes me realize that we have a discrepancy. We can't turn off tags by setting it to nil. The only way to turn off default tags is by setting it to [] and by doing this we now pass the if page.tags test. Maybe we should have the Builder always set tags to [].

Geobert commented 5 years ago

Good idea for setting ["untagged"] by default if no tags exists in config.

As for deactivating, can't we tell tags: nil apart from nothing? If not, we can still test for an empty array and, if so, not include tags in the page object.

epage commented 5 years ago

nil will map to None, and None is what we use for "unspecified", meaning "feel free to merge over it".

So your thought is the user specifies [] and we treat it as if no tags was added? That might work out.
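The rule being converged on here (None means "unspecified" and takes the default, [] is an explicit opt-out treated as no tags) can be written down in a few lines. This is a minimal sketch; `resolve_tags` is a hypothetical helper, not a function in Cobalt.

```rust
/// Merge a page's `tags` with the site-wide default.
///
/// - `None` means "unspecified", so the default is used.
/// - `Some(vec![])` is an explicit opt-out and is reported as "no tags".
pub fn resolve_tags(
    page: Option<Vec<String>>,
    default: Option<Vec<String>>,
) -> Option<Vec<String>> {
    // `or` keeps the page's own value whenever it is specified at all;
    // `filter` then drops an empty list so templates see no `tags`.
    page.or(default).filter(|tags| !tags.is_empty())
}
```

Under this rule the existing `{% if page.tags %}` test keeps working, because an empty list never reaches the page object.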

epage commented 5 years ago

I have tried to update the proposal with our above discussion and some of the things from #548

I've listed out some open questions.

Geobert commented 5 years ago

What about a default_tag field? That way the user can choose which default they want, or deactivate the default tag.

epage commented 5 years ago

Why have a default_tag field when we already have default frontmatters in _cobalt.yml where a default tag can be applied to the whole site or to a given collection?

Geobert commented 5 years ago

Oh, true. Forget that ^^'

Geobert commented 5 years ago

{{ index }}: For tags, this will be the current tag

If I understand, I need to get rid of index_title and use this already present field instead? I like the idea, just need confirmation on implementation :)

epage commented 5 years ago

If I understand, I need to get rid of index_title and use this already present field instead? I like the idea, just need confirmation on implementation :)

I basically renamed it because it isn't so much a title anymore.

As noted in the open questions, I dislike the overlap between index / indexes and page_index, next_page_index, etc.

Naming is hard.

Geobert commented 5 years ago

Oh, I didn't see that. Indeed, currently, index is used for the page number we are on. It was called page in my first version, conflicting with the already named page, which represents a "post" or a "page".

Maybe because English is not my native language, but I'm still lost on our terminology here. Especially for the word "index", which always represents a number in my mind: "index in an array".

Hence index_title, but it's not only a title. So we either rename the page number index to something else (index_number?) or we rename index_title to something other than index (index_value?)

Naming is hard.

Oh yeah… I spent days finding a name for my Android app…

epage commented 5 years ago

An index can also be a catalog or some other means of looking something up. For example, text books have an index in the book to look up topics.

index / indexes is to convey "what topic are we providing an index for". When we support published date, categories, etc, it will also effectively be a series of nested topics or, in web site terms, breadcrumbs.

Geobert commented 5 years ago

So maybe renaming current index to reflect the meaning "page number"?

epage commented 5 years ago

Either could be renamed if we could come up with better names.

Maybe page_index could be page_num? iirc you had the pages be 1-indexed. page_num I think makes sense in that case.
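To make the 1-indexed numbering concrete, here is a small sketch of how page numbers and their neighbours could be computed. The struct and function are hypothetical, the field names simply follow the page_num naming suggested above, and `Option` stands in for Liquid's nil.

```rust
/// Hypothetical subset of paginator fields, named after `page_num`.
#[derive(Debug, PartialEq)]
pub struct PageNums {
    pub page_num: usize,
    pub total_pages: usize,
    pub previous_page_num: Option<usize>,
    pub next_page_num: Option<usize>,
}

/// Build the numbers for every pagination page, counting from 1.
pub fn page_nums(total_items: usize, per_page: usize) -> Vec<PageNums> {
    let per_page = per_page.max(1);
    // Ceiling division; an empty collection yields no pages.
    let total_pages = (total_items + per_page - 1) / per_page;
    (1..=total_pages)
        .map(|n| PageNums {
            page_num: n,
            total_pages,
            previous_page_num: if n > 1 { Some(n - 1) } else { None },
            next_page_num: if n < total_pages { Some(n + 1) } else { None },
        })
        .collect()
}
```

Using `Option` for the neighbours means a template can test for a previous or next page directly, instead of comparing against a sentinel number.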

Geobert commented 5 years ago

Oh, I see the confusion, I forgot to update the RFC T_T current implementation after applying your comments gives us this Paginator:

pub struct Paginator {
    pub pages: Option<Vec<liquid::value::Value>>,
    pub indexes: Option<Vec<Paginator>>,
    pub index: usize, // <-- this is page number
    pub index_title: Option<liquid::value::Value>,
    pub index_permalink: String,
    pub previous_index: usize,
    pub previous_index_permalink: Option<String>,
    pub next_index: usize,
    pub next_index_permalink: Option<String>,
    pub first_index_permalink: String,
    pub last_index_permalink: String,
    pub total_indexes: usize,
    pub total_pages: usize,
}

Geobert commented 5 years ago

I agree on page_num

Geobert commented 5 years ago

Updated spec for paginator:

pages: The list of post objects that belong to this pagination page.
page_num: Number of the current pagination page.
total_pages: Total number of pages contained in this paginator.
previous_page_num: Number of the previous page in the pagination. Nil if no previous page is available.
next_page_num: Number of the next page in the pagination. Nil if there is no next page available.
per_page: Maximum number of posts or documents on each pagination page.

index: Current index.
indexes: All paginators available, one per index (used in cases like Tags, nil otherwise).
index_permalink: The relative URL path of the current pagination page.
total_indexes: Total number of pagination pages created.
previous_index_permalink: The relative URL of the previous page. Nil if no previous page is available.
next_index_permalink: The relative URL of the next page in the pagination. Nil if there is no next page available.
first_index_permalink: The relative URL of the first page in the pagination.
last_index_permalink: The relative URL of the last page in the pagination.

trails: The pagination trail structure
    before: 0
    after: 0

Geobert commented 5 years ago

How do we map our current permalink and pagination.permalink to this?

I don't understand the question here :-/

epage commented 5 years ago

A user needs to be able to specify

We could build this in but then how do we handle taxonomies? We don't need to implement taxonomies, but we should have a design that can scale to them so we can try to avoid breaking tag users when we add taxonomies.

Geobert commented 5 years ago

For the moment, it's the first paginator but it's not really important as the only one with indexes field has all the tags, using a dual layout like the example above takes care of that.

But it's true that we can't do another tag list page for the moment. Do you have something in mind? I have absolutely no clue on how to do that. I'm not comfortable yet with permalinks :-/

  • the first page that lists pages for a tag
  • subsequent page lists.

I think both are merged, like the include: All case: testing prev/next provides enough information to create a navigation system.

Geobert commented 5 years ago

subsequent tag list pages

I thought about this again, but I can't think of a use case to have more tags list pages, what's your idea?

epage commented 5 years ago

So right now, we assume tags will not exist if config is None:

---
tags: [tag1, tag2]
---
{% if page.tags %} 
  {% for tag in page.tags %}
    {{ tag }}
  {% endfor %}
{% endif %}

The new liquid would allow us to more easily allow it to be Nil or [] if config is None

---
tags: [tag1, tag2]
---
{% if page.tags != empty %} 
  {% for tag in page.tags %}
    {{ tag }}
  {% endfor %}
{% endif %}

Not sure if we should change for this but wanted to point it out.

epage commented 5 years ago

I thought about this again, but I can't think of a use case to have more tags list pages, what's your idea?

I don't necessarily have one. I feel like I'd want to dig into more how others handle these kinds of problems which I've not had the time for lately.

Geobert commented 5 years ago

Now that https://github.com/cobalt-org/cobalt.rs/pull/548 has landed, I'm trying to use it fully and I'm seeing a forgotten use case:

In my index.liquid, I'm listing the posts and print their tags. I can't link the tags to their tags-index permalink:

For the 1st point, maybe that's why it is a site-wide configuration in Zola?

Geobert commented 5 years ago

I tried to workaround this by trying https://superdevresources.com/tag-cloud-jekyll/ but I need slugify filter ^^'
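For context, a slugify step turns a tag such as "Hello World!" into a URL-friendly form that can be spliced into a tag page link. The sketch below is a deliberately simple, ASCII-only version written for illustration; it is an assumption about what such a filter might do, not the liquid-rust implementation, and non-ASCII letters are simply dropped.

```rust
/// Minimal slug helper: ASCII letters and digits are kept (lowercased),
/// every other run of characters collapses into a single `-`.
pub fn slugify(input: &str) -> String {
    let mut slug = String::new();
    for c in input.chars() {
        if c.is_ascii_alphanumeric() {
            slug.push(c.to_ascii_lowercase());
        } else if !slug.is_empty() && !slug.ends_with('-') {
            // Start a separator only once per run, and never at the front.
            slug.push('-');
        }
    }
    // Drop a trailing separator left by punctuation at the end.
    while slug.ends_with('-') {
        slug.pop();
    }
    slug
}
```

A production filter would likely need to decide how to transliterate or preserve non-ASCII characters, which this sketch does not attempt.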

As said in https://github.com/cobalt-org/liquid-rust/issues/311#issuecomment-458724842 I'll try to do it asap

Geobert commented 5 years ago

Is there anything else to do before closing this?

epage commented 5 years ago

I guess we could close this since we've been spreading pagination of tags between this and #395.