cobalt-org / cobalt.rs

Static site generator written in Rust
cobalt-org.github.io/
Apache License 2.0

[RFC] Pagination feature #395

Open Geobert opened 6 years ago

Geobert commented 6 years ago

Feature to paginate a collection of documents so we can avoid a landing page that is thousands of kilometers long :p

Requirements

Initial

Index pages should be able to index by

Index pages should be able to sort by multiple factors at once:

Index page behavior

Index page permalinks

Potential

Narrowing and nesting

Unplanned

Index pages should be able to index by

Proposal

Activation

This feature is activated in the frontmatter.

Default values, using the include shortcut to activate indexing:

pagination:
  include: Categories # default: `None`, can also be `All`, `Tags`, `Dates`, etc.
  per_page: 10
  permalink_suffix: "./{{index}}/{{ num }}/"
  order: Desc
  sort_by: ["weight", "published_date"]
  trails:
    before: 0
    after: 0
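
A minimal sketch of how the fields above could be modelled in Rust, with the listed values as defaults. All type and method names here (`PaginationConfig`, `Include`, `is_enabled`, ...) are hypothetical and only mirror the YAML keys; this is not cobalt's actual code.

```rust
/// What the index pages are built from (hypothetical names mirroring the proposal).
#[derive(Debug, Clone, PartialEq)]
pub enum Include {
    None,
    All,
    Tags,
    Categories,
    Dates,
}

#[derive(Debug, Clone, PartialEq)]
pub enum Order {
    Asc,
    Desc,
}

#[derive(Debug, Clone, PartialEq)]
pub struct Trails {
    pub before: usize,
    pub after: usize,
}

#[derive(Debug, Clone, PartialEq)]
pub struct PaginationConfig {
    pub include: Include,
    pub per_page: usize,
    pub permalink_suffix: String,
    pub order: Order,
    pub sort_by: Vec<String>,
    pub trails: Trails,
}

impl Default for PaginationConfig {
    fn default() -> Self {
        PaginationConfig {
            // Pagination stays off until a page opts in.
            include: Include::None,
            per_page: 10,
            permalink_suffix: "./{{index}}/{{ num }}/".to_string(),
            order: Order::Desc,
            sort_by: vec!["weight".to_string(), "published_date".to_string()],
            trails: Trails { before: 0, after: 0 },
        }
    }
}

impl PaginationConfig {
    /// Pagination is active only when `include` names something to index.
    pub fn is_enabled(&self) -> bool {
        self.include != Include::None
    }
}
```

Keeping `include` as the only switch means a page can inherit every other default and still stay unpaginated until it sets that one field.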

Also, we can make the schema auto-adapt

User-defined defaults can be set in _cobalt.yml by filling in the default, pages.default, or posts.default pagination field, just leaving include as None to avoid activating it for all pages.

Permalink

This permalink attribute is not to be confused with the one outside pagination section. It defines the location of the generated indexes.

New permalink attributes specific to this context

Paginator

Pagination would be accessible through a paginator object (a total ripoff of Jekyll's pagination v2):

pages: The list of posts objects that belong to this pagination page.
total_pages: Total number of pages contained in this paginator.
per_page: Maximum number of posts or documents on each pagination page.

index: Current index.
indexes: All paginators available, one per index (used in cases like Tags, nil otherwise).
index_permalink: The relative URL path of the current pagination page.
total_indexes: Total number of pagination pages created.
previous_index_permalink: The relative URL of the previous page. Nil if no previous page is available.
next_index_permalink: The relative URL of the next page in the pagination. Nil if there is no next page available.
first_index_permalink: The relative URL of the first page in the pagination.
last_index_permalink: The relative URL of the last page in the pagination.

trails: The pagination trail structure
    before: 0
    after: 0

Once activated, a paginator object replaces the collection object in the Liquid template.
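
A rough sketch of how such paginator objects could be built from a sorted list of posts. It assumes 1-based page numbers, uses plain strings in place of post objects, and invents a simple `base/num/` permalink scheme purely for illustration; `Paginator` and `paginate` are hypothetical names, and only a subset of the attributes listed above is modelled.

```rust
#[derive(Debug, Clone, PartialEq)]
pub struct Paginator {
    pub pages: Vec<String>, // titles stand in for post objects
    pub per_page: usize,
    pub index: usize, // 1-based position of this pagination page
    pub total_indexes: usize,
    pub index_permalink: String,
    pub previous_index_permalink: Option<String>,
    pub next_index_permalink: Option<String>,
    pub first_index_permalink: String,
    pub last_index_permalink: String,
}

/// Split `posts` into pagination pages of at most `per_page` entries.
/// An empty collection still yields one (empty) pagination page.
pub fn paginate(posts: &[String], per_page: usize, base: &str) -> Vec<Paginator> {
    assert!(per_page > 0, "per_page must be positive");
    let total_indexes = ((posts.len() + per_page - 1) / per_page).max(1);
    // Hypothetical permalink scheme: "<base>/<num>/".
    let link = |num: usize| format!("{}/{}/", base.trim_end_matches('/'), num);
    (1..=total_indexes)
        .map(|index| {
            let start = (index - 1) * per_page;
            let end = (start + per_page).min(posts.len());
            Paginator {
                pages: posts[start..end].to_vec(),
                per_page,
                index,
                total_indexes,
                index_permalink: link(index),
                previous_index_permalink: if index > 1 { Some(link(index - 1)) } else { None },
                next_index_permalink: if index < total_indexes { Some(link(index + 1)) } else { None },
                first_index_permalink: link(1),
                last_index_permalink: link(total_indexes),
            }
        })
        .collect()
}
```

Each element of the returned vector would be handed to one rendering of the template, in place of the collection.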

Tags

Moved to RFC https://github.com/cobalt-org/cobalt.rs/issues/549

Publication Date

Open Question

Prior Art

Jekyll

Deprecated method

https://jekyllrb.com/docs/pagination/

Activation in _config.yml:

paginate: 5
paginate_path: "/blog/page:num/" 

Quoting the page:

This will read in blog/index.html, send it each pagination page in Liquid as paginator and write the output to blog/page:num/, where :num is the pagination page number, starting with 2. If a site has 12 posts and specifies paginate: 5, Jekyll will write blog/index.html with the first 5 posts, blog/page2/index.html with the next 5 posts and blog/page3/index.html with the last 2 posts into the destination directory.
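
The arithmetic in that quote can be sketched as follows. This only mirrors the behaviour described above (page 1 stays at the source page, later pages substitute `:num`); it is not Jekyll's implementation, and the function names are made up for this example.

```rust
/// Number of posts on each pagination page, in order.
pub fn page_sizes(total_posts: usize, per_page: usize) -> Vec<usize> {
    assert!(per_page > 0, "per_page must be positive");
    (0..total_posts)
        .step_by(per_page)
        .map(|start| per_page.min(total_posts - start))
        .collect()
}

/// Output file for pagination page `num` (1-based).
/// Page 1 keeps the original page; later pages use `paginate_path`.
pub fn output_path(paginate_path: &str, first_page: &str, num: usize) -> String {
    if num <= 1 {
        return first_page.to_string();
    }
    let dir = paginate_path.replace(":num", &num.to_string());
    format!("{}/index.html", dir.trim_matches('/'))
}
```

With 12 posts and 5 per page, `page_sizes` gives three pages of 5, 5 and 2 posts, matching the quoted example.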

Pagination V2

https://github.com/sverrirs/jekyll-paginate-v2/

Advanced sorting:

sort_field: 'author:born'

pagination trails:

pagination:
  trail: 
    before: 2 # The number of links before the current page
    after: 2  # The number of links after the current page
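
One simple reading of `before` and `after` is a window of page numbers clamped to the valid range. The sketch below assumes 1-based page numbers and a hypothetical `trail` function; the plugin's actual behaviour near the first and last pages may differ.

```rust
/// Page numbers to link around `current`, clamped to `1..=total`.
pub fn trail(current: usize, total: usize, before: usize, after: usize) -> Vec<usize> {
    if total == 0 || current == 0 || current > total {
        return Vec::new();
    }
    let start = current.saturating_sub(before).max(1);
    let end = current.saturating_add(after).min(total);
    (start..=end).collect()
}
```

With `before: 2` and `after: 2`, page 5 of 10 would link pages 3 to 7, while page 1 would link only pages 1 to 3.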

Gutenberg

Paginator object: https://www.getgutenberg.io/documentation/templates/pagination/

A paginated section gets the same section variable as a normal section page. In addition, a paginated section gets a paginator variable of the Pager type:

// How many items per page
paginate_by: Number;
// Permalink to the first page
first: String;
// Permalink to the last page
last: String;
// Permalink to the previous page, if there is one
previous: String?;
// Permalink to the next page, if there is one
next: String?;
// All pages for the current page
pages: Array<Page>;
// All pagers for this section, but with their `pages` attribute set to an empty array
pagers: Array<Pagers>;
// Which page are we on
current_index: Number;

To activate pagination on a page: https://www.getgutenberg.io/documentation/content/section/

Tags and categories management: https://www.getgutenberg.io/documentation/content/tags-categories/

Hugo

https://gohugo.io/templates/pagination/

In configuration:

Paginate=10
PaginatePath=page

Activation in page:

{{ $paginator := .Paginate (where .Data.Pages "Type" "post") }}
{{ template "_internal/pagination.html" . }}
{{ range $paginator.Pages }}
   {{ .Title }}
{{ end }}

or

{{ template "_internal/pagination.html" . }}
{{ range .Paginator.Pages }}
   {{ .Title }}
{{ end }}

epage commented 6 years ago

I've added some sections we'll want to fill out to help guide the conversation

Activation of pagination feature will also say on which index pagination will be enabled In _cobalt.yml:

So the things you can paginate on and their permalink are being defined in _cobalt.yml but how do I associate my index.md with one of these?

epage commented 6 years ago

permalink: "/{{ year }}/{{ month }}/p{{ num }}"

When we create an index by published_date, are they all only available by year/month or are we parsing the permalink to see what variables are being used?

Geobert commented 6 years ago

added prior art, still thinking about your questions :)

epage commented 6 years ago

Thanks!

Some feedback

Keats commented 6 years ago

For gutenberg, that only talks about what variables are available but not how you set it up.

See https://www.getgutenberg.io/documentation/content/section/ for that. In short, you just set paginate_by to a positive integer in any section front-matter and it will paginate that section by that many items. I'll add some docs on that I guess

Geobert commented 6 years ago

updated proposal :)

epage commented 6 years ago

I just updated the Jekyll v2 section with some interesting highlights.

In particular, I want to call out the fact that the global settings are defaults and not the main way to configure. I think it's important that we allow pagination configuration to happen in the frontmatter. I then assume we should just rely on the frontmatter default system we have in the config file for pagination rather than creating another way of creating unique instances of pagination.

btw for Hugo, I think _index.md is what they use but this is one area where I feel their docs are lacking https://gohugo.io/content-management/organization/#index-pages-index-md

Geobert commented 6 years ago

Thank you :) What are trails?

I agree with the fact that _cobalt.yml should hold only defaults and the same values can be overridden in the frontmatter

epage commented 6 years ago

What are trails?

https://github.com/sverrirs/jekyll-paginate-v2/blob/master/README-GENERATOR.md#creating-pagination-trails

Geobert commented 6 years ago

Nice, I'll add this to our version :)

Geobert commented 6 years ago

updated the config to pagination instead of index, more intuitive.

added the info that we can override the config values in the frontmatter

Geobert commented 6 years ago

Added trails, total ripoff of Jekyll pagination v2

epage commented 6 years ago

I agree with the fact that _cobalt.yml should hold only defaults and the same values can be overridden in the frontmatter

I think it'd be a good exercise to document the frontmatter changes first and then to see how that plays out with the _cobalt.yml configuration.

Currently, the way to default frontmatter is to set it by filling in the default sections in the configuration. Maybe some of that will work with pagination defaults? Maybe some will still need a unique way to default in the config. Maybe some won't need defaults because they are too unique.

And if we do maintain defaults in the config, we should also consider whether the defaults are set globally and/or per-collection.

Geobert commented 6 years ago

What do you mean? That I write the documentation first?

epage commented 6 years ago

I'm not expecting documentation, just a description, like you did for _cobalt.yml, of what will be supported in the frontmatter after this.

Geobert commented 6 years ago

Sorry for this huge delay, a video game sucked me out of real life x) I've updated the proposal with more of a description of the paginator object. All the values in _cobalt.yml are available in each page's frontmatter as well, if they need to be overridden.

epage commented 6 years ago

Understandable. I've had health issues in the family plus the CLI-WG sucking me away.

Not sure if it's lack of clarity on my part or if it fell through the cracks since it's been so long since we've talked, but I feel like some of my feedback hasn't been applied.

I think it'd be a good exercise to document the frontmatter changes first and then to see how that plays out with the _cobalt.yml configuration.

Geobert commented 6 years ago

Maybe I'm misunderstanding something but this part:

all the values in config can be overridden in frontmatter of each page

Means all the values I defined in the _cobalt.yml are valid for the frontmatter as well. They will override the value set in _cobalt.yml

I hope the health issues are not too serious and that everybody will get better soon :-/

Geobert commented 6 years ago

I'm trying to understand how Cobalt is working to identify where I will put the pagination code. Correct me if I'm wrong: I'll need to work on liquid.rs project as well in order to make this work, won't I? If I got it properly, Cobalt is only responsible for:

But the rendering is done by Liquid.rs, isn't it?

So, for the pagination, I'll need to enrich Liquid to understand what to do with a paginator, and Cobalt will generate Collections for each needed index and determine their destination on the hard drive. Am I correct?

EDIT: this is totally wrong, no need to add anything to Liquid at least for the for block part

Geobert commented 6 years ago

Also, I'm studying Liquid, and especially the for loop block. But I can't understand how collections.posts.pages is resolved in Liquid, can you give me some insight on this please? :)

Geobert commented 6 years ago

Got it! It's in generate_doc in cobalt.rs when we put "collections" into globals :D

Geobert commented 6 years ago

After studying the code and doing the refactoring, I understand the implications of pagination a bit better. I'm rewriting the proposal

Geobert commented 6 years ago

Rewritten proposal posted! :D

epage commented 6 years ago

I've made an attempt at closing out any of the comments that have been resolved to make it easier to browse the discussion. I feel like github needs a batch hide option :).

We've been going back and forth on this several times. I'm wondering if scheduling a discussion on gitter would be helpful. I'll go ahead and give this another shot.

How appropriate is it for pagination information to only be configured globally? Should it instead be configured on the actual page?

I suspect that it should be on the page.

The current pattern in cobalt for this is to define this configuration on the Frontmatter. In going this route, I'd expect a proposal to document these new Frontmatter fields. The only reason to discuss _cobalt.yml is to discuss how defaulting works which is of particular note here.

The way to globally set defaults for the frontmatter is then to use default, posts.default, and pages.default. The challenge is we don't want to globally enable pagination. A limited way of handling this is the user sets all of the default fields they want except an enable flag. They just then need to enable it and get the rest.

This makes defaulting limited to one kind of index, whether by year, category, etc. It's probably common enough for users to configure multiple that we should consider our options. One option is to adopt Jekyll's defaulting system, which is based on global patterns, like "if the file uses this path, set these defaults".

An alternative system is to provide named defaults that a page opts-in to.

Here is a rough sketch of how named defaults might work

_cobalt.yml:

pages:
  default:
    data:
      foo: bar
posts:
  default:
  - name: ""
    permalink: "/something"
  - name: "category_index"
    pagination:
       ...
  - name: "date_index"
    pagination:
       ...

index.md:

default: "category_index"
---
Hello world!

(field names subject to change, this is just for illustrative purposes)

With serde, we can make a field support multiple types of values. I chose to make default be an Either<FrontMatter, Vec<KeyedFrontMatter>> (where KeyedFrontMatter adds name) rather than using a HashMap to avoid ambiguous situations where it might not be clear whether it is a FrontMatter or a HashMap.
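
To make the named-defaults idea a bit more concrete, here is a small stdlib-only sketch of the lookup side. Deserialization is left out (in practice serde would produce the enum), `FrontMatter` is cut down to two illustrative fields, and treating the entry named `""` as the fallback for pages that do not ask for a named default is an assumption of this sketch, not something settled above.

```rust
#[derive(Debug, Clone, PartialEq, Default)]
pub struct FrontMatter {
    pub permalink: Option<String>,
    pub per_page: Option<usize>, // stand-in for the whole `pagination` section
}

#[derive(Debug, Clone, PartialEq)]
pub struct KeyedFrontMatter {
    pub name: String,
    pub front: FrontMatter,
}

/// Either a single set of defaults or a list of named ones.
#[derive(Debug, Clone, PartialEq)]
pub enum Defaults {
    Single(FrontMatter),
    Named(Vec<KeyedFrontMatter>),
}

impl Defaults {
    /// Pick the defaults a page asked for via its `default:` field.
    /// Assumption: the entry named "" is used when the page asks for nothing.
    pub fn resolve(&self, requested: Option<&str>) -> Option<&FrontMatter> {
        match self {
            Defaults::Single(front) => Some(front),
            Defaults::Named(entries) => {
                let wanted = requested.unwrap_or("");
                entries.iter().find(|e| e.name == wanted).map(|e| &e.front)
            }
        }
    }
}
```

A page with `default: "category_index"` would then receive that entry's pagination settings, and a page asking for an unknown name gets nothing, which could be reported as an error.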

Of course this is me exploring one line of solving these problems based on the current design of cobalt, allowing for consistent patterns for the user to be aware of. There are of course other possible solutions and it'd be great for us to consider and evaluate them. I think this gets down to recording requirements which should be the first role of a proposal which we don't have written down and agreed to yet.

Geobert commented 6 years ago

I think this gets down to recording requirements which should be the first role of a proposal which we don't have written down and agreed to yet.

I've seen you have filled the section :) My first need is to paginate collections.posts.pages, and later I'll need tag and year/month as well. I don't know if my first need is covered by one of your proposals.

How appropriate is it for pagination information to only be configured globally? Should it instead be configured on the actual page?

This is a good idea, at least for my use case. It's true that having global default values and then only a switch is not really logical. I was following Jekyll's scheme here. So I'm all for the "only in the frontmatter configuration" idea.

  • Should we support narrowing by additional tags?

As a second step, I think yes, but maybe get a first version working and then enrich it?

  • Should we support hierarchical tags (like parent::child)?

For me, tags have no hierarchy, categories are here for that.

  • Should we support nested indexing, like tags within a category?

Hm… not sure, but even if yes, let's not do a mega big feature in one shot, let's paginate collections.posts.pages first and then move forward?

Otherwise the feature will take years to get out.

epage commented 6 years ago

For me, tags have no hierarchy, categories are here for that.

Oh the joy of complex data models to map :)

Hm… not sure, but even if yes, let's not do a mega big feature in one shot, let's paginate collections.posts.pages first and then move forward?

Otherwise the feature will take years to get out.

I support that as long as we don't feel we have designed ourselves out of doing more.

epage commented 6 years ago

I've seen you have filled the section :)

Well, started it. There are probably more for us to consider.

Geobert commented 6 years ago

I've read what you wrote as requirements and I quite agree with all. I just added published_date with no year nor month in order to paginate collections.posts.pages.

epage commented 6 years ago

I've read what you wrote as requirements and I quite agree with all.

Feel free to speak up if you do have a concern over an edit I make. This is a proposal, even my contributions, and all is fair game.

I just added published_date with no year nor month in order to paginate collections.posts.pages.

I redid that to just include all content. I assume your intent was sorting. That is orthogonal, so I created a separate item for it.

Geobert commented 6 years ago

Updated the proposal with no global config and activation from frontmatter

epage commented 6 years ago

The keys under pagination are the types of content you can paginate?

pagination:
    posts:
        enable: false # mandatory
        per_page: 10
        permalink: "/_p/{{ num }}/"
        trails:
            before: 0
            after: 0

Why not make that a field?

pagination:
  include: categories # default: `None`, can also be `all`, `years`, etc.
  per_page: 10
  permalink: "/_p/{{ num }}/"
  trails:
    before: 0
    after: 0

Also, we can make the schema auto-adapt

Geobert commented 6 years ago

How do you specify different per_page by types of content? Or different permalinks as well etc…

epage commented 6 years ago

I think I'm missing something in your comment.

Are you referring to my auto-adapt schema? Those are shortcuts for the full form, with defaults applied to the omitted fields, as when declaring a dependency in Cargo.toml, where you can write either toml = "1.0" or toml = { version = "1.0" }.
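
A hedged sketch of that shortcut idea, without any parsing: a short form that carries only the `include` value and a full form with optional fields, both normalized into the same struct. The names `PaginationSpec`, `Pagination` and `normalize` are hypothetical, the exact short-form syntax is an assumption, and the default values are taken from the example above.

```rust
/// Fully resolved pagination settings.
#[derive(Debug, Clone, PartialEq)]
pub struct Pagination {
    pub include: String,
    pub per_page: usize,
    pub permalink: String,
}

/// What a user may write: a bare value or the full table.
#[derive(Debug, Clone, PartialEq)]
pub enum PaginationSpec {
    /// Short form, e.g. `pagination: categories` (assumed syntax).
    Short(String),
    /// Full form; omitted fields fall back to defaults.
    Full {
        include: String,
        per_page: Option<usize>,
        permalink: Option<String>,
    },
}

impl PaginationSpec {
    /// Expand either form into the full settings.
    pub fn normalize(self) -> Pagination {
        let (include, per_page, permalink) = match self {
            PaginationSpec::Short(include) => (include, None, None),
            PaginationSpec::Full { include, per_page, permalink } => (include, per_page, permalink),
        };
        Pagination {
            include,
            per_page: per_page.unwrap_or(10),
            permalink: permalink.unwrap_or_else(|| "/_p/{{ num }}/".to_string()),
        }
    }
}
```

Because both forms end up as the same `Pagination` value, the rest of the code would never need to know which one the user wrote.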

Geobert commented 6 years ago

I thought you were suggesting to use include instead of

pagination:
  post:

but if it's to use default values, yeah, good idea :)

I don't know how the permalink part will fit though.

epage commented 6 years ago

Good points. It could be that the idea of the shortcut might not work.

Some options, though probably not great:

epage commented 6 years ago

To add, we should probably just note that this is an idea for future consideration in the "Potential" section.

Geobert commented 6 years ago

We could translate each include directly to a permalink variable and append them, in order, to {{ parent }}/.

This is also what I had in mind while thinking how permalink can fit with shortcuts. I'll go for that.

I've started to work (slowly) on the implementation to get my head around this feature and to understand more about how cobalt works. I'm OK with redoing it to match the specification as this discussion evolves, but I need to write code to surface details I might otherwise miss.

epage commented 6 years ago

Sounds good.

I've cleaned up the proposal, added sorting fields, and documented how user-defined defaults can work. At some point we'll need to compare the proposal against requirements to make sure they are all satisfied.

Geobert commented 6 years ago

While coding, I noticed that first_page and last_page have no added value so I didn't put them in. We keep first_page_path and last_page_path of course.

Geobert commented 6 years ago

I'm trying to get my head around the permalink permalink: "/{{parent}}/{{include}}/_p/{{ num }}/" concept.

Do we have something that already translates this into a file system path? Or do I need to code how to interpret {{parent}}?

epage commented 6 years ago

Do we have something that already translates this into a file system path?

To expand a permalink https://github.com/cobalt-org/cobalt.rs/blob/master/src/document.rs#L97

To format it as a file path https://github.com/cobalt-org/cobalt.rs/blob/master/src/document.rs#L121

Or do I need to code how to interpret {{parent}}?

parent is already dumped into the available variables: https://github.com/cobalt-org/cobalt.rs/blob/master/src/document.rs#L40

Geobert commented 6 years ago

Thanks! What about {{num}}?

I've ended with https://github.com/Geobert/cobalt.rs/blob/pagination/src/cobalt_model/pagination.rs#L212

I might need to rewrite this to use the existing functions

EDIT: and {{include}} is new as well

Geobert commented 6 years ago

Yeah, I need to rewrite this, or I'll end up rewriting the same thing anyway for future pagination includes (year and categories, for example).

I need to make the functions you showed me pub, is that OK? As for the 2 new {{num}} and {{include}}, do I treat them in a new func or inside permalink_attributes?

If in the existing permalink_attributes, I need to add more params, so it may not be a great idea.

epage commented 6 years ago

I need to make the functions you showed me pub, is that OK?

src calls into src/cobalt_model, we shouldn't be doing it the other way around.

I'd recommend preparing a PR that splits the parsing logic out into a src/cobalt_model/permalink.rs file. permalink_attributes would not move over. I've not dug in to see how you plumb everything together but hopefully we can keep that near document logic for now.

As for the 2 new {{num}} and {{include}}, do I treat them in a new func or inside permalink_attributes?

Separate function. Merge the objects.

What is {{include}}? It's not listed in the RFC.

Geobert commented 6 years ago

Ok for the PR, I'll work on that tonight :)

What is {{include}}? It's not listed in the RFC.

It's this:

pagination:
  include: categories # default: `None`, can also be `all`, `years`, etc.

epage commented 6 years ago

Oh, include is what we are paginating over. I was looking in the RFC for where we talk about permalinks and it wasn't discussed there. I added a brief permalink section. Feel free to correct me :)

btw have we figured out a solution to this open question in the RFC: "Should we ensure there is a way to avoid a 0 when that is the index?"

A lot of times, when paginating, there is a page and a page/1 or something. Maybe we could expose the default filter in liquid and do permalink: "/{{parent}}/{{include}}/{{ num | default: "" }}/"
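
To illustrate the effect being asked for, here is a stdlib-only sketch that expands `{{ name }}` placeholders and leaves `num` empty on the first page, so the first index lands on the bare path. It is not Liquid and does not use cobalt's permalink code: `expand`, `collapse_slashes` and `index_permalink` are hypothetical helpers, filters are not supported, and pages are assumed to be numbered from 1.

```rust
/// Replace `{{ name }}` placeholders with values from `vars`.
/// Unknown names expand to the empty string; filters are not supported.
pub fn expand(template: &str, vars: &[(&str, &str)]) -> String {
    let mut out = String::new();
    let mut rest = template;
    while let Some(open) = rest.find("{{") {
        out.push_str(&rest[..open]);
        let after_open = &rest[open + 2..];
        match after_open.find("}}") {
            Some(close) => {
                let name = after_open[..close].trim();
                if let Some((_, value)) = vars.iter().find(|(key, _)| *key == name) {
                    out.push_str(value);
                }
                rest = &after_open[close + 2..];
            }
            None => {
                // Unterminated placeholder: keep the remaining text untouched.
                out.push_str(&rest[open..]);
                rest = "";
            }
        }
    }
    out.push_str(rest);
    out
}

/// Merge runs of '/' left behind by empty placeholders.
fn collapse_slashes(path: &str) -> String {
    let mut out = String::with_capacity(path.len());
    for ch in path.chars() {
        if ch == '/' && out.ends_with('/') {
            continue;
        }
        out.push(ch);
    }
    out
}

/// Permalink for pagination page `num` (1-based); page 1 omits the number.
pub fn index_permalink(template: &str, parent: &str, include: &str, num: usize) -> String {
    let num_text = if num <= 1 { String::new() } else { num.to_string() };
    let expanded = expand(
        template,
        &[("parent", parent), ("include", include), ("num", num_text.as_str())],
    );
    collapse_slashes(&expanded)
}
```

Here the "default to empty" decision is made by the caller rather than by a filter in the template, which is one way to get the behaviour without adding filters to the permalink parser.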

Geobert commented 6 years ago

Feel free to correct me :)

No need to :D

"Should we ensure there is a way to avoid a 0 when that is the index?"

What do you mean? 0 page? if so paginator will have an empty posts attribute

A lot of times, when paginating, there is a page and a page/1 or something.

I don't understand what you are talking about, do you have an example? As for the filter, it can't hurt to add functionality to liquid :) I can do that in a second time.

epage commented 6 years ago

"Should we ensure there is a way to avoid a 0 when that is the index?"

Say there are 10 pages.

, it can't hurt to add functionality to liquid :) I can do that in a second time.

FYI I kept the liquid parser we use for permalinks barebones with the idea that we would add features as needed. It felt weird to have things like the for tag available when doing permalinks.

Geobert commented 6 years ago

From the user's point of view, I think 1-based indexing is more natural. It's what I've coded. For the trailing number, I let the user do it using {{ paginator.page }} / {{ paginator.total_pages }}

Geobert commented 6 years ago

permalink_attributes would not move over.

How do I call it to interpret parent then, as it's not pub?