Closed jaredcwhite closed 3 years ago
(Not specifically stated, but I think we should consider making the source/reader object for a collection really smart, i.e., it could actually pull data using Ruby libraries, caches, whatever…even going so far as to connect to ActiveRecord so you could load content right out of a DB using Rails. I know that sounds kind of nuts, but I'm thinking big here!)
One other thing we need to be mindful of…we have unique performance requirements compared to dynamic request/response frameworks because a teeny, tiny change in memory/CPU load could make the difference between a few seconds and a few minutes for a really large site build. So I'd hate to redo everything here and then find out Bridgetown is suddenly way behind Jekyll/Eleventy/etc. Probably the best way to think about it is to identify the "happy path" — aka a typical site configuration — and streamline that as much as possible. If we have a good benchmark suite ahead of time, we can do A-B between the current system and the new one to identify pain points.
Last comment for now (I swear!) — this is also a good opportunity to rely more heavily on ActiveSupport and potentially other gems so we get the benefit of their hard work and optimizations and don't have to write so much from scratch ourselves.
@jaredcwhite This all sounds super great and I hate to bring out the ol' 🖌 (for some bike shedding - emoji options for paint aren't great) but atm I really dislike `content`.
I need to noodle on alternatives and I may throw out a few (some will be awful) suggestions. Totally fine if that's the best we have but it feels wrong at this moment.
Alternatives (edited) naming is hard - please wr'k the bad ones 😛
@andrewmcodes Yeah I'll admit I'm not crazy about the over-utilization of the term `content` either…`resource` perhaps captures more nuance around what we're actually talking about. Probably would shy away from the other options. Still TBD
FYI: I'm keeping a diary of sorts in the `#upcoming` channel of the Bridgetown Discord as I work on this. Follow along there if you dare! 😉
Interesting… 🤔 https://craftcms.com/features/all#section-types
In terms of ecosystem impact, my hope is that after doing all this, most existing sites would work "as is" from the user's perspective, and any external Bridgetown plugins would only need slight tweaks to work with the new Page/Document classes
Well, that ended up not being the case. I've made a lot of breaking changes — not to existing sites using the legacy engine (any major breakage there would be considered a bug), but when switching to the new resource engine a lot of Liquid/ERB syntax will need to change and plugins will need to be updated. It's painful, but this is the only time we can get away with it. A year or two from now and such a shift would be extremely upsetting. I'm not fond of moving farther away from Jekyll compatibility, but on the other hand we're not really competing with Jekyll. We're competing with Gatsby. We're competing with Eleventy. We're competing with Hugo. We need to be fabulously good in order to be a viable contender. Just being slightly better than Jekyll, and using Ruby, isn't enough. Anyway, I'll be writing all this up more succinctly in a blog post shortly!
There's more to do after the release of Bridgetown 0.20, but I'll file those items as separate issues. Closing! 🎉
January 2021 Update: work has begun on this in earnest. Using the term "resource" and not "content". (Thanks @andrewmcodes!) Development of `Bridgetown::Resource::Base` and supporting classes is now underway! Check out the "diary"…

Looking ahead to what I hope to accomplish for an official Bridgetown 1.0 release and beyond, I think we need to take a hard and painful look at the distinctions between `Page` & `Document` and also between the `posts` collection and other collections.

➡️ Bridgetown's heritage comes from Jekyll and Jekyll comes from the idea that you have blog posts and you have standalone pages (home page, about page, etc.), and anything else is a "static file". Then the concept of collections emerged, with posts being a kind of built-in collection, but posts behave differently in some respects compared to other collections and are read off the filesystem completely differently.
➡️ There's also confusion around how permalinks work and how to configure them, because the top-level `permalink` config value affects both pages and blog posts, but there's also a `permalink` config possible at the collection level, plus you can add `permalink` configs via front-matter defaults which could affect anything potentially, so it's clear as mud.

➡️ At the code level, I've done what I can using mixins/concerns to get the `Page` and `Document` classes to act more alike and work similarly in various respects using duck typing, but it can still be frustrating. `Layout` is yet another similar-but-not-really sort of enigma.

➡️ There's also the question of when to use data files and when to use collections, and in fact you can add YAML files in a collection folder and they are treated as documents with front matter and a blank content field! 🤪 Also wacky, until a recent bug fix, static files saved within collection folders were processed as "collection documents" even though they were the `StaticFile` class and were missing from the site's overall static files array.

➡️ Another problem is currently categories and tags are post-specific. If you add categories and/or tags to other collections, or pages for that matter, they're invisible from any typical searching/filtering of categories/tags.
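To illustrate what collection-agnostic tags could look like, here's a minimal sketch that indexes terms across every piece of content regardless of where it lives. The `items` hashes and the `taxonomy_index` helper are purely hypothetical stand-ins, not existing Bridgetown or Jekyll APIs.

```ruby
# Hypothetical data: each hash stands in for a page, post, or collection document.
items = [
  { path: "_posts/2021-01-05-hello.md", collection: "posts",   tags: ["ruby", "news"] },
  { path: "_recipes/pancakes.md",       collection: "recipes", tags: ["breakfast"] },
  { path: "about.md",                   collection: nil,       tags: ["ruby"] }
]

# Builds a term => [paths] index for any front matter key (assumed helper name).
def taxonomy_index(items, key)
  items.each_with_object(Hash.new { |hash, term| hash[term] = [] }) do |item, index|
    Array(item[key]).each { |term| index[term] << item[:path] }
  end
end

tags = taxonomy_index(items, :tags)
tags["ruby"] # includes both the blog post and the standalone page
```

Because the index only looks at the front matter key, a page and a recipe are just as searchable as a post.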
➡️ Yet another obscure problem is you currently don't have any control over the order in which content is processed on a file-by-file basis, so you can occasionally run into issues where File A is trying to display content from File B, C, D, etc. but the content for those files hasn't actually been processed, so File A shows the raw markup/template string instead of the processed content. Oops! It's a non-trivial problem, because you could potentially run into circular dependencies. File A displays content from File B, but File B wants to display content from File A. Yikes. That happens virtually never in a typical site design, but you never know.
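One possible way to reason about processing order is a topological sort over declared dependencies, which also surfaces circular references. The sketch below uses Ruby's standard `TSort` module; the `RenderOrder` class, the `render_order` helper, and the file names are assumptions for illustration only, not how Bridgetown works today.

```ruby
require "tsort"

# Hypothetical dependency graph: keys are content files, values are the
# files whose processed output they want to display.
class RenderOrder
  include TSort

  def initialize(dependencies)
    @dependencies = dependencies
  end

  # TSort hook: iterate over every node in the graph.
  def tsort_each_node(&block)
    @dependencies.each_key(&block)
  end

  # TSort hook: iterate over the dependencies of a single node.
  def tsort_each_child(node, &block)
    @dependencies.fetch(node, []).each(&block)
  end
end

# Returns files ordered so that dependencies come before their dependents.
def render_order(dependencies)
  RenderOrder.new(dependencies).tsort
rescue TSort::Cyclic => e
  raise ArgumentError, "circular content dependency: #{e.message}"
end

order = render_order({
  "file_a.md" => ["file_b.md", "file_c.md"],
  "file_b.md" => ["file_c.md"],
  "file_c.md" => []
})
# order lists file_c.md first and file_a.md last
```

If File A and File B depend on each other, `render_order` raises instead of silently rendering raw template strings, so the circular case at least fails loudly.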
➡️ But wait, there's more! Right now there's no concrete way to determine the "source" of a particular file/piece of content if it came from an API/headless CMS—you only know if it came from an actual file on the filesystem, otherwise it's just "virtual". In addition, after it gets rendered at a particular URL, you can't backtrack—in other words, you can't determine that /a/b/c corresponds to this one object and, say, re-render that particular object.
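As a rough sketch of what "knowing the source" and "backtracking from a URL" might involve, each piece of content could carry an origin record and be registered under its output URL. Everything here (`Origin`, `ContentItem`, `ContentRegistry`, and the example CMS address) is hypothetical and only meant to show the shape of the idea.

```ruby
# All names here are hypothetical; none of this is existing Bridgetown API.
Origin = Struct.new(:kind, :location)        # kind could be :file, :api, :generator
ContentItem = Struct.new(:id, :origin, :url)

class ContentRegistry
  def initialize
    @by_url = {}
  end

  # Remember which object was (or will be) written to which URL.
  def register(item)
    @by_url[item.url] = item
    item
  end

  # Backtrack from an output URL to the object that produced it.
  def find_by_url(url)
    @by_url[url]
  end
end

registry = ContentRegistry.new
registry.register(ContentItem.new("about", Origin.new(:file, "src/about.md"), "/about"))
registry.register(
  ContentItem.new("post-42", Origin.new(:api, "https://cms.example.com/posts/42"), "/a/b/c")
)

item = registry.find_by_url("/a/b/c")
item.origin.kind # the content came from an API rather than the filesystem
```

With a lookup like this, re-rendering a single object for a given URL becomes a matter of finding it in the registry and running it back through the pipeline.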
❓ (There's also the outstanding question of how all this relates to `ActiveModel` objects that can be used to load/validate/save content in a Rails CMS context, which is a project I have underway, but I think I'll save that for a future issue.)

❤️ All that to say…I'll always love Jekyll to pieces, but its content modeling situation is kind of screwball and it's time for us to fix this in Bridgetown once and for all so we have a sane platform to build on for the next ten years.
So, how do we fix this? 😂
I propose creating a new namespace under `Bridgetown` called `Bridgetown::Content`. Inside we'd define several classes:

- `Bridgetown::Content::Base`: this represents a single piece of content. This is any kind of content that isn't simply a "static file" like an image or PDF. So that means page, blog post, collection document, YAML/JSON/CSV/etc. data file, whatever.
- `Bridgetown::Content::Source`: this is attached to the content object and represents where the content came from…filesystem, third-party API, generator, etc.
- `Bridgetown::Content::Destination`: this is attached to the content object and represents the URL/filepath where the content will be generated.
- `Bridgetown::Content::Transformer`: this is an auxiliary object that is responsible for transforming the object data from raw input to final converted output.
- `Bridgetown::Content::Dependencies`: this would determine the dependencies required for each piece of content and use that to facilitate both the correct order of processing and also to cache in the future so a piece of content could be quickly rerendered along with just its dependencies. There'd be some default heuristics along these lines but you could manually specify dependencies on a per-object basis. (Like a product template could specifically require "products" to be a dependency and maybe just the products in its own category.)
- `Bridgetown::Content::Taxonomy`: this would represent a particular way to classify a content item. A category would be a Taxonomy of type "category", a tag would be a Taxonomy of type "tag", etc. Site owners could easily configure any sort of Taxonomy. Hugo, for example, comes configured with a default set of taxonomies out of the box, but you could adjust that however you like.
- `Bridgetown::Content::Relations`: this is how a content object could be thought of as "related" to another type of object…parent-child relationships, belongs-to/has-many, etc. So you could have `author: janedoe` in a post's front matter and then maybe `post.relations.author` would automatically resolve to the content object for `janedoe`. The relations themselves would probably be defined in the yml where collections are currently configured.

After doing all this, we'd refactor `Bridgetown::Page` and `Bridgetown::Document` so they're just subclasses of `Bridgetown::Content::Base`, and we'd probably add `Bridgetown::StructuredData` as well to represent a YAML/JSON/etc. data structure. In addition, we'd get rid of separate file readers for pages, collections, and posts, and unify everything into a single file reader. I also like the idea of letting front matter itself override directory locations, so you could potentially have everything all in a top-level folder and just add `collection: posts`, `collection: recipes`, etc. That would be dumb, but it would also be immensely flexible and eliminate any hard requirements for folders like `_posts`, `_recipes`, etc.

The special behavior of posts would be basic configuration options of a collection, so any collection could potentially behave in that manner if configured. Pages would just be collection-less documents, essentially, or alternatively we could create a `pages` or `default` or `unfiled` collection and use that.

I'd also like to make sure we get good-quality content graphs out of all this so menus, breadcrumbs, etc. would be a piece of cake once the collections/taxonomies/relations are properly configured. (Again, Hugo leads the way on this stuff!)
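To make the relations idea from the class list a bit more concrete (`author: janedoe` resolving through `post.relations.author`), here's a minimal sketch. The `Item` and `Relations` classes and the shape of the relation config are purely hypothetical, in keeping with the theoretical class names above.

```ruby
# Hypothetical sketch; Item, Relations, and the config shape are illustrative only.
class Item
  attr_reader :collection, :slug, :data

  def initialize(collection, slug, data = {})
    @collection = collection
    @slug = slug
    @data = data
  end
end

class Relations
  # relation_config maps a front matter key to the collection it points at,
  # e.g. { "author" => "authors" } (assumed to come from site configuration).
  def initialize(item, site_items, relation_config)
    @item = item
    @site_items = site_items
    @relation_config = relation_config
  end

  # Look up the related object by collection and slug.
  def resolve(name)
    target_collection = @relation_config.fetch(name.to_s)
    slug = @item.data[name.to_s]
    @site_items.find { |i| i.collection == target_collection && i.slug == slug }
  end

  # Allow relations.author style access for configured relation names only.
  def method_missing(name, *args)
    @relation_config.key?(name.to_s) ? resolve(name) : super
  end

  def respond_to_missing?(name, include_private = false)
    @relation_config.key?(name.to_s) || super
  end
end

jane = Item.new("authors", "janedoe", { "name" => "Jane Doe" })
post = Item.new("posts", "hello-world", { "author" => "janedoe" })

relations = Relations.new(post, [jane, post], { "author" => "authors" })
relations.author.data["name"] # resolves to the janedoe content object
```

A real implementation would likely index items by collection and slug rather than scanning an array, but the lookup contract would be the same.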
In terms of ecosystem impact, my hope is that after doing all this, most existing sites would work "as is" from the user's perspective, and any external Bridgetown plugins would only need slight tweaks to work with the new Page/Document classes…not entirely backwards-compatible unfortunately, but since we're still pre-1.0, the time for breaking changes is really now if ever. Once we do this, we break free from Jekyll's gravitational pull and get to define the future of Bridgetown on our terms. Very exciting!
Please note all the above class names are purely theoretical at this point and subject to deliberation and further brainstorming, so please let me know what you think and if I'm missing any important aspects of quality content modeling. We shouldn't shy away from looking at how other CMSes and site generators do this stuff and aim for providing as much power and flexibility as we can right out-of-the-box.