Open adamghill opened 2 years ago
I want to work on this, especially the tags and publishDate options. In the code you already parse the date keyword. I assume that publishDate will work the same way.
Do you think it would be possible to have a lastUpdatedDate that would automatically update itself when changes are made to the markdown file? It seems you already have something like that for the static mode.
I'm building a personal blog (the most common basic dev project ever 😆) and I keep changing my mind about how it will work lol. At first I was using a Post model that held most of the data (title, tags, etc.), with coltrane used only to render the markdown content. With that setup it was easier to have a last_updated field, but then I decided that if I was going to use coltrane I should go all the way and keep all the data related to a specific post in one place, the markdown file. I'm not sure that's the best approach, but at least it gives me the opportunity to contribute to coltrane itself, and it's a personal project, so nobody cares. Now I need to figure out a way to know when a file was last modified 🤔. I could just update the lastUpdated date myself, but that will be my last resort.
For tags, I was thinking of adding some basic utilities (maybe template tags).
I tried, and you can already have something like this in the frontmatter:
---
tags:
- javascript
- django
- python
---
and it will be parsed as a Python list, so there is not much left to do.
I didn't know about frontmatter before I started using coltrane. Just to be sure: frontmatter parsing is handled by the python-markdown2 metadata extra, right?
Well, I'll take all the help I can get! 😄
Yep, python-markdown2 handles the frontmatter already, but I think it would be useful to provide documentation about standard defaults and then some ways to get a list of tags -- your utilities/templatetags ideas make sense to me. I do wonder if there is an efficient way to collect all the tags so it doesn't take forever with a lot of markdown files? Maybe something with the manifest file I use when re-generating static files? Although, that doesn't get used in integrated mode currently I don't think. We could just use the Django cache system perhaps?
We could handle publish_date the same as date. Although now that I'm thinking about it... I can't think of a time where I'd want both a date and a publish_date, so maybe it should just be publish_date since it's more explicit?
Do you think using the markdown file's mtime would be sufficient as a proxy for last_updated? I'm unclear if that would work once you deploy the code to production, i.e. would the markdown file's modified datetime be from the original system or from when it was written on the prod system?
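For reference, reading the mtime needs only the standard library. This is a minimal sketch, and last_updated_from_mtime is a hypothetical helper name, not something coltrane provides. One caveat that bears on the question above: the mtime is whatever the filesystem holding the file recorded, and git does not store modification times, so a fresh checkout on a production machine normally stamps files with the checkout time rather than the original edit time.

```python
from datetime import datetime, timezone
from pathlib import Path


def last_updated_from_mtime(path: Path) -> datetime:
    """Return the file's modification time as a timezone-aware UTC datetime.

    Hypothetical helper for illustration; not part of coltrane.
    """
    # st_mtime is seconds since the epoch as recorded by the local filesystem.
    return datetime.fromtimestamp(path.stat().st_mtime, tz=timezone.utc)
```

If the deploy copies files in a way that preserves timestamps (for example an archive that keeps mtimes), the value would reflect the original edit; otherwise it is closer to a "deployed at" time.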
> We could handle publish_date the same as date. Although now that I'm thinking about it... I can't think of a time where I'd want both a date and a publish_date, so maybe it should just be publish_date since it's more explicit?
You are right. In my case, for example, I just use publish_date since it makes more sense, so we could just look for publish_date in the frontmatter. A new file in the common folder of the docs would be useful to document the standard keywords that coltrane supports.
> Yep, python-markdown2 handles the frontmatter already, but I think it would be useful to provide documentation about standard defaults and then some ways to get a list of tags -- your utilities/templatetags ideas make sense to me. I do wonder if there is an efficient way to collect all the tags so it doesn't take forever with a lot of markdown files? Maybe something with the manifest file I use when re-generating static files? Although, that doesn't get used in integrated mode currently I don't think. We could just use the Django cache system perhaps?
From what I've seen, manifest files are not used in any mode other than static, but yes, I think using them to store tags might be a good idea. If we choose this option, there should probably be a management command for users in integrated mode (I'm not sure if this applies to standalone mode) to generate these manifest files, plus documentation explaining that this is recommended to improve the performance of the provided tag utilities.
Small aside: using the manifest files just gave me the idea to use a similar file to store the indexes for the search feature with lunr.py.
The django caching system could be a great option since users could later add something like redis to get better performance, but I think starting with the manifest and implementing something cache-based later as the second option is better. My reasoning for this is that the in-memory cache that django uses by default is less reliable than disk files (manifest) in most cases, I think. Any other caching backend would require additional infrastructure, a database, redis, etc...
> Do you think using the markdown file's mtime would be sufficient as a proxy for last_updated? I'm unclear if that would work once you deploy a file code to production, i.e. would the markdown file modified datetime be from the original system or when it was modified on the prod system?
🤔 mtime may not be the best way to get a last_updated date in most cases, but in some cases it may be close enough, so we should at least give users the option to use it if they want. Perhaps a template tag? Or injecting the ManifestItem into the context?
Injecting the ManifestItem seems easier to implement: if the manifest is not present, the value will simply not be in the context.
This is another reason to add a generate-manifest-files management command for the integrated mode.
> The django caching system could be a great option since users could later add something like redis to get better performance, but I think starting with the manifest and implementing something cache-based later as the second option is better. My reasoning for this is that the in-memory cache that django uses by default is less reliable than disk files (manifest) in most cases, I think. Any other caching system would require additional infrastructure, a database, redis, etc...
On second thought, something based on Django's caching system would be easier to implement and there is a file-based backend that static sites could use instead of the default locmem if my concerns about it are valid.
I originally thought about the filesystem cache instead of manifest.json, but I liked the idea of something that was easily readable. But, I ended up writing a lot of code to create/load/parse that file which was pretty annoying. It might not be worth the hassle.
> A new file in the common folder of the docs would be useful to document the standard keyword that coltrane supports
I have this, but 1) it doesn't even have date and 2) it might be more useful split out into its own document to make it more clear anyway.
> I originally thought about the filesystem cache instead of manifest.json, but I liked the idea of something that was easily readable. But, I ended up writing a lot of code to create/load/parse that file which was pretty annoying. It might not be worth the hassle.
I'll try to implement something for tags based on the cache system and see how it goes. Maybe we could migrate the current manifest system to cache later to simplify the code.
> I have this, but 1) it doesn't even have date and 2) it might be more useful split out into its own document to make it more clear anyway.
I'm going to make a PR to rename date to publish_date and add a new section to the common docs. What should it be called? TemplateContext or maybe just Context?
Context works for me, thanks! Do you want a separate issue for the publish_date stuff?
> Context works for me, thanks! Do you want a separate issue for the publish_date stuff?
Nope, it is not necessary.
Here is a draft implementation of what the tag utilities might look like:

from dataclasses import dataclass
from itertools import chain
from typing import Iterable

from coltrane.config.cache import Cache
from coltrane.retriever import get_content_items, ContentItem


@dataclass
class DataCache(Cache):
    def __init__(self):
        super().__init__("CONTENT_ITEMS_CACHE")


def get_content_items_with_tags():
    # cache here
    # Keep only items that declare tags, skipping index pages.
    return [
        item
        for item in get_content_items(skip_draft=False)
        if item.metadata.get("tags") and not str(item.path).endswith("index.md")
    ]


def all_unique_tags() -> set[str]:
    # Flatten every item's tag list, then normalize whitespace and case.
    tags = chain(*[item.metadata.get("tags") for item in get_content_items_with_tags()])
    return {tag.strip().lower() for tag in tags}


# add an exclude parameter
# lru_cache maybe
def filter_by_tags(
    tags: list[str], include_all_tags: bool = False
) -> Iterable[ContentItem]:
    # all: the item must carry every requested tag; any: at least one.
    checker = all if include_all_tags else any
    return [
        item
        for item in get_content_items_with_tags()
        if checker(tag in item.metadata.get("tags") for tag in tags)
    ]
This is obviously far from complete; it's just to give you an idea. I will turn these into template tags. I'll stop bothering you for now @adamghill 😄 Have a nice weekend!
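One detail worth noting in the draft: all_unique_tags normalizes tags with strip().lower(), while filter_by_tags compares the raw strings, so a tag written as "Python" in one file would show up in the unique list as "python" but not match a filter on it. A self-contained sketch of applying the same normalization on both sides follows. FakeItem, normalize, and filter_items are hypothetical stand-ins so the example runs without coltrane installed.

```python
from dataclasses import dataclass, field
from typing import Iterable


@dataclass
class FakeItem:
    """Stand-in for ContentItem; only the metadata dict is modelled."""

    slug: str
    metadata: dict = field(default_factory=dict)


def normalize(tag: str) -> str:
    # Same normalization the draft applies in all_unique_tags.
    return tag.strip().lower()


def filter_items(
    items: Iterable[FakeItem], tags: list[str], include_all_tags: bool = False
) -> list[FakeItem]:
    wanted = {normalize(t) for t in tags}
    # all: every requested tag must be present; any: at least one.
    checker = all if include_all_tags else any
    result = []
    for item in items:
        item_tags = {normalize(t) for t in item.metadata.get("tags") or []}
        if item_tags and checker(t in item_tags for t in wanted):
            result.append(item)
    return result
```

With this, filter_items(items, ["python"]) matches an item tagged "Python", and include_all_tags=True narrows the result to items carrying every requested tag.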
Nice! Looking forward to seeing the end result. 👍 Do you think it would be useful to add a tags field to ContentItem? That might encapsulate the couple of item.metadata.get("tags") calls in your code above.
One other thing that might be useful for all_unique_tags (or another templatetag, maybe?) is the count of content that has a particular tag. Not sure if that would return a tuple or another dataclass or something else.
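One possible shape for the counts, sketched with the standard library only: a collections.Counter keyed by normalized tag. The function name tag_counts and the choice to pass in plain metadata dicts are assumptions for illustration, not an existing coltrane API.

```python
from collections import Counter
from typing import Iterable


def tag_counts(items_metadata: Iterable[dict]) -> Counter:
    """Count how many content items carry each tag (hypothetical helper)."""
    counts: Counter = Counter()
    for metadata in items_metadata:
        # A set, so a tag repeated inside one file is counted once for that file.
        counts.update({tag.strip().lower() for tag in metadata.get("tags") or []})
    return counts
```

A Counter behaves like a dict in templates and also offers most_common() for a "popular tags" list, which avoids defining a separate tuple or dataclass type.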
> Do you think it would be useful to add a tags field to ContentItem? That might encapsulate the couple of item.metadata.get("tags") in your code above.
Yes, I was thinking of adding a tags property to ContentItem, and later title and description for text-based search.
> One other thing that might be useful for all_unique_tags (or another templatetag, maybe?) is the count of content that has a particular tag. Not sure if that would return a tuple or another dataclass or something else.
I thought about it, and getting the count could be accomplished with the Django length filter combined with filter_by_tags. This would be explained in the docs of course, but I see no reason for a dedicated template tag at the moment.
> def get_content_items_with_tags():
>     # cache here
>     return [
>         item
>         for item in get_content_items(skip_draft=False)
>         if item.metadata.get("tags") and not str(item.path).endswith("index.md")
>     ]
I was also wondering: would it be better to cache only the items with tags and leave get_content_items_with_tags as is, or should we just cache all ContentItem objects when get_content_items is called? The second option could be useful for other use cases, for example the search feature or custom template tags that rely on get_content_items.
> maybe we should just cache all ContentItem when get_content_items is called
I think this makes sense if we can bust the cache intelligently.
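One way to bust such a cache, sketched under the assumption that all content lives as .md files under a single directory: keep a signature built from each file's path, mtime, and size, and reload only when that signature changes. MtimeCache and the loader callable are hypothetical names for illustration; this is an in-process sketch and does not use Django's cache framework.

```python
from pathlib import Path
from typing import Callable


class MtimeCache:
    """Cache a value derived from a directory of markdown files.

    The value is rebuilt whenever a file is added, removed, or modified.
    """

    def __init__(self, content_dir: Path, loader: Callable[[Path], object]):
        self.content_dir = content_dir
        self.loader = loader
        self._signature = None
        self._value = None

    def _current_signature(self) -> tuple:
        # (path, mtime_ns, size) for every markdown file under content_dir.
        return tuple(
            sorted(
                (str(p), p.stat().st_mtime_ns, p.stat().st_size)
                for p in self.content_dir.rglob("*.md")
            )
        )

    def get(self):
        signature = self._current_signature()
        if signature != self._signature:
            # Something changed on disk (or this is the first call): reload.
            self._value = self.loader(self.content_dir)
            self._signature = signature
        return self._value
```

Each get() still stats every file, which is much cheaper than re-parsing them but not free; on very large sites the check could be throttled or tied to a file watcher instead.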
Not sure if this is the right thread for my question. Does Coltrane support generating a separate page for each used category/tag (e.g. with categories or tags defined in the frontmatter) in 'static site' mode? What would be required to achieve this? Many thanks!
> Not sure if this is the right thread for my question. Does Coltrane support generating a separate page for each used category/tag (e.g. with categories or tags defined in the frontmatter) in 'static site' mode? What would be required to achieve this? Many thanks!
Hi @jimmybutton, as far as I know, no, Coltrane does not support this at the moment. There are only two ways I can see this working:
1. Create a category page for each category/tag you have in advance and then use the directory_contents template tag to filter your content. When your site is built, it will include a page for each category with links to related content. This sounds really tedious though; maybe a command for this is a feature worth considering, @adamghill.
2. Write some JS to do the filtering in real time and ship that JS code with your site. Lunr.js could perhaps help.
I've been busy lately, but I had already planned to build a tag filtering and search feature for Coltrane. What I have in mind will only work in integrated and standalone mode, though. I don't see an easy way to make it work in static mode right now.
I'm sorry I could not be of more help.
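To make the first option less tedious, the per-tag pages could be generated by a small script before the static build. The sketch below is only an illustration under several assumptions: the mapping of content slugs to tags is collected elsewhere, links use a /slug URL scheme, and each page is a plain markdown link list rather than a page using the directory_contents template tag. write_tag_pages is a hypothetical name, not a coltrane command.

```python
from collections import defaultdict
from pathlib import Path


def write_tag_pages(tagged: dict[str, list[str]], out_dir: Path) -> list[Path]:
    """Write one markdown page per tag.

    tagged maps a content slug to its list of tags (assumed input shape).
    """
    by_tag = defaultdict(set)
    for slug, tags in tagged.items():
        for tag in tags:
            by_tag[tag.strip().lower()].add(slug)

    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for tag in sorted(by_tag):
        lines = [f"# {tag}", ""]
        # Assumed URL scheme: each slug is served at /<slug>.
        lines += [f"- [{slug}](/{slug})" for slug in sorted(by_tag[tag])]
        # Tags containing slashes or other path characters would need escaping.
        page = out_dir / f"{tag}.md"
        page.write_text("\n".join(lines) + "\n", encoding="utf-8")
        written.append(page)
    return written
```

Running this before the build would leave ordinary markdown files in the content directory, so the static output would include one page per tag without any changes to the build itself.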
@Tobi-De Thanks for your reply and great ideas 👍! I think I'll have a go at the first option you described and see if I can get it working.
Look through https://gohugo.io/content-management/front-matter/ and see what makes sense (and is easy) to support.
Some of these might not need explicit support, but could just be added to documentation.
After skimming the list:
- (datetime, could be used by url?)
- (bool, respected by output command?)
- (datetime, respected by output command)