optimize file parsing, by reading header separately from body

cloudhead / toto

the 10 second blog-engine for hackers

MIT License


optimize file parsing, by reading header separately from body #6

Open cloudhead opened 14 years ago

davejacobs commented 14 years ago

What if you dynamically created an article index file (YAML or something) via the rake publish command (or earlier, via some sort of rake index command)? This file would contain an index entry for each file name (specifically for determining file order, and possibly including metadata), which would free us from putting the date in each file's name.
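A rough sketch of what such an index step might look like, tying in the issue title by reading only each article's header and never its body. Everything here is an assumption for illustration: the method names (`read_header`, `build_index`, `write_index`), the `.txt` extension, the index layout, and the idea that each article starts with a YAML header (with a `date`) that ends at the first blank line. It is not toto's actual code.

```ruby
require "yaml"
require "date"

# Read only the header: every line up to the first blank line.
# Assumes a YAML header, then a blank line, then the body.
def read_header(path)
  header = +""
  File.foreach(path) do |line|
    break if line.strip.empty? # stop before the body is ever read
    header << line
  end
  YAML.safe_load(header, permitted_classes: [Date, Time]) || {}
end

# Build one index entry per article, newest first.
# Assumes every header carries a parseable `date`.
def build_index(articles_dir)
  entries = Dir.glob(File.join(articles_dir, "*.txt")).map do |path|
    meta = read_header(path)
    {
      "file"  => File.basename(path),
      "title" => meta["title"],
      "date"  => meta["date"].to_s
    }
  end
  entries.sort_by { |entry| Date.parse(entry["date"]) }.reverse
end

# Write the index as YAML; a hypothetical `rake index` task could call this.
def write_index(articles_dir, index_path)
  File.write(index_path, build_index(articles_dir).to_yaml)
end
```

Since the order comes from the header's `date`, the file name no longer has to carry it.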

I'm thinking that people who want to use Ruby to blog need several options, from static frameworks/generators (like Webby) to dynamic, database-free solutions (like Toto) to simple database solutions (like Sinatra + ActiveRecord) to full solutions (like Rails).

It seems to me that Toto could expand its scope just a little bit to give database-like functionality (via an index file) without actually using any databases. This kind of setup wouldn't even require write access on a server (like Heroku), as the file is generated before uploading/git push.

What do you think? Should we abstract metadata (like date and order) away from the file name?

cloudhead commented 14 years ago

From a usability point of view, I must agree: not having to worry about the filename is a plus. I'm not sure the extra complexity is worth it though; I'd have to think about it. Maybe if it was in the form of a 'module', separate from toto.rb, it could work.

cloudhead commented 14 years ago

As I understand, this wouldn't only solve the file-name problem though, but would give the ability to have multiple indices on the articles. This is the part which I think would add complexity.

Indexing slug -> filename mappings would probably be pretty simple, and could indeed be refreshed with a rake index command. It should be an optional module though. I think people will forget to regenerate the index pretty often.
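For illustration, a minimal sketch of the slug -> filename mapping. It assumes article files are named like `2009-12-25-some-title.txt` in a single directory; the method name `slug_index` and that naming convention are assumptions, not something toto provides.

```ruby
# Map each slug to its file name by stripping an optional leading
# YYYY-MM-DD- prefix and the .txt extension.
def slug_index(articles_dir)
  Dir.glob(File.join(articles_dir, "*.txt")).each_with_object({}) do |path, index|
    file = File.basename(path)
    slug = File.basename(file, ".txt").sub(/\A\d{4}-\d{2}-\d{2}-/, "")
    index[slug] = file
  end
end
```

A lookup is then just `slug_index("articles")["some-title"]`, and the hash could be dumped to YAML by an optional rake task.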

davejacobs commented 14 years ago

Thanks for the reply.

I think the extra complexity could be worth the effort, and I'm willing to put that effort in. Specifically, making these changes will allow me to omit the post day from my permalinks but still load the articles properly and in order. (And perhaps faster.)

It will also make queries based on article metadata much easier/faster, for example selecting all pages with a certain tag or author.
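As a sketch of what those queries could look like once the metadata sits in an index: assume the index is loaded as an array of hashes with hypothetical `"tags"` and `"author"` keys (neither the keys nor the method names come from toto).

```ruby
# Select index entries carrying a given tag.
# Array() tolerates a missing tag list or a single tag string.
def with_tag(index, tag)
  index.select { |entry| Array(entry["tags"]).include?(tag) }
end

# Select index entries written by a given author.
def by_author(index, author)
  index.select { |entry| entry["author"] == author }
end

index = [
  { "file" => "a.txt", "author" => "dave",      "tags" => ["ruby", "toto"] },
  { "file" => "b.txt", "author" => "cloudhead", "tags" => "ruby" },
  { "file" => "c.txt", "author" => "dave" }
]

ruby_posts = with_tag(index, "ruby")   # entries for a.txt and b.txt
dave_posts = by_author(index, "dave")  # entries for a.txt and c.txt
```

No article file is opened for either query; only the index is scanned.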

A couple of thoughts:

- I agree the changes should be modular and optional.
- I agree that the slug and filename could be mapped.
- To solve the index problem, one might:
  - update the index after a person first creates and saves an article with rake new (not perfect, but a good start)
  - warn the user during rake publish if the index file is older than the latest article file (this obviously doesn't help people who use git commands instead)
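
The staleness warning could be sketched roughly like this, comparing modification times. The method name `index_stale?`, the `.txt` extension, and the paths are assumptions for illustration; a publish task would call it and print a warning.

```ruby
# True when the index is missing or older than the newest article file.
def index_stale?(articles_dir, index_path)
  return true unless File.exist?(index_path)

  newest = Dir.glob(File.join(articles_dir, "*.txt")).map { |path| File.mtime(path) }.max
  !newest.nil? && newest > File.mtime(index_path)
end

# Hypothetical use inside a publish task:
#   warn "index is out of date, run rake index" if index_stale?("articles", "index.yml")
```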

What do you think?

Maybe I'll try to whip up a module that implements pseudo-database functionality and see where things go.