Strip extension from URLs

wichert commented 12 years ago

For websites it is commonly not required to use an extension for pages. W3C even discourages it since it needlessly exposes an implementation detail in URLs. Not using extensions also allows for content negotiation, where a web server can, for example, choose between html, wap and xml versions of a page.

Since I am trying to migrate an existing site that does not use extensions, and since I personally prefer not to use extensions for documents in URLs, I am looking for a way to have StrangeCase strip the .html extension from the url() result.

colinta commented 12 years ago

This should be easy, either using a configurator or by changing the rename_extensions mapping.

The config['configurators'] setting is a list of functions that get handed the source_file and "current" config (as it has been determined based on defaults and other configurators). Returning None will result in the node being skipped.
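
To illustrate that contract, here is a minimal sketch of how a chain of configurators could be applied. The `apply_configurators` helper and the two sample configurators are hypothetical and are not StrangeCase's actual code; only the `(source_file, config)` signature and the "None means skip" rule come from the description above.

```python
def apply_configurators(source_file, config, configurators):
    """Run each configurator in order; return None as soon as one skips the node."""
    for configurator in configurators:
        config = configurator(source_file, config)
        if config is None:
            return None
    return config


def skip_hidden(source_file, config):
    # Hypothetical configurator: skip dotfiles by returning None.
    if source_file.startswith('.'):
        return None
    return config


def set_target_name(source_file, config):
    # Hypothetical configurator: use the source file name as the target name.
    config['target_name'] = source_file
    return config


configurators = [skip_hidden, set_target_name]
kept = apply_configurators('about.html', {}, configurators)
skipped = apply_configurators('.DS_Store', {}, configurators)
print(kept)
print(skipped)
```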

Here's one that would do the trick:

def strip_html_ext(source_file, config):
    if config['target_name'][-5:] == '.html':
        config['target_name'] = config['target_name'][:-5]
    return config

Add that to the list using config.py:

config.py

from strange_case.strange_case_config import CONFIG  # the defaults

def strip_html_ext(source_file, config):
    if config['target_name'][-5:] == '.html':
        config['target_name'] = config['target_name'][:-5]
    return config

CONFIG['configurators'].append(strip_html_ext)

Another way to do this is to use the "rename_extensions" configuration, and change the defaults to be empty strings:

rename_extensions:
    .j2: ""
    .jinja2: ""
    .jinja: ""
    .md: ""

Does this solve things?

colinta commented 12 years ago

I'm playing with this, but it messes with some things... namely the "auto-detection" of html files by checking the extension.

If there's a way you would prefer to fix this, let me know.

wichert commented 12 years ago

Python style nitpick: it is more efficient to do config['target_name'].endswith('.html') than slicing & comparing.

I think the main issue with using configurators and rename_extensions is that they work on filenames coming in, while stripping extensions should probably only work on urls being generated. My gut feeling is that the simplest solution is a strip_extensions: [".html", ".xml"] setting in config.yaml and check that in the url() methods.
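
A rough sketch of that check, assuming the list from config.yaml is available as a plain Python list. `strip_url_extension` is a hypothetical helper, not an existing StrangeCase function; a real url() method would call something like it on the URL it has already built.

```python
def strip_url_extension(url, strip_extensions):
    """Return url without its extension if it ends with one of strip_extensions."""
    for extension in strip_extensions:
        # Skip empty entries so that url[:-0] never wipes out the whole URL.
        if extension and url.endswith(extension):
            return url[:-len(extension)]
    return url


strip_extensions = ['.html', '.xml']  # assumed value of the proposed setting
page_url = strip_url_extension('/about/team.html', strip_extensions)
feed_url = strip_url_extension('/feed.xml', strip_extensions)
image_url = strip_url_extension('/images/logo.png', strip_extensions)
print(page_url, feed_url, image_url)
```

Files keep their real names on disk this way; only the generated links change.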

colinta commented 12 years ago

Ahhh! This sounds much easier. Pulling off the extension from the file was causing all sorts of strange side effects.

So you're fine with a file called foo.html, but the URL should be /foo - is that right?

config.py

from strange_case.strange_case_config import CONFIG

def strip_html_ext(source_file, config):
    if config['url'].endswith('.html'):  # much nicer, thanks :-)
        config['url'] = config['url'][:-len('.html')]  # slice off the suffix; rstrip() strips characters, not a suffix
    return config

CONFIG['configurators'].append(strip_html_ext)

Or, going with your configuration solution:

from strange_case.strange_case_config import CONFIG

CONFIG['strip_extensions'] = ['.html', '.xml']

def strip_extension(source_file, config):
    for extension in config.get('strip_extensions', []):
        if config['url'].endswith(extension):
            config['url'] = config['url'][:-len(extension)]  # slice off the suffix; rstrip() strips characters, not a suffix
            break
    return config

CONFIG['configurators'].append(strip_extension)
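
One caveat when removing a suffix in Python: `str.rstrip` treats its argument as a set of characters to strip, not as a suffix, so it can eat into the name itself. Slicing by the extension's length after an `endswith` check avoids that. A small illustration (the sample URL is made up):

```python
url = '/path.html'
extension = '.html'

# rstrip() removes any trailing '.', 'h', 't', 'm' or 'l' characters.
stripped_chars = url.rstrip(extension)

# Slicing removes exactly the suffix, and only when it is present.
stripped_suffix = url[:-len(extension)] if url.endswith(extension) else url

print(stripped_chars)   # /pa
print(stripped_suffix)  # /path
```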

wichert commented 12 years ago

That is correct: I only care about the URL being generated. I have no problem with using the actual filename in other places, I just don't want to necessarily expose that detail in the generated site.

That configuration solution looks useful. Is that something you would want to add to StrangeCase itself?

colinta commented 12 years ago

Hmm... not sure how I feel about it. I've been thinking that the "core" is getting bulky, and that I could trim it down by moving things into the extensions/ folder (e.g. auto-detecting created_at and order from the file name, detecting a default human-readable title from the filename, etc). This would be a good candidate for that.

In that case, I might have to give the extensions a way of adding default values to the configuration, which they currently cannot do. I would think that including strip_extensions would include its default configuration as well.

I'm gonna ponder this for a day. I will, most likely, add it to the extensions/ folder (and move other configurators in there as well).

Thanks! You are the first person, to my knowledge, that has "stumbled" upon StrangeCase. A lot of my friends who are designers have really taken to it (one is switching from WordPress, and she really likes it). I'm wondering how you found it, and what made it stand out compared to hyde.

wichert commented 12 years ago

I actually hadn't run into hyde before. The reason I picked StrangeCase over other static website generators is mostly that pretty much every generator out there assumes you are making a blog and only allows listing a single kind of thing. That is fine for a blog, but if you want to use multiple lists you are stuck.

colinta commented 12 years ago

Good to hear! That is one of the reasons I wrote StrangeCase. Any site you can mock up, you should be able to build more easily with StrangeCase. That was my main goal. The other was simplicity and a straightforward design. I've strayed pretty far from that, as I've added features I needed. The next refactor is gonna bring all that back in line, I hope.

wichert commented 12 years ago

I was wondering if you've completed your pondering, or are you considering doing something related as part of the Next Refactor?

colinta commented 12 years ago

Definitely gonna add it as an extension, and I will be moving existing things into there as well.

Hopefully I'll have time to do this soon, though it does require a refactor. The last time I sat down to do it, I ended up focused on performance (good news: improvements on very large sites!).

colinta commented 12 years ago

pushed! welcome, v4.0.0.

colinta / StrangeCase

Strip extension from URLs #5