antwarjs / antwar

A static site generator built with React and Webpack.
https://antwar.js.org/
MIT License
460 stars 35 forks

Implement blogger/ghost importer #1

Closed (bebraw closed this issue 9 years ago)

bebraw commented 9 years ago

This can be developed on top of the JSON output of blogger2ghost (https://github.com/bebraw/blogger2ghost). We just need to convert that JSON to Markdown with suitable YAML frontmatter.

eldh commented 9 years ago

Good idea. I suppose we could use commonmark to convert to Markdown, and then just smack the rest (at least the relevant parts) into the frontmatter. You might want an option to clean the data, so that some keys are renamed or filtered out.
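A minimal sketch of the "rest goes into the frontmatter, with an option to clean the keys" idea. The post shape (`title`, `markdown`, `published_at`, `author_id`) and the `defaultMapping` table are assumptions for illustration, not the converter's confirmed output, and the sketch assumes the content is already Markdown. Values are quoted with `JSON.stringify`, which YAML accepts as double-quoted strings.

```javascript
// Hypothetical post: the field names are assumptions, not taken from real converter output.
var examplePost = {
    title: 'Hello: world',
    markdown: 'First paragraph.',
    published_at: '2015-02-27T13:04:00.000Z',
    author_id: 1
};

// Source key -> frontmatter key. Keys that are not listed are filtered out.
var defaultMapping = {
    title: 'title',
    published_at: 'date'
};

// Quote the value as a JSON string so colons and quotes cannot break the YAML.
function toYamlLine(key, value) {
    return key + ': ' + JSON.stringify(String(value));
}

// Build the full file: frontmatter block first, then the Markdown content.
function toMarkdownFile(post, mapping) {
    var lines = Object.keys(mapping).filter(function (key) {
        return post[key] !== undefined && post[key] !== null;
    }).map(function (key) {
        return toYamlLine(mapping[key], post[key]);
    });

    return ['---'].concat(lines, ['---', post.markdown, '']).join('\n');
}

console.log(toMarkdownFile(examplePost, defaultMapping));
```

Passing a different mapping object is enough to rename or drop keys, so the cleaning step needs no separate pass.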

bebraw commented 9 years ago

Here's an example of data you get out of the converter.

We should specify a mapping for the data. As you can see, there's a lot more data than in the YAML frontmatter at the moment. I think it would be a good idea to split the date (created, published, updated?). If a post doesn't have a published date, you can probably consider it a draft and exclude it from the output.
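A rough sketch of the date split and the draft rule described above. The field names `created_at`, `published_at` and `updated_at` are assumptions about the JSON, not confirmed by the example data.

```javascript
// Split the single date into three explicit fields; missing ones become null.
function splitDates(post) {
    return {
        created: post.created_at || null,
        published: post.published_at || null,
        updated: post.updated_at || null
    };
}

// A post without a published date is treated as a draft.
function isPublished(post) {
    return Boolean(post.published_at);
}

var posts = [
    { title: 'Live post', created_at: '2015-01-01', published_at: '2015-01-02' },
    { title: 'Draft post', created_at: '2015-01-03', published_at: null }
];

// Only published posts make it to the output.
var published = posts.filter(isPublished);

console.log(published.map(function (post) { return post.title; }));
```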

I am a bit unsure about headerImage. If a post has images, maybe the first one could be picked. Is there a better heuristic? What if the image happens to be too small? Does the size matter?

preview is another one, as the data is missing that. Maybe it could be just the first n words of the content? This rule could be configurable.
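One possible shape for that rule, as a sketch only. `makePreview` is a hypothetical name, and the Markdown stripping is deliberately crude (it simply drops a few marker characters).

```javascript
// Keep the first `wordCount` words of the content as a preview.
function makePreview(content, wordCount) {
    var words = content
        .replace(/[#*_`>]/g, '') // crude removal of common Markdown markers
        .split(/\s+/)
        .filter(Boolean);
    var preview = words.slice(0, wordCount).join(' ');

    // Mark the preview as truncated only when words were actually cut.
    return words.length > wordCount ? preview + '...' : preview;
}

console.log(makePreview('# Hello *world* this is a post', 3));
```

The word count is a plain argument, so it can come straight from configuration.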

Tags are quite simple. I guess in this case you would just generate a list of tag names per post. If you want to make this more flexible, ids (these can be slugs; no numbers needed) can be used instead. That would be a good move, as it would allow i18n over the longer term.
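A sketch of the slug-as-id variant. The `tags` and `postsTags` shapes are assumptions about what the exported data might contain; the point is only that each post ends up with a list of slugs while display names live elsewhere.

```javascript
// Hypothetical data: a tag table plus a post-to-tag link table.
var tags = [
    { id: 1, name: 'JavaScript', slug: 'javascript' },
    { id: 2, name: 'Static Sites', slug: 'static-sites' }
];
var postsTags = [
    { post_id: 10, tag_id: 1 },
    { post_id: 10, tag_id: 2 },
    { post_id: 11, tag_id: 2 }
];

// Return the tag slugs for one post, in link-table order.
function tagSlugsForPost(postId, tags, postsTags) {
    var slugById = {};

    tags.forEach(function (tag) {
        slugById[tag.id] = tag.slug;
    });

    return postsTags.filter(function (link) {
        return link.post_id === postId;
    }).map(function (link) {
        return slugById[link.tag_id];
    });
}

// Display names keyed by slug, so a translation is just another table.
var tagNames = {
    en: { 'javascript': 'JavaScript', 'static-sites': 'Static Sites' }
};

console.log(tagSlugsForPost(10, tags, postsTags));
```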

Routing is one extra thing to worry about, although it sounds like a separate issue. It would be good if my original blog links still worked (rerouting to the new ones). I guess this might mean I would need to generate an extra file for you with the mappings. How does that sound?

eldh commented 9 years ago

I think providing a barebones default mapping would be good, but being able to change/add to that mapping would be nice.

Antwar, at its core, only cares about two things in the .md files: the title and the content. The rest depends on what the theme (in this case Post.coffee and its sub-components) uses. And everything in the frontmatter part will be available through the PathsMixin. headerImage, for example, is just an optional field that's only relevant for this theme.

Yes, routing is an issue. Right now we generate the url from the file name, but we could look in the frontmatter as well. Still, a mapping table to do 301 redirects is probably what you want. I don't really know how that works though. Putting the original url in the frontmatter should give us everything we need, I suppose.
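A sketch of how such a mapping table could be derived if the original url is kept in the frontmatter. `originalUrl` is a hypothetical frontmatter key and the page objects are made up for illustration. How the table is then served (web server config, generated stub pages) is left open here.

```javascript
// Hypothetical pages: `url` is the new location, `originalUrl` the old blog link.
var pages = [
    {
        url: '/blog/hello-world',
        frontmatter: { title: 'Hello', originalUrl: '/2014/05/hello-world.html' }
    },
    {
        url: '/blog/no-legacy',
        frontmatter: { title: 'New post' }
    }
];

// Collect one 301 entry per page whose old url differs from its new one.
function buildRedirects(pages) {
    return pages.filter(function (page) {
        var original = page.frontmatter.originalUrl;

        return Boolean(original) && original !== page.url;
    }).map(function (page) {
        return { from: page.frontmatter.originalUrl, to: page.url, status: 301 };
    });
}

console.log(buildRedirects(pages));
```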

bebraw commented 9 years ago

> Antwar, at its core, only cares about two things in the .md files: the title and the content. The rest depends on what the theme (in this case Post.coffee and its sub-components) uses. And everything in the frontmatter part will be available through the PathsMixin. headerImage, for example, is just an optional field that's only relevant for this theme.

Ok. I guess headerImage related logic, if needed, could be pushed to some plugin. I see now that the basic importer can be very simple.

The same idea applies to tags, so that goes beyond the basic requirement as well.

Images are going to be a bit of a problem. I would rather host them myself in the repository. I suppose this should go to a plugin that goes through the urls, fetches the data and replaces the urls with the correct references. Finally you should just commit that to the repo and off you go.

> Yes, routing is an issue. Right now we generate the url from the file name, but we could look in the frontmatter as well. Still, a mapping table to do 301 redirects is probably what you want. I don't really know how that works though. Putting the original url in the frontmatter should give us everything we need, I suppose.

What if there was a transformation function that would take post metadata as an input and output url(s)? You could generate whatever slugs you need or more complicated structures even and define redirects here. Basic idea:

// basic: one url per post (slugify and dateify are assumed helpers, not defined in this snippet)
function basicUrlify(post) {
    return post.category + '/' + slugify(post.title);
}

// multiple urls per post
function multipleUrlify(post) {
    return [post.category + '/' + slugify(post.title), dateify(post.date) + '/' + slugify(post.title)];
}

// redirect
function redirectUrlify(post) {
    var to = dateify(post.date) + '/' + slugify(post.title);

    return [
        {
            from: post.category + '/' + slugify(post.title),
            to: to,
        },
        to,
    ];
}

As you can see, this could be a good extension point and it gives you a lot of control over the shape of urls.
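To make the idea concrete, here is a sketch of how a caller might consume any of the three return shapes. The `slugify` and `dateify` bodies are assumed implementations (the snippet above does not define them), and `resolveUrls` is a hypothetical name.

```javascript
// Assumed helper: lowercase, collapse non-alphanumerics into dashes, trim dashes.
function slugify(text) {
    return String(text)
        .toLowerCase()
        .replace(/[^a-z0-9]+/g, '-')
        .replace(/^-+|-+$/g, '');
}

// Assumed helper: turn a date into a YYYY/MM/DD path segment (UTC).
function dateify(date) {
    return new Date(date).toISOString().slice(0, 10).replace(/-/g, '/');
}

// Accept a string, an array of strings, or an array mixing {from, to} objects.
function resolveUrls(urlify, post) {
    var result = { urls: [], redirects: [] };

    [].concat(urlify(post)).forEach(function (entry) {
        if (typeof entry === 'string') {
            result.urls.push(entry);
        } else {
            result.redirects.push(entry);
        }
    });

    return result;
}

// Same shape as the redirect example in the comment above.
function redirectUrlify(post) {
    var to = dateify(post.date) + '/' + slugify(post.title);

    return [{ from: post.category + '/' + slugify(post.title), to: to }, to];
}

var post = { title: 'Hello, World!', category: 'news', date: '2015-02-27' };

console.log(resolveUrls(redirectUrlify, post));
```

Normalizing everything into `urls` and `redirects` keeps the generator itself unaware of which of the three styles a site chose.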

eldh commented 9 years ago

The URL functions look good!

Images should be placed in the assets folder and then the corresponding urls have to be rewritten.
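A sketch of the rewriting half of that step, assuming the content is already Markdown and uses `![alt](url)` image syntax. `localizeImages` and the `assets/images` prefix are hypothetical. The function only returns the list of files to fetch; the actual downloading is left out, and file name collisions are not handled.

```javascript
// Matches Markdown images that point at absolute http(s) urls.
var IMAGE_PATTERN = /!\[([^\]]*)\]\((https?:\/\/[^)\s]+)\)/g;

// Rewrite remote image urls to local asset paths and list what must be fetched.
function localizeImages(markdown, assetsPrefix) {
    var downloads = [];
    var rewritten = markdown.replace(IMAGE_PATTERN, function (match, alt, url) {
        // Drop any query string, then keep the last path segment as the file name.
        var fileName = url.split('?')[0].split('/').pop();
        var localPath = assetsPrefix + '/' + fileName;

        downloads.push({ from: url, to: localPath });

        return '![' + alt + '](' + localPath + ')';
    });

    return { markdown: rewritten, downloads: downloads };
}

var result = localizeImages(
    'Intro ![Cat](http://example.com/img/cat.png?w=200) end',
    'assets/images'
);

console.log(result.markdown);
```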

And yes, the headerImage url logic should be moved.

bebraw commented 9 years ago

Can you provide a minimum spec for the mapping (ie. which fields should I generate)? I could try to whip up something.

I'll open a separate issue for that url thing.

eldh commented 9 years ago

Right now it looks like this:

YYYY-MM-DD-[url].md

---
title: [title]
---
[content]

You could also put the url in the frontmatter and add a line in paths.coffee -> allPosts() to check for the url there, I suppose.
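A sketch of reading that format, with the url taken from the frontmatter when present and from the file name otherwise. This is not how paths.coffee actually works; the function names are hypothetical, and the frontmatter parser only handles flat `key: value` lines rather than general YAML.

```javascript
// Split "YYYY-MM-DD-[url].md" into its date and url parts.
function parseFileName(fileName) {
    var match = /^(\d{4}-\d{2}-\d{2})-(.+)\.md$/.exec(fileName);

    return match ? { date: match[1], url: match[2] } : null;
}

// Separate the frontmatter block from the content.
function parseFile(text) {
    var match = /^---\n([\s\S]*?)\n---\n([\s\S]*)$/.exec(text);
    var frontmatter = {};

    if (!match) {
        return { frontmatter: frontmatter, content: text };
    }

    match[1].split('\n').forEach(function (line) {
        var index = line.indexOf(':');

        if (index > 0) {
            frontmatter[line.slice(0, index).trim()] = line.slice(index + 1).trim();
        }
    });

    return { frontmatter: frontmatter, content: match[2] };
}

// Prefer an explicit url in the frontmatter, fall back to the file name.
function postUrl(fileName, text) {
    var fromName = parseFileName(fileName);

    return parseFile(text).frontmatter.url || (fromName && fromName.url);
}

console.log(postUrl('2015-02-27-hello-world.md', '---\ntitle: Hello\n---\nBody'));
```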

Is that enough to get you started?

bebraw commented 9 years ago

Yeah, I'll do a quick stab at it in a separate repo. The tool will look like this: ghost2antwar < inputDirectory > outputDirectory.

I'm not sure about that paths.coffee -> allPosts() bit but maybe you can deal with that. The basic converter should give you some data to play with.

Ghost can import WordPress too, so you get a route from WordPress to antwar as well.

eldh commented 9 years ago

should I create an org for antwar and add you as a maintainer? Then we can keep all the repos there?

On Fri, Feb 27, 2015 at 1:04 PM, Juho Vepsäläinen notifications@github.com wrote:

> Yeah, I'll do a quick stab at it in a separate repo. The tool will look like this: ghost2antwar < inputDirectory > outputDirectory. I'm not sure about that paths.coffee -> allPosts() bit but maybe you can deal with that. The basic converter should give you some data to play with.
>
> Ghost can import Wordpress too so you get a route from Wordpress to antwar too.

Reply to this email directly or view it on GitHub: https://github.com/eldh/antwar/issues/1#issuecomment-76384860

bebraw commented 9 years ago

> should I create an org for antwar and add you as a maintainer? Then we can keep all the repos there?

Sounds good!

eldh commented 9 years ago

damnit, antwar is taken...

On Fri, Feb 27, 2015 at 1:10 PM, Juho Vepsäläinen notifications@github.com wrote:

> > should I create an org for antwar and add you as a maintainer? Then we can keep all the repos there?
>
> Sounds good!

bebraw commented 9 years ago

I set up a basic importer (https://github.com/antwarjs/ghost2antwar). You should be able to run the demo. The CLI hasn't been set up yet (I'll do that in a bit).

Further issues should go to that repo.