axitkhurana / buster

Brute force static site generator for Ghost
MIT License

fixing links for static pages #55

Closed: dhfromkorea closed this issue 9 years ago

dhfromkorea commented 9 years ago

Hi,

I am running Buster 0.1.3 on OS X Mavericks.

Problem: Buster's fixLinks function does not seem to convert relative links to static pages correctly.

Example: I have a static page at "/project-portfolio".

When the site is previewed or deployed by Buster, the link "/project-portfolio" becomes "/project-portfolio.1". Notice the trailing dot and integer.

Clarification

  1. This problem does not occur when the site is served locally by Ghost on port 2368.
  2. The page '/project-portfolio' DOES exist when the site is previewed or deployed by Buster.

Log

Here's what I get when running "buster generate":

fixing links in /Users/dh/Dropbox/master_personal/code_dh/scripts/blogs/blog_kr/static/index.html ... project-portfolio.2 => project-portfolio.2 ...

dhfromkorea commented 9 years ago

UPDATE: It seems the problem occurs when page_name/index.html is generated by the "wget" command. The trailing dot and integer are appended when a file with the same name already exists in the output path.

This will cause the canonical link in static/project-portfolio/index.html to be /project-portfolio.1.
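One way to check a generated tree for this symptom is to look for names that end in a dot and an integer and that sit next to an entry with the same name minus that suffix. The following is only a minimal sketch: the function name `find_numbered_duplicates` is hypothetical and not part of Buster, and the output directory (for example `static`) is an assumption.

```python
import os
import re

# Matches names that end in ".<integer>", e.g. "project-portfolio.1".
NUMBERED_SUFFIX = re.compile(r"^(?P<base>.+)\.(?P<n>\d+)$")


def find_numbered_duplicates(root):
    """Return sorted paths under `root` whose name looks like a numbered duplicate.

    A name counts only if the un-suffixed name also exists in the same
    directory, which avoids flagging names that merely end in digits.
    (Hypothetical helper, not part of Buster.)
    """
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        names = set(dirnames) | set(filenames)
        for name in names:
            match = NUMBERED_SUFFIX.match(name)
            if match and match.group("base") in names:
                hits.append(os.path.join(dirpath, name))
    return sorted(hits)
```

Calling `find_numbered_duplicates("static")` after "buster generate" would list entries such as `static/project-portfolio.1` if they exist.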

dhfromkorea commented 9 years ago

Hi, I've found the reason:

There were two links pointing to the same static page: one had a trailing slash and the other did not. As a result, wget treated one as a directory and the other as a file.

Ghost's local server instance considers the two to be the same.

If enough people are getting caught by this, perhaps Buster could, for the sake of end-user experience, gather the file paths up front, map over them to add a trailing slash, and run wget on the resulting list.

From the wget manual:

Note that, for HTTP (and HTTPS), the trailing slash is very important to ‘--no-parent’. HTTP has no concept of a “directory”—Wget relies on you to indicate what’s a directory and what isn’t. In ‘http://foo/bar/’, Wget will consider ‘bar’ to be a directory, while in ‘http://foo/bar’ (no trailing slash), ‘bar’ will be considered a filename (so ‘--no-parent’ would be meaningless, as its parent is ‘/’).

Source: http://www.gnu.org/software/wget/manual/wget.html
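The suggested pre-processing step could look roughly like the sketch below. It is not Buster's code: the names `add_trailing_slash` and `normalize_links` are hypothetical, and treating any last path segment that contains a dot as a file (and leaving it untouched) is an assumption made for illustration.

```python
import posixpath
from urllib.parse import urlsplit, urlunsplit


def add_trailing_slash(url):
    """Append "/" to page-like URLs so wget treats them as directories.

    Assumption: a last path segment containing "." is a file
    (e.g. "feed.xml") and is left unchanged.
    """
    parts = urlsplit(url)
    path = parts.path or "/"
    last_segment = posixpath.basename(path)
    if not path.endswith("/") and "." not in last_segment:
        path += "/"
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


def normalize_links(urls):
    """Normalize every URL and drop duplicates, keeping first-seen order."""
    seen = set()
    result = []
    for url in map(add_trailing_slash, urls):
        if url not in seen:
            seen.add(url)
            result.append(url)
    return result
```

With this, "/project-portfolio" and "/project-portfolio/" collapse into a single entry. The resulting list could then be written to a file and handed to wget through its `--input-file` option, though whether that fits Buster's existing wget invocation is an open question.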