gohugoio / hugo

The world’s fastest framework for building websites.
https://gohugo.io
Apache License 2.0

import from jekyll fails with "index out of range" panic #3895

Closed (garfieldnate closed this issue 6 years ago)

garfieldnate commented 7 years ago

I'm trying to convert my Jekyll site (source: https://github.com/garfieldnate/garfieldnate.github.io) with the following command:

hugo import jekyll jekyll_blog_path new_blog_path

Here's the output:

Importing...
panic: runtime error: index out of range

goroutine 1 [running]:
github.com/gohugoio/hugo/commands.replaceImageTag(0xc4203c338d, 0xec, 0x0, 0xc420692000)
    /private/tmp/hugo-20170913-45374-38zfps/hugo-0.27.1/src/github.com/gohugoio/hugo/commands/import_jekyll.go:588 +0x467
regexp.(*Regexp).ReplaceAllStringFunc.func1(0xc420692000, 0x90d, 0xa80, 0xc42021a140, 0x2, 0x2, 0x0, 0xa80, 0x0)
    /usr/local/Cellar/go/1.9/libexec/src/regexp/regexp.go:505 +0x8d
regexp.(*Regexp).replaceAll(0xc420688140, 0x0, 0x0, 0x0, 0xc4203c2a80, 0x9ff, 0x2, 0xc4206c7170, 0xa80, 0x120dbf0, ...)
    /usr/local/Cellar/go/1.9/libexec/src/regexp/regexp.go:543 +0x2eb
regexp.(*Regexp).ReplaceAllStringFunc(0xc420688140, 0xc4203c2a80, 0x9ff, 0x1805160, 0x13, 0xc4203c2a80)
    /usr/local/Cellar/go/1.9/libexec/src/regexp/regexp.go:504 +0xa9
github.com/gohugoio/hugo/commands.convertJekyllContent(0x17069c0, 0xc42042eff0, 0xc4203c2a80, 0x9ff, 0xc4203c2000, 0x9ff)
    /private/tmp/hugo-20170913-45374-38zfps/hugo-0.27.1/src/github.com/gohugoio/hugo/commands/import_jekyll.go:577 +0x4b7
github.com/gohugoio/hugo/commands.convertJekyllPost(0xc4200d8500, 0xc420551a00, 0x78, 0xc42001d280, 0x37, 0xc4202100a0, 0x43, 0xc4201f8a00, 0x2a, 0xc4202396e0)
    /private/tmp/hugo-20170913-45374-38zfps/hugo-0.27.1/src/github.com/gohugoio/hugo/commands/import_jekyll.go:463 +0x731
github.com/gohugoio/hugo/commands.importFromJekyll.func1(0xc420551a00, 0x78, 0x1bb2660, 0xc4202160d0, 0x0, 0x0, 0xc4202397f0, 0x109143b)
    /private/tmp/hugo-20170913-45374-38zfps/hugo-0.27.1/src/github.com/gohugoio/hugo/commands/import_jekyll.go:131 +0x253
github.com/gohugoio/hugo/vendor/github.com/spf13/afero.walk(0x1bb4380, 0x1c2e820, 0xc420551a00, 0x78, 0x1bb2660, 0xc4202160d0, 0xc420066140, 0x0, 0x78)
    /private/tmp/hugo-20170913-45374-38zfps/hugo-0.27.1/src/github.com/gohugoio/hugo/vendor/github.com/spf13/afero/path.go:44 +0x81
github.com/gohugoio/hugo/vendor/github.com/spf13/afero.Walk(0x1bb4380, 0x1c2e820, 0xc420551a00, 0x78, 0xc420066140, 0x0, 0x0)
    /private/tmp/hugo-20170913-45374-38zfps/hugo-0.27.1/src/github.com/gohugoio/hugo/vendor/github.com/spf13/afero/path.go:107 +0xef
github.com/gohugoio/hugo/helpers.SymbolicWalk(0x1bb4380, 0x1c2e820, 0xc4204075e0, 0x4d, 0xc420066140, 0x0, 0x0)
    /private/tmp/hugo-20170913-45374-38zfps/hugo-0.27.1/src/github.com/gohugoio/hugo/helpers/path.go:514 +0x2fe
github.com/gohugoio/hugo/commands.importFromJekyll(0x1c04e60, 0xc42000c600, 0x2, 0x2, 0x0, 0x0)
    /private/tmp/hugo-20170913-45374-38zfps/hugo-0.27.1/src/github.com/gohugoio/hugo/commands/import_jekyll.go:136 +0x58a
github.com/gohugoio/hugo/vendor/github.com/spf13/cobra.(*Command).execute(0x1c04e60, 0xc42000c560, 0x2, 0x2, 0x1c04e60, 0xc42000c560)
    /private/tmp/hugo-20170913-45374-38zfps/hugo-0.27.1/src/github.com/gohugoio/hugo/vendor/github.com/spf13/cobra/command.go:649 +0x456
github.com/gohugoio/hugo/vendor/github.com/spf13/cobra.(*Command).ExecuteC(0x1c049e0, 0x1805170, 0x17c6bb6, 0x5)
    /private/tmp/hugo-20170913-45374-38zfps/hugo-0.27.1/src/github.com/gohugoio/hugo/vendor/github.com/spf13/cobra/command.go:728 +0x2fe
github.com/gohugoio/hugo/commands.Execute()
    /private/tmp/hugo-20170913-45374-38zfps/hugo-0.27.1/src/github.com/gohugoio/hugo/commands/hugo.go:173 +0x60
main.main()
    /private/tmp/hugo-20170913-45374-38zfps/hugo-0.27.1/src/github.com/gohugoio/hugo/main.go:27 +0x36

The output shows that only the first post was converted. Since the stack trace points at replaceImageTag, the problem is probably this image tag, found in the second post:

{% img center 208 58 http://japanese.stackexchange.com/users/flair/24.png "profile for Nate Glenn at Japanese Language and Usage, Q&A for students, teachers, and linguists wanting to discuss the finer points of the Japanese language" %}

garfieldnate commented 7 years ago

It looks like I put the width and height in the wrong place in this image tag, so Jekyll interprets them as classes. That is not what I intended, but it is still a legal image tag.

The problem is that the importer's regex assumes only one class, but Jekyll's image syntax allows multiple classes. The regex does not match my image tag as written, but it does match once the two numbers are removed from the tag.

garfieldnate commented 7 years ago

Another problem with this regex is that it assumes class names consist only of letters, which is often not true: class names commonly contain dashes or numbers, and apparently they can also contain dots and colons. I don't think it's useful to be strict about this anyway, because when using the converter I just want a conversion, not HTML syntax checking.
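
To make the failure concrete, here is a minimal Go sketch. The pattern is an assumption rebuilt from the description above (a single optional class made of letters only), not a verified copy of the importer's regex, and imgParts is a hypothetical helper. When the pattern does not match, FindStringSubmatch returns nil, so indexing the result without a check would produce an "index out of range" panic like the one in the trace.

```go
package main

import (
	"fmt"
	"regexp"
)

// singleClassImg is an ASSUMED stand-in for the importer's pattern, rebuilt
// from the description in this thread: one optional class, letters only.
var singleClassImg = regexp.MustCompile(`{%\s+img\s*(\p{L}*)\s+([\S]*/[\S]+)\s+(\d*)\s*(\d*)\s*(.*?)\s*%}`)

// imgParts (hypothetical helper) returns the capture groups, or ok=false
// when the tag does not match, so callers never index a nil slice.
func imgParts(tag string) (parts []string, ok bool) {
	parts = singleClassImg.FindStringSubmatch(tag)
	return parts, parts != nil
}

func main() {
	tags := []string{
		// Width and height placed before the URL, as in the reported tag.
		`{% img center 208 58 http://japanese.stackexchange.com/users/flair/24.png "profile for Nate Glenn" %}`,
		// The same tag with the two numbers removed.
		`{% img center http://japanese.stackexchange.com/users/flair/24.png "profile for Nate Glenn" %}`,
	}
	for _, tag := range tags {
		if parts, ok := imgParts(tag); ok {
			fmt.Printf("match: class=%q src=%q\n", parts[1], parts[2])
		} else {
			fmt.Println("no match: indexing parts[1] here would panic")
		}
	}
}
```

Whatever pattern is used, the nil check means an unrecognized tag can be left untouched instead of aborting the whole import.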

garfieldnate commented 7 years ago

To fix this specific problem, replace the regex on line 584 of import_jekyll.go with the following regex:

{%\s+img\s*((?:[^\s/]+\s*)*)\s+([\S]*\/[\S]+)\s+(\d*)\s*(\d*)\s*(.*?)\s*%}

The class-matching portion matches zero or more words (runs of non-whitespace characters separated by whitespace). I also disallowed / in class names so that the class group cannot match into the URL, which prevents catastrophic backtracking.
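
As a quick check of the proposed pattern, the sketch below compiles it with Go's regexp package and splits the class group into individual names. The imgTag type and parseImgTag function are hypothetical names introduced for illustration, not part of Hugo.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// multiClassImg is the pattern proposed above, copied verbatim.
var multiClassImg = regexp.MustCompile(`{%\s+img\s*((?:[^\s/]+\s*)*)\s+([\S]*\/[\S]+)\s+(\d*)\s*(\d*)\s*(.*?)\s*%}`)

// imgTag (hypothetical type) holds the pieces captured from one tag.
type imgTag struct {
	Classes       []string
	Src           string
	Width, Height string
	Rest          string // unparsed title/alt text, quotes included
}

// parseImgTag (hypothetical helper) reports ok=false instead of panicking
// when the tag does not match the pattern.
func parseImgTag(tag string) (imgTag, bool) {
	m := multiClassImg.FindStringSubmatch(tag)
	if m == nil {
		return imgTag{}, false
	}
	return imgTag{
		Classes: strings.Fields(m[1]), // split the class group on whitespace
		Src:     m[2],
		Width:   m[3],
		Height:  m[4],
		Rest:    m[5],
	}, true
}

func main() {
	tag := `{% img center 208 58 http://japanese.stackexchange.com/users/flair/24.png "profile for Nate Glenn" %}`
	if t, ok := parseImgTag(tag); ok {
		fmt.Printf("classes=%q src=%q width=%q height=%q rest=%q\n",
			t.Classes, t.Src, t.Width, t.Height, t.Rest)
	} else {
		fmt.Println("tag not recognized")
	}
}
```

For the tag from this report, the class group captures center, 208 and 58, and the width and height groups stay empty because the numbers come before the URL.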

I have other doubts about this section of code, though. A URL doesn't need to contain a slash if the picture is in the same directory. Also, the parsing of the title and alt text is overly simplistic: the text could contain several apostrophes (don't, didn't, etc.), and the author could be using double quotes instead of single quotes. A generic split is going to run into problems.
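
One possible way to handle the title and alt text without a generic split is a small scanner, sketched below. It rests on assumptions that are mine, not Jekyll's: the title and alt text are each wrapped in matching single or double quotes, and a closing quote only counts when it is followed by whitespace or the end of the text. quotedFields and titleAndAlt are hypothetical names.

```go
package main

import (
	"fmt"
	"strings"
)

// quotedFields (hypothetical helper) extracts leading quoted strings.
// ASSUMPTION: a quote character closes a field only when it is followed
// by whitespace or the end of the input, so "don't" stays in one piece.
func quotedFields(s string) []string {
	var fields []string
	s = strings.TrimSpace(s)
	for len(s) > 0 {
		q := s[0]
		if q != '"' && q != '\'' {
			break // not a quoted field; stop scanning
		}
		end := -1
		for j := 1; j < len(s); j++ {
			if s[j] == q && (j+1 == len(s) || s[j+1] == ' ' || s[j+1] == '\t') {
				end = j
				break
			}
		}
		if end < 0 {
			break // unterminated quote; leave the rest alone
		}
		fields = append(fields, s[1:end])
		s = strings.TrimSpace(s[end+1:])
	}
	return fields
}

// titleAndAlt (hypothetical helper) treats the first quoted field as the
// title and the second as the alt text. Text with no quoted fields is
// returned whole as the title.
func titleAndAlt(rest string) (title, alt string) {
	fields := quotedFields(rest)
	switch len(fields) {
	case 0:
		return strings.TrimSpace(rest), ""
	case 1:
		return fields[0], ""
	default:
		return fields[0], fields[1]
	}
}

func main() {
	title, alt := titleAndAlt(`'don't stop' "it didn't"`)
	fmt.Printf("title=%q alt=%q\n", title, alt)
}
```

This still misreads a field such as 'the dogs' bowl', where an apostrophe is followed by a space, so it is a starting point rather than a complete parser.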

stale[bot] commented 6 years ago

This issue has been automatically marked as stale because it has not had recent activity. The resources of the Hugo team are limited, and so we are asking for your help. If this is a bug and you can still reproduce this error on the master branch, please reply with all of the information you have about it in order to keep the issue open. If this is a feature request, and you feel that it is still relevant and valuable, please tell us why. This issue will automatically be closed in the near future if no further activity occurs. Thank you for all your contributions.

github-actions[bot] commented 2 years ago

This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.