jekyll / jekyll-sitemap

Jekyll plugin to silently generate a sitemaps.org compliant sitemap for your Jekyll site
http://rubygems.org/gems/jekyll-sitemap
MIT License

Liquid Exception: Liquid error (line 13): invalid byte sequence in UTF-8 in sitemap.xml #231

Closed kunalnagar closed 5 years ago

kunalnagar commented 5 years ago

When I try to build without the jekyll-sitemap extension, it works fine.

$ jekyll server --livereload -t
Configuration file: C:/Users/Kunal Nagar/Documents/Code/personal-main/_config.yml
            Source: C:/Users/Kunal Nagar/Documents/Code/personal-main
       Destination: ./public
 Incremental build: disabled. Enable with --incremental
      Generating...
       Jekyll Feed: Generating feed for posts
  Liquid Exception: Liquid error (line 13): invalid byte sequence in UTF-8 in sitemap.xml
C:/ruby-2.3.3-x64-mingw32/lib/ruby/gems/2.3.0/gems/addressable-2.5.2/lib/addressable/uri.rb:440:in `gsub': Liquid error (line 13): invalid byte sequence in UTF-8 (Liquid::ArgumentError)
        from C:/ruby-2.3.3-x64-mingw32/lib/ruby/gems/2.3.0/gems/addressable-2.5.2/lib/addressable/uri.rb:440:in `unencode'
        from C:/ruby-2.3.3-x64-mingw32/lib/ruby/gems/2.3.0/gems/addressable-2.5.2/lib/addressable/uri.rb:536:in `normalize_component'
        from C:/ruby-2.3.3-x64-mingw32/lib/ruby/gems/2.3.0/gems/addressable-2.5.2/lib/addressable/uri.rb:1502:in `block in normalized_path'
        from C:/ruby-2.3.3-x64-mingw32/lib/ruby/gems/2.3.0/gems/addressable-2.5.2/lib/addressable/uri.rb:1501:in `map'
        from C:/ruby-2.3.3-x64-mingw32/lib/ruby/gems/2.3.0/gems/addressable-2.5.2/lib/addressable/uri.rb:1501:in `normalized_path'
        from C:/ruby-2.3.3-x64-mingw32/lib/ruby/gems/2.3.0/gems/addressable-2.5.2/lib/addressable/uri.rb:2134:in `normalize'
        from C:/ruby-2.3.3-x64-mingw32/lib/ruby/gems/2.3.0/gems/jekyll-3.8.5/lib/jekyll/filters/url_filters.rb:36:in `relative_url'
        from C:/ruby-2.3.3-x64-mingw32/lib/ruby/gems/2.3.0/gems/jekyll-3.8.5/lib/jekyll/filters/url_filters.rb:18:in `absolute_url'
        from C:/ruby-2.3.3-x64-mingw32/lib/ruby/gems/2.3.0/gems/liquid-4.0.1/lib/liquid/strainer.rb:56:in `invoke'
        from C:/ruby-2.3.3-x64-mingw32/lib/ruby/gems/2.3.0/gems/liquid-4.0.1/lib/liquid/context.rb:86:in `invoke'
        from C:/ruby-2.3.3-x64-mingw32/lib/ruby/gems/2.3.0/gems/liquid-4.0.1/lib/liquid/variable.rb:84:in `block in render'
        from C:/ruby-2.3.3-x64-mingw32/lib/ruby/gems/2.3.0/gems/liquid-4.0.1/lib/liquid/variable.rb:82:in `each'
        from C:/ruby-2.3.3-x64-mingw32/lib/ruby/gems/2.3.0/gems/liquid-4.0.1/lib/liquid/variable.rb:82:in `inject'
        from C:/ruby-2.3.3-x64-mingw32/lib/ruby/gems/2.3.0/gems/liquid-4.0.1/lib/liquid/variable.rb:82:in `render'
        from C:/ruby-2.3.3-x64-mingw32/lib/ruby/gems/2.3.0/gems/liquid-4.0.1/lib/liquid/block_body.rb:102:in `render_node_to_output'
        from C:/ruby-2.3.3-x64-mingw32/lib/ruby/gems/2.3.0/gems/liquid-4.0.1/lib/liquid/block_body.rb:80:in `render'
        from C:/ruby-2.3.3-x64-mingw32/lib/ruby/gems/2.3.0/gems/liquid-4.0.1/lib/liquid/tags/for.rb:161:in `block (2 levels) in render_segment'
        from C:/ruby-2.3.3-x64-mingw32/lib/ruby/gems/2.3.0/gems/liquid-4.0.1/lib/liquid/tags/for.rb:159:in `each'
        from C:/ruby-2.3.3-x64-mingw32/lib/ruby/gems/2.3.0/gems/liquid-4.0.1/lib/liquid/tags/for.rb:159:in `block in render_segment'
        from C:/ruby-2.3.3-x64-mingw32/lib/ruby/gems/2.3.0/gems/liquid-4.0.1/lib/liquid/context.rb:123:in `stack'
        from C:/ruby-2.3.3-x64-mingw32/lib/ruby/gems/2.3.0/gems/liquid-4.0.1/lib/liquid/tags/for.rb:151:in `render_segment'
        from C:/ruby-2.3.3-x64-mingw32/lib/ruby/gems/2.3.0/gems/liquid-4.0.1/lib/liquid/tags/for.rb:80:in `render'
        from C:/ruby-2.3.3-x64-mingw32/lib/ruby/gems/2.3.0/gems/liquid-4.0.1/lib/liquid/block_body.rb:102:in `render_node_to_output'
        from C:/ruby-2.3.3-x64-mingw32/lib/ruby/gems/2.3.0/gems/liquid-4.0.1/lib/liquid/block_body.rb:82:in `render'
        from C:/ruby-2.3.3-x64-mingw32/lib/ruby/gems/2.3.0/gems/liquid-4.0.1/lib/liquid/template.rb:208:in `block in render'
        from C:/ruby-2.3.3-x64-mingw32/lib/ruby/gems/2.3.0/gems/liquid-4.0.1/lib/liquid/template.rb:242:in `with_profiling'
        from C:/ruby-2.3.3-x64-mingw32/lib/ruby/gems/2.3.0/gems/liquid-4.0.1/lib/liquid/template.rb:207:in `render'
        from C:/ruby-2.3.3-x64-mingw32/lib/ruby/gems/2.3.0/gems/liquid-4.0.1/lib/liquid/template.rb:220:in `render!'
        from C:/ruby-2.3.3-x64-mingw32/lib/ruby/gems/2.3.0/gems/jekyll-3.8.5/lib/jekyll/liquid_renderer/file.rb:30:in `block (2 levels) in render!'
        from C:/ruby-2.3.3-x64-mingw32/lib/ruby/gems/2.3.0/gems/jekyll-3.8.5/lib/jekyll/liquid_renderer/file.rb:42:in `measure_bytes'
        from C:/ruby-2.3.3-x64-mingw32/lib/ruby/gems/2.3.0/gems/jekyll-3.8.5/lib/jekyll/liquid_renderer/file.rb:29:in `block in render!'
        from C:/ruby-2.3.3-x64-mingw32/lib/ruby/gems/2.3.0/gems/jekyll-3.8.5/lib/jekyll/liquid_renderer/file.rb:49:in `measure_time'
        from C:/ruby-2.3.3-x64-mingw32/lib/ruby/gems/2.3.0/gems/jekyll-3.8.5/lib/jekyll/liquid_renderer/file.rb:28:in `render!'
        from C:/ruby-2.3.3-x64-mingw32/lib/ruby/gems/2.3.0/gems/jekyll-3.8.5/lib/jekyll/renderer.rb:126:in `render_liquid'
        from C:/ruby-2.3.3-x64-mingw32/lib/ruby/gems/2.3.0/gems/jekyll-3.8.5/lib/jekyll/renderer.rb:79:in `render_document'
        from C:/ruby-2.3.3-x64-mingw32/lib/ruby/gems/2.3.0/gems/jekyll-3.8.5/lib/jekyll/renderer.rb:62:in `run'
        from C:/ruby-2.3.3-x64-mingw32/lib/ruby/gems/2.3.0/gems/jekyll-3.8.5/lib/jekyll/site.rb:479:in `render_regenerated'
        from C:/ruby-2.3.3-x64-mingw32/lib/ruby/gems/2.3.0/gems/jekyll-3.8.5/lib/jekyll/site.rb:472:in `block in render_pages'
        from C:/ruby-2.3.3-x64-mingw32/lib/ruby/gems/2.3.0/gems/jekyll-3.8.5/lib/jekyll/site.rb:471:in `each'
        from C:/ruby-2.3.3-x64-mingw32/lib/ruby/gems/2.3.0/gems/jekyll-3.8.5/lib/jekyll/site.rb:471:in `render_pages'
        from C:/ruby-2.3.3-x64-mingw32/lib/ruby/gems/2.3.0/gems/jekyll-3.8.5/lib/jekyll/site.rb:192:in `render'
        from C:/ruby-2.3.3-x64-mingw32/lib/ruby/gems/2.3.0/gems/jekyll-3.8.5/lib/jekyll/site.rb:71:in `process'
        from C:/ruby-2.3.3-x64-mingw32/lib/ruby/gems/2.3.0/gems/jekyll-3.8.5/lib/jekyll/command.rb:28:in `process_site'
        from C:/ruby-2.3.3-x64-mingw32/lib/ruby/gems/2.3.0/gems/jekyll-3.8.5/lib/jekyll/commands/build.rb:65:in `build'
        from C:/ruby-2.3.3-x64-mingw32/lib/ruby/gems/2.3.0/gems/jekyll-3.8.5/lib/jekyll/commands/build.rb:36:in `process'
        from C:/ruby-2.3.3-x64-mingw32/lib/ruby/gems/2.3.0/gems/jekyll-3.8.5/lib/jekyll/commands/serve.rb:93:in `block in start'
        from C:/ruby-2.3.3-x64-mingw32/lib/ruby/gems/2.3.0/gems/jekyll-3.8.5/lib/jekyll/commands/serve.rb:93:in `each'
        from C:/ruby-2.3.3-x64-mingw32/lib/ruby/gems/2.3.0/gems/jekyll-3.8.5/lib/jekyll/commands/serve.rb:93:in `start'
        from C:/ruby-2.3.3-x64-mingw32/lib/ruby/gems/2.3.0/gems/jekyll-3.8.5/lib/jekyll/commands/serve.rb:75:in `block (2 levels) in init_with_program'
        from C:/ruby-2.3.3-x64-mingw32/lib/ruby/gems/2.3.0/gems/mercenary-0.3.6/lib/mercenary/command.rb:220:in `block in execute'
        from C:/ruby-2.3.3-x64-mingw32/lib/ruby/gems/2.3.0/gems/mercenary-0.3.6/lib/mercenary/command.rb:220:in `each'
        from C:/ruby-2.3.3-x64-mingw32/lib/ruby/gems/2.3.0/gems/mercenary-0.3.6/lib/mercenary/command.rb:220:in `execute'
        from C:/ruby-2.3.3-x64-mingw32/lib/ruby/gems/2.3.0/gems/mercenary-0.3.6/lib/mercenary/program.rb:42:in `go'
        from C:/ruby-2.3.3-x64-mingw32/lib/ruby/gems/2.3.0/gems/mercenary-0.3.6/lib/mercenary.rb:19:in `program'
        from C:/ruby-2.3.3-x64-mingw32/lib/ruby/gems/2.3.0/gems/jekyll-3.8.5/exe/jekyll:15:in `<top (required)>'
        from C:/ruby-2.3.3-x64-mingw32/bin/jekyll:22:in `load'
        from C:/ruby-2.3.3-x64-mingw32/bin/jekyll:22:in `<main>'
ashmaroli commented 5 years ago

@kunalnagar We'll need a sample repository to reproduce this issue.

kunalnagar commented 5 years ago

@ashmaroli what do you need in the sample repo?

The stack trace above is from my private repo on Bitbucket that contains my website code.

ashmaroli commented 5 years ago

> what do you need in the sample repo?

The test repo would contain a Gemfile, a minimal config file and all folders and files named with non-latin characters.

kunalnagar commented 5 years ago

@ashmaroli as you can imagine that's a little tedious, so I generated a tree structure for you and I've given you the Gemfile and config. Let me know if this helps.

kunalnagar.in on  master via ⬢ v9.11.1
➜ tree -I 'node_modules|public|projects'
.
├── 404.html
├── Gemfile
├── Gemfile.lock
├── Todo.todo
├── _config.yml
├── _includes
│   ├── footer.html
│   └── header.html
├── _layouts
│   ├── default.html
│   ├── post.html
│   └── project.html
├── _posts
│   ├── 2015-01-08-deploying-laravel-5-on-godaddy-shared-hosting.md
│   ├── 2015-02-11-fix-google-drive-dark-mode-icon.md
│   ├── 2015-02-13-setup-multiple-ssh-keys-mac.md
│   ├── 2015-04-14-fix-rdsrv-malware-vsearch-downlite.md
│   ├── 2015-04-14-install-laravel-5-on-os-x.md
│   ├── 2015-04-21-ionic-push-notifications.md
│   ├── 2015-04-22-custom-404-pro.md
│   ├── 2015-10-18-chrome-extension-youtube-jukebox.md
│   ├── 2016-02-14-mac-bartender-os-x-menu-bar-minimalism.md
│   ├── 2016-08-24-tripmode-mobile-data-savior.md
│   ├── 2016-09-03-github-style-copy-path-zeroclipboard.md
│   ├── 2016-09-10-shared-modules-using-requirejs.md
│   ├── 2016-09-17-rewrite-wordpress-multisites.md
│   ├── 2017-06-24-act-fibernet-horrible-experience.md
│   ├── 2018-02-24-youtube-to-mp3-converter.md
│   ├── 2018-10-13-block-ads-skype.md
│   ├── 2018-10-19-sync-1password-local.md
│   └── 2018-11-06-setup-codeship-digital-ocean.md
├── _projects
│   ├── 2015-05-01-viraltag-ios-mobile-app.md
│   ├── 2015-05-02-viraltag-single-post-scheduler.md
│   ├── 2015-05-03-viraltag-bulk-post-scheduler.md
│   ├── 2015-05-04-viraltag-post-collaboration.md
│   ├── 2015-05-05-viraltag-tapit-instagram-shops.md
│   ├── 2015-05-06-viraltag-ugc-galleries.md
│   ├── 2015-05-07-viraltag-website-redesign.md
│   ├── 2015-05-08-viraltag-chrome-extension.md
│   ├── 2016-11-01-scalegrid-website-redesign-01.md
│   ├── 2016-11-02-scalegrid-blog-redesign-01.md
│   └── 2016-11-03-scalegrid-dashboard-redesign-01.md
├── _sass
│   ├── _variables.scss
│   └── layouts
│       └── _layout.scss
├── _site
│   └── package-lock.json
├── assets
│   ├── css
│   │   └── base.scss
│   ├── downloads
│   │   ├── 2018-10-13-block-ads-skype
│   │   │   └── hosts-skype.txt
│   │   ├── Résumé\ -\ Kunal\ Nagar.pdf
│   │   └── siteblock.txt
│   └── img
│       ├── logo-simple.svg
│       └── post.png
├── blog
│   └── index.html
├── favicon.ico
├── foss.html
├── index.html
└── package-lock.json

13 directories, 53 files

Here's the Gemfile:

source "https://rubygems.org"

# Hello! This is where you manage which Jekyll version is used to run.
# When you want to use a different version, change it below, save the
# file and run `bundle install`. Run Jekyll with `bundle exec`, like so:
#
#     bundle exec jekyll serve
#
# This will help ensure the proper Jekyll version is running.
# Happy Jekylling!
gem "jekyll", "~> 3.8.3"

# This is the default theme for new Jekyll sites. You may change this to anything you like.
# gem "minima", "~> 2.0"

# If you want to use GitHub Pages, remove the "gem "jekyll"" above and
# uncomment the line below. To upgrade, run `bundle update github-pages`.
# gem "github-pages", group: :jekyll_plugins

# If you have any plugins, put them here!
group :jekyll_plugins do
  gem "jekyll-feed", "~> 0.6"
  gem 'jekyll-sitemap'
  gem 'jekyll-watch'
end

# Windows does not include zoneinfo files, so bundle the tzinfo-data gem
gem "tzinfo-data", platforms: [:mingw, :mswin, :x64_mingw, :jruby]

# Performance-booster for watching directories on Windows
gem "wdm", "~> 0.1.0" if Gem.win_platform?

And here's my config:

# Welcome to Jekyll!
#
# This config file is meant for settings that affect your whole blog, values
# which you are expected to set up once and rarely edit after that. If you find
# yourself editing this file very often, consider using Jekyll's data files
# feature for the data you need to update frequently.
#
# For technical reasons, this file is *NOT* reloaded automatically when you use
# 'bundle exec jekyll serve'. If you change this file, please restart the server process.

# Site settings
# These are used to personalize your new site. If you look in the HTML files,
# you will see them accessed via {{ site.title }}, {{ site.email }}, and so on.
# You can create any custom variable you would like, and they will be accessible
# in the templates via {{ site.myvariable }}.
title: Kunal Nagar
email: <REDACTED>
description: >- # this means to ignore newlines until "baseurl:"
  <REDACTED>
baseurl: "" # the subpath of your site, e.g. /blog
url: "https://kunalnagar.in" # the base hostname & protocol for your site, e.g. http://example.com
twitter_username: <REDACTED>
github_username:  kunalnagar

include: ['.htaccess']

permalink: pretty
destination: ./public

collections:
    projects:
        output: false

# Build settings
markdown: kramdown
# theme: minima
plugins:
  - jekyll-feed
  - jekyll-sitemap

sass:
    style: compressed

# Exclude from processing.
# The following items will not be processed, by default. Create a custom list
# to override the default setting.
# exclude:
#   - Gemfile
#   - Gemfile.lock
#   - node_modules
#   - vendor/bundle/
#   - vendor/cache/
#   - vendor/gems/
#   - vendor/ruby/
ashmaroli commented 5 years ago

@kunalnagar Thank you for posting the tree structure. From that, the culprit here is none other than assets/downloads/Résumé - Kunal Nagar.pdf.

Renaming that file to a simpler basename should resolve this issue for you. However, I'm not closing this ticket just yet because I'm not sure whether the root cause is within this plugin's template or within Jekyll itself.
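For anyone hunting for similar files in their own site, here is a minimal sketch that lists paths whose basename contains non-ASCII characters. The helper name `non_ascii_paths` is hypothetical and not part of Jekyll or jekyll-sitemap, and the check only looks at non-ASCII characters, not spaces.

```ruby
require "find"

# Hypothetical helper: returns every path under `root` whose basename
# contains at least one non-ASCII character (e.g. "Résumé.pdf").
def non_ascii_paths(root)
  Find.find(root).reject { |path| File.basename(path).ascii_only? }
end

# Usage: scan the current directory and print any candidates for renaming.
non_ascii_paths(".").each { |path| puts path } if $PROGRAM_NAME == __FILE__
```

Any path it prints is a candidate for renaming to a plain ASCII basename, as suggested above.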

kunalnagar commented 5 years ago

@ashmaroli I did a quick test: I cloned the repo on a Mac and it works without any issues. Not sure whether the issue is specific to Windows. Here are the versions on my Mac:

kunalnagar commented 5 years ago

@ashmaroli quick update: it looks like the resume's file name was the culprit on Windows. So here's where we're at: the original name with spaces and accented characters works fine on Mac but not on Windows, and if I rename the file to plain ASCII with underscores instead of spaces, it works fine everywhere.

I'm curious to know why that is - if you have any insight on this, please let me know. Thanks for all your help!

ashmaroli commented 5 years ago

> I'm curious to know why that is

I'm not entirely sure. But the reason could be that Résumé, encoded as #<Encoding:Windows-1252> on Windows, becomes R\xE9sum\xE9 when the string is forced to #<Encoding:UTF-8>, and that is not a valid UTF-8 byte sequence.
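A small sketch of that hypothesis in plain Ruby, outside Jekyll. It assumes the filename arrives as Windows-1252 bytes, which has not been confirmed for this issue; the variable names are illustrative only.

```ruby
# Raw bytes a Windows-1252 source would use for "Résumé"
# (é is the single byte 0xE9). Assumption: this is what Ruby receives.
raw = "R\xE9sum\xE9".b

# force_encoding only relabels the bytes; it does not convert them.
forced = raw.dup.force_encoding("UTF-8")
puts forced.valid_encoding?

# A regexp-based gsub on such a string raises ArgumentError, similar to
# the gsub call at the top of the stack trace.
error_message =
  begin
    forced.gsub(/%[0-9a-f]{2}/i, "")
    nil
  rescue ArgumentError => e
    e.message
  end
puts error_message

# Transcoding from the actual source encoding produces valid UTF-8 instead.
converted = raw.dup.force_encoding("Windows-1252").encode("UTF-8")
puts converted.valid_encoding?
```

The difference is between relabeling (`force_encoding`) and transcoding (`encode`): only the latter rewrites the byte 0xE9 into the two-byte UTF-8 form of é.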

kunalnagar commented 5 years ago

@ashmaroli interesting. So would you like to keep this issue open for further investigation?

ashmaroli commented 5 years ago

> would you like to keep this issue open..?

Yes, for input from the maintainers here. /cc @jekyll/plugin-core

jekyllbot commented 5 years ago

This issue has been automatically marked as stale because it has not been commented on for at least two months.

The resources of the Jekyll team are limited, and so we are asking for your help.

If this is a bug and you can still reproduce this error on the master branch, please reply with all of the information you have about it in order to keep the issue open.

If this is a feature request, please consider whether it can be accomplished in another way. If it cannot, please elaborate on why it is core to this project and why you feel more than 80% of users would find this beneficial.

This issue will automatically be closed in two months if no further activity occurs. Thank you for all your contributions.