jekyll / jekyll-import

The "jekyll import" command for importing from various blogs to Jekyll format.
https://import.jekyllrb.com
MIT License

Fix Tumblr import #548

Open dmdeller opened 3 weeks ago

I was trying to run the Tumblr importer under Ruby 3 and the latest Jekyll, and I ran into two problems when using the option --rewrite_urls true:

  1. The importer called URI.encode(), which was removed in Ruby 3.0, so the call raises an uncaught exception. This PR replaces it with URI::Parser.new.escape().

  2. The generated redirect URLs were incorrect, so following one of the old Tumblr-style links resulted in a 404.

(Neither problem occurs when not using that option. I needed that option, though.)
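
For reference, the first fix can be sketched roughly as follows. This is a minimal illustration, not the importer's actual code: `escape_url` is a hypothetical helper name, and it names `URI::RFC2396_Parser` explicitly on the assumption that this is the class behind `URI::Parser` on the Ruby versions in question.

```ruby
require "uri"

# Hypothetical helper (not part of jekyll-import): percent-encodes characters
# that are unsafe in a URL, standing in for the removed URI.encode.
# Assumption: URI::RFC2396_Parser is the parser class that provides #escape.
def escape_url(url)
  URI::RFC2396_Parser.new.escape(url)
end

puts escape_url("/post/123/my title") # prints "/post/123/my%20title"
```

Reserved characters such as `/` are left alone, so an already well-formed path passes through unchanged.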

It was generating URLs like this: /2024/08/22/2019-04-03-how-i-handle-errors-in-ios-apps.html (Today's date is incorrectly used as the directory path, and then the actual original date of the post is incorrectly inserted into the slug)

It should have been generating URLs like this: /2019/04/03/how-i-handle-errors-in-ios-apps.html (Correctly matches where the imported post ends up after jekyll build)

The second issue was somewhat trickier to fix. I'm not familiar with the Jekyll::Document API. I couldn't figure out why it returns the wrong result. Instead, I resorted to just assembling the string manually (which required passing a little more data to rewrite_urls_and_redirects() than was being done previously).
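
A rough sketch of the manual assembly described above, assuming the date-based /YYYY/MM/DD/slug.html layout seen in the example URLs. `redirect_target` is a hypothetical name, not the method changed in this PR, and the date and slug are assumed to be passed in separately.

```ruby
require "date"

# Hypothetical helper: builds the path where an imported post is expected to
# land after `jekyll build`, assuming a /YYYY/MM/DD/slug.html permalink layout.
# The post's original date and its slug are supplied separately, so the date
# never ends up inside the slug.
def redirect_target(date, slug)
  format("/%s/%s.html", date.strftime("%Y/%m/%d"), slug)
end

puts redirect_target(Date.new(2019, 4, 3), "how-i-handle-errors-in-ios-apps")
# prints "/2019/04/03/how-i-handle-errors-in-ios-apps.html"
```

A site configured with a different permalink style would need a different format string, which is one drawback of building the path by hand.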

I'm open to feedback on how to improve this further. Thanks for taking a look!