Open numist opened 1 year ago
Doing this on every Jekyll build might be a bit much, especially if it also checks external links for liveness; maybe this is a good candidate for a script? Then it could:
Restricting to `Jekyll.env == "development"` (which is read from `JEKYLL_ENV`) is probably good enough.
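For reference, a minimal sketch of that gate outside of Jekyll. It assumes `Jekyll.env` resolves to the value of `JEKYLL_ENV` and falls back to `"development"` when the variable is unset; `development_build?` is a hypothetical helper name, not part of Jekyll.

```ruby
# Hypothetical helper (not a Jekyll API): mirrors the assumed behaviour of
# Jekyll.env, i.e. JEKYLL_ENV if set, otherwise "development".
# `env` is injectable so the check can be exercised without touching ENV.
def development_build?(env = ENV)
  (env["JEKYLL_ENV"] || "development") == "development"
end
```

With this, a plain `jekyll build` would count as a development build, while `JEKYLL_ENV=production jekyll build` would skip the title fetching.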
Possible starting point:
```ruby
require 'nokogiri'
require 'open-uri'

# Fetch the <title> of a remote page; returns nil on any failure.
def fetch_title(url)
  doc = Nokogiri::HTML(URI.open(url))
  doc.at('title')&.text&.strip
rescue StandardError
  nil
end

# :post_read fires once pages are loaded; at :after_reset site.pages is still empty.
Jekyll::Hooks.register :site, :post_read do |site|
  # Run only in the development environment (`return` is not valid inside a hook block)
  next unless Jekyll.env == "development"

  site.pages.each do |page|
    next unless page.path.end_with?(".md")

    filename = File.join(site.source, page.path)
    content = File.read(filename)
    updated_content = content.gsub(/\[([^\]]+)\]\((http[^)"\s]+)\)/) do |match|
      text = $1
      url = $2
      title = fetch_title(url)
      title ? "[#{text}](#{url} \"#{title}\")" : match
    end

    # Write changes back to disk only if there were changes
    File.write(filename, updated_content) if content != updated_content
  end
end
```
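One way to make the rewriting step testable without network access is to pull it out of the hook and pass the title lookup in as a block. This is only a sketch: `add_link_titles` and `LINK_WITHOUT_TITLE` are hypothetical names, and swapping double quotes for single quotes in the title is an assumption about how to keep the generated Markdown valid.

```ruby
# Matches [text](http...) links that do not already carry a title.
LINK_WITHOUT_TITLE = /\[([^\]]+)\]\((http[^)"\s]+)\)/

# Hypothetical helper: rewrites untitled external links in a Markdown string.
# The block receives each URL and returns a title, or nil to leave the link alone.
def add_link_titles(markdown)
  markdown.gsub(LINK_WITHOUT_TITLE) do |match|
    text = $1
    url = $2
    title = yield(url)
    next match if title.nil? || title.strip.empty?

    # Collapse whitespace and avoid double quotes, which would end the title early.
    safe = title.strip.gsub(/\s+/, " ").tr('"', "'")
    %([#{text}](#{url} "#{safe}"))
  end
end
```

Inside the hook this could be called as `add_link_titles(content) { |url| fetch_title(url) }`, while a test can pass a block that returns canned titles.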
JavaScript can't cross origins, but Ruby sure can.
Populating `a[title]` at build time with the titles of the link targets would significantly help with accessibility. Bonus points if it updates the Markdown (only in dev?) so the results can be committed.
This infrastructure could also be a good foundation for validating external links in our never-ending battle against bit rot.
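As a rough illustration of the link-validation idea, the sketch below collects external link targets from a Markdown string and probes each with a `HEAD` request using only the standard library. The names `external_links` and `link_alive?`, the 5-second timeout, and treating any 2xx or 3xx response as "alive" are all assumptions; some servers reject `HEAD`, so a real checker would probably need a `GET` fallback.

```ruby
require "net/http"
require "uri"

# Hypothetical helper: distinct http(s) link targets found in a Markdown string.
def external_links(markdown)
  markdown.scan(/\[[^\]]+\]\((https?:[^)\s]+)/).flatten.uniq
end

# Hypothetical helper: true when the URL answers a HEAD request with 2xx or 3xx.
# Unparseable URLs, network errors, and timeouts all count as dead links.
def link_alive?(url, timeout: 5)
  uri = URI.parse(url)
  response = Net::HTTP.start(uri.host, uri.port,
                             use_ssl: uri.scheme == "https",
                             open_timeout: timeout,
                             read_timeout: timeout) do |http|
    http.head(uri.request_uri)
  end
  response.is_a?(Net::HTTPSuccess) || response.is_a?(Net::HTTPRedirection)
rescue StandardError
  false
end
```

A script could then report `external_links(File.read(path)).reject { |url| link_alive?(url) }` for each Markdown file.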