numist / numi.st

My public memory bank.
https://numi.st/

Fetch link titles during build to populate `a[title]` #63

Open numist opened 1 year ago

numist commented 1 year ago

JavaScript can't cross origins, but Ruby sure can.

Populating a[title] at build-time with the titles of the link targets would significantly help with accessibility. Bonus points if it updates the Markdown (only in dev?) so the results can be committed.

This infrastructure could also be a good foundation for validating external links in our neverending battle against bit rot.
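A rough sketch of what that link check might look like, using only the standard library. The helper names `external_links` and `link_alive?` are made up for illustration, and treating any 2xx or 3xx answer to a HEAD request as "alive" is an assumption, not a settled policy:

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper: collect the distinct external URLs from Markdown inline links.
def external_links(markdown)
  markdown.scan(/\[[^\]]+\]\((https?:[^)"\s]+)/).flatten.uniq
end

# Hypothetical helper: true if the URL answers a HEAD request with a 2xx or 3xx status.
# Some servers reject HEAD, so a false result is a hint to recheck, not proof of rot.
def link_alive?(url, timeout: 5)
  uri = URI.parse(url)
  response = Net::HTTP.start(uri.host, uri.port,
                             use_ssl: uri.scheme == 'https',
                             open_timeout: timeout, read_timeout: timeout) do |http|
    http.head(uri.request_uri)
  end
  response.is_a?(Net::HTTPSuccess) || response.is_a?(Net::HTTPRedirection)
rescue StandardError
  # DNS failures, timeouts, and malformed URLs all count as "not alive"
  false
end
```

Something like `external_links(File.read(path)).reject { |url| link_alive?(url) }` would then list the suspect links in one file.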

numist commented 1 year ago

Doing this on every Jekyll build might be a bit much, especially if it also checks external links for liveness; maybe this is a good candidate for a script? Then it could:

numist commented 4 months ago

Restricting to Jekyll.env == "development" (which is read from JEKYLL_ENV) is probably good enough.
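For reference, Jekyll.env falls back to "development" when JEKYLL_ENV is unset, so a plain local build would run the fetch and a build with JEKYLL_ENV=production would skip it. A small sketch of that lookup, where `build_env` and `fetch_titles?` are hypothetical stand-ins so it runs without Jekyll loaded:

```ruby
# Hypothetical stand-in for Jekyll.env: read JEKYLL_ENV, defaulting to "development".
def build_env(env = ENV)
  env["JEKYLL_ENV"] || "development"
end

# Hypothetical guard mirroring the proposed check.
def fetch_titles?(env = ENV)
  build_env(env) == "development"
end
```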

Possible starting point:

require 'nokogiri'
require 'open-uri'

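# Use :post_read, since site.pages is still empty when :after_reset fires
Jekyll::Hooks.register :site, :post_read do |site|
  # `next` exits the hook block; a bare `return` raises LocalJumpError here
  next unless Jekyll.env == "development"  # Run only in development environment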

  site.pages.each do |page|
    next unless page.path.end_with?(".md")

    filename = File.join(site.source, page.path)
    content = File.read(filename)
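    # Match inline links with an http(s) target and no existing title
    updated_content = content.gsub(/\[([^\]]+)\]\((http[^)"\s]+)\)/) do |match|
      text = $1
      url = $2
      # Escape double quotes so the fetched title cannot end the Markdown title early
      title = fetch_title(url)&.gsub('"') { '\"' }
      title ? "[#{text}](#{url} \"#{title}\")" : match
    end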

    # Write changes back to the disk only if there were changes
    File.write(filename, updated_content) if content != updated_content
  end
end

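def fetch_title(url)
  # Time out quickly so one slow host cannot stall the build
  doc = Nokogiri::HTML(URI.open(url, open_timeout: 5, read_timeout: 5))
  # Safe navigation on every call: a page without <title> yields nil instead of raising
  title = doc.at('title')&.text&.gsub(/\s+/, ' ')&.strip
  title unless title.nil? || title.empty?
rescue StandardError
  # Network errors, HTTP errors, and malformed URLs all mean "no title"
  nil
end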
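
To see what the substitution does without touching the network, here is the same pattern run against a sample string. `add_titles` is a hypothetical wrapper, and the lookup table is made up to stand in for fetch_title:

```ruby
# Hypothetical helper: add titles to untitled http(s) links, asking the block for each title.
def add_titles(markdown)
  markdown.gsub(/\[([^\]]+)\]\((http[^)"\s]+)\)/) do |match|
    text = $1
    url = $2
    title = yield(url)
    # Leave the link untouched when no title is available
    title ? "[#{text}](#{url} \"#{title}\")" : match
  end
end

titles = { "https://jekyllrb.com/" => "Jekyll" }  # made-up table standing in for fetch_title
sample = 'See [Jekyll](https://jekyllrb.com/), [dead](https://example.invalid/) and [done](https://numi.st/ "numi.st").'
puts add_titles(sample) { |url| titles[url] }
# prints: See [Jekyll](https://jekyllrb.com/ "Jekyll"), [dead](https://example.invalid/) and [done](https://numi.st/ "numi.st").
```

Links that already carry a title never match the pattern, so rerunning it over a rewritten file leaves them alone. Only the links whose fetch failed are retried.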