rubensworks / ScholarMarkdown

A framework for writing markdown-based scholarly articles.
MIT License

the acronyms filter also replaces <img alt=""> #12

Open bjdmeest opened 6 years ago

bjdmeest commented 6 years ago

If I have, e.g.

UI,User Interface

in acronyms.csv, and somewhere add

<img src="img/ui.jpg" alt="The UI">

this results in very nasty HTML

<img src="img/ui.jpg" alt="The <span class='abbreviation' title='User Interface'>UI</span>&#8221; />

Are you saying you're not parsing the HTML, but doing regular expressions?
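For illustration, here is a minimal sketch of how a regex-only substitution over the rendered page could produce output like the above. It assumes the filter treats the HTML as one plain string; `naive_acronyms` is a hypothetical stand-in, not the project's actual filter, and the `abbreviation,full` header row in the CSV is an assumption as well.

```ruby
require 'csv'

# Assumed layout of acronyms.csv: a header row, then one acronym per line.
ACRONYMS_CSV = <<~ROWS
  abbreviation,full
  UI,User Interface
ROWS

# Hypothetical stand-in for a regex-only filter: it sees the page as a plain
# string, so it cannot tell element text from attribute values.
def naive_acronyms(html, csv_text)
  CSV.parse(csv_text, headers: true).reduce(html) do |out, row|
    abbr = Regexp.escape(row['abbreviation'])
    out.gsub(/(?<![a-zA-Z0-9])#{abbr}(?![a-zA-Z0-9])/) do |match|
      %(<span class='abbreviation' title='#{row['full']}'>#{match}</span>)
    end
  end
end

# The span ends up inside the alt attribute, which breaks the tag.
puts naive_acronyms('<img src="img/ui.jpg" alt="The UI">', ACRONYMS_CSV)
```

The lowercase `ui` in the file name is left alone only because the match is case-sensitive; nothing in this approach knows it is inside a tag.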

bjdmeest commented 6 years ago

Suggestion: https://stackoverflow.com/questions/7234292/modifying-text-inside-html-nodes-nokogiri

rubensworks commented 6 years ago

of Man ALL IS LOŚ͖̩͇̗̪̏̈́T ALL I​S LOST the pon̷y he comes he c̶̮omes he comes the ich​or permeates all MY FACE MY FACE ᵒh god no NO NOO̼O​O NΘ stop the an​*̶͑̾̾​̅ͫ͏̙̤g͇̫͛͆̾ͫ̑͆l͖͉̗̩̳̟̍ͫͥͨe̠̅s ͎a̧͈͖r̽̾̈́͒͑e n​ot rè̑ͧ̌aͨl̘̝̙̃ͤ͂̾̆ ZA̡͊͠͝LGΌ ISͮ̂҉̯͈͕̹̘̱ TO͇̹̺ͅƝ̴ȳ̳ TH̘Ë͖́̉ ͠P̯͍̭O̚​N̐Y̡ H̸̡̪̯ͨ͊̽̅̾̎Ȩ̬̩̾͛ͪ̈́̀́͘ ̶̧̨̱̹̭̯ͧ̾ͬC̷̙̲̝͖ͭ̏ͥͮ͟Oͮ͏̮̪̝͍M̲̖͊̒ͪͩͬ̚̚͜Ȇ̴̟̟͙̞ͩ͌͝S̨̥̫͎̭ͯ̿̔̀ͅ

Yep, this is what happened, my bad...

Until this is fixed, you could lowercase the occurrences of acronyms that shouldn't be replaced (that's what I do).

bjdmeest commented 6 years ago

Give me a few minutes, I might do a pull request ;)

bjdmeest commented 6 years ago

:( Got close but not quite there yet (also incredibly inefficient)

require 'csv'

class ScholarAcronymFilter < Nanoc::Filter
  requires 'nokogiri'

  identifier :scholar_acronym
  type :text

  def run(content, params = {})
    doc = Nokogiri::HTML(content)
    acronyms = CSV.parse(params[:acronyms].raw_content, :headers => true)

    doc.traverse do |x|
      if x.text?
        acronyms.each do |row|
          x.inner_html = x.content.gsub %r{(?<=[^a-zA-Z0-9])#{row['abbreviation']}(?=[^a-zA-Z0-9])} do |match|
            %{<span class="abbreviation" title="#{row['full']}">#{row['abbreviation']}</span>}
          end
        end
      end
    end

    doc.at('body').children.to_html
  end
end

I'm not Ruby-savvy enough to fix this quickly. I'll manage for now, and I'm leaving this code here in case I find some more time ;)