cryogen-project / cryogen-core

Cryogen's core
Eclipse Public License 1.0

Add header block for Google's 'Rich Snippets' nonsense #165

Open simon-brooke opened 1 year ago

simon-brooke commented 1 year ago

Not content with News Industry Text Format, or Facebook/Meta's 'Open Graph' syntax for generating pretty banners for URLs, or Twitter/X's similar-but-different mechanism, or any other existing standard, Google have had to reinvent the wheel in a completely different way. Because of course they have.

And while I hate this perverse 'not invented here' proliferation of competing standards all doing more or less the same thing and all adding bloat to web pages, given that Cryogen already does support both the Facebook and Twitter standards (and I'm largely responsible for this), it seems foolish not to add the Google version as well.

Therefore I propose to do so unless other people object; expect a pull request shortly.

yogthos commented 1 year ago

Sounds reasonable, I agree that a bunch of competing standards doing the same thing is annoying, but it is the world we live in unfortunately. :)

simon-brooke commented 1 year ago

OK, this issue turns out really to relate to cryogen/lein-template, and I'll issue you a pull request for that shortly. However, there's a related issue that I've attempted to fix and failed. When cryogen-core is compiling pages, it replaces 'unsafe' characters in e.g. titles with HTML entities – as it should. But, each of the Facebook, Twitter and Google implementations of banner-generation stuff expects not to find HTML entities in the text passed to them, and if they do find entities, they don't correctly handle them. You can see this here.

So in passing post titles and image descriptions and so on to these things, I need to expand entities into UTF-8. I added the following to cryogen.core/compiler:


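(defn expand-entities
  "Expand HTML entities in string `s`, provided that it does not appear to
   have embedded markup. If the argument appears to have embedded markup,
   it will be returned unmodified"
  [^String s]
  ;; Treat any string containing `<` or `>` as markup and leave it alone.
  (if-not (re-find #"[<>]" s)
    ;; crouton wraps the text in html > body; take the body's first child,
    ;; falling back to the input if the parse yields nothing.
    (or (first (:content (last (:content (cr/parse-string s))))) s)
    s))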

(add-filter! :expand-entities expand-entities)

(where cr is crouton), and the function works as you'd expect:

user=> (use 'cryogen-core.compiler :reload)
nil
user=> (expand-entities "&amp;&#39;")
"&'"

However, the filter doesn't work:

user=> (require '[selmer.filters :refer [add-filter!]])
nil
user=> (add-filter! :expand-entities expand-entities)
{:number-format #object[selmer.filters$eval9689$fn__9691 0x43166c29 "selmer.filters$eval9689$fn__9691@43166c29"], :divide #object[selmer.filters$eval9689$fn__9696 0x56f9f71f "selmer.filters$eval9689$fn__9696@56f9f71f"], :email #object[selmer.filters$eval9689$fn__9699 0x355e35e "selmer.filters$eval9689$fn__9699@355e35e"], :upper #object[selmer.filters$eval9689$fn__9706 0x53254be7 "selmer.filters$eval9689$fn__9706@53254be7"], :date #object[selmer.filters$eval9689$fn__9709 0x19035ebd "selmer.filters$eval9689$fn__9709@19035ebd"], :remove #object[selmer.filters$eval9689$fn__9715 0x423a171a "selmer.filters$eval9689$fn__9715@423a171a"], :between? #object[selmer.filters$eval9689$fn__9717 0x74387797 "selmer.filters$eval9689$fn__9717@74387797"], :empty? #object[clojure.core$empty_QMARK_ 0x152d8b12 "clojure.core$empty_QMARK_@152d8b12"], :hash #object[selmer.filters$eval9689$fn__9719 0x3c06d4fd "selmer.filters$eval9689$fn__9719@3c06d4fd"], :count-is #object[selmer.filters$eval9689$fn__9722 0x2cafe5f6 "selmer.filters$eval9689$fn__9722@2cafe5f6"], :abbr-left #object[selmer.filters$eval9689$fn__9724 0x160a98e "selmer.filters$eval9689$fn__9724@160a98e"], :replace #object[selmer.filters$eval9689$fn__9726 0x1b530503 "selmer.filters$eval9689$fn__9726@1b530503"], :phone #object[selmer.filters$eval9689$fn__9729 0x4dcb9db1 "selmer.filters$eval9689$fn__9729@4dcb9db1"], :default #object[selmer.filters$eval9689$fn__9741 0xb06616 "selmer.filters$eval9689$fn__9741@b06616"], :remove-tags #object[selmer.filters$eval9689$fn__9744 0x43b8831e "selmer.filters$eval9689$fn__9744@43b8831e"], :currency-format #object[selmer.filters$eval9689$fn__9747 0x3b31ec6 "selmer.filters$eval9689$fn__9747@3b31ec6"], :abbr-middle #object[selmer.filters$eval9689$fn__9753 0x2b63f25c "selmer.filters$eval9689$fn__9753@2b63f25c"], :abbr-ellipsis #object[selmer.filters$eval9689$fn__9755 0x324fb122 "selmer.filters$eval9689$fn__9755@324fb122"], :name #object[clojure.core$name 0xb66de14 "clojure.core$name@b66de14"], :drop 
#object[selmer.filters$eval9689$fn__9757 0x63265c19 "selmer.filters$eval9689$fn__9757@63265c19"], :urlescape #object[selmer.filters$eval9689$fn__9759 0x785d0c4d "selmer.filters$eval9689$fn__9759@785d0c4d"], :length-is #object[selmer.filters$eval9689$fn__9761 0x7c40cdb8 "selmer.filters$eval9689$fn__9761@7c40cdb8"], :linebreaks-br #object[selmer.filters$eval9689$fn__9763 0x1e0fbef0 "selmer.filters$eval9689$fn__9763@1e0fbef0"], :addslashes #object[selmer.filters$eval9689$fn__9765 0x754aed0a "selmer.filters$eval9689$fn__9765@754aed0a"], :title #object[selmer.filters$eval9689$fn__9770 0x1f58a664 "selmer.filters$eval9689$fn__9770@1f58a664"], :drop-last #object[selmer.filters$eval9689$fn__9772 0x5cc67610 "selmer.filters$eval9689$fn__9772@5cc67610"], :not-empty #object[clojure.core$not_empty 0x61aad2c4 "clojure.core$not_empty@61aad2c4"], :center #object[selmer.filters$eval9689$fn__9774 0x478f30e4 "selmer.filters$eval9689$fn__9774@478f30e4"], :round #object[selmer.filters$eval9689$fn__9776 0x123f2bc0 "selmer.filters$eval9689$fn__9776@123f2bc0"], :abbreviate #object[selmer.filters$eval9689$abbreviate__9778 0x31aa2ba5 "selmer.filters$eval9689$abbreviate__9778@31aa2ba5"], :pluralize #object[selmer.filters$eval9689$fn__9781 0x581caeb8 "selmer.filters$eval9689$fn__9781@581caeb8"], :get-digit #object[selmer.filters$eval9689$fn__9785 0x56ecec52 "selmer.filters$eval9689$fn__9785@56ecec52"], :str #object[clojure.core$str 0x6f3b16df "clojure.core$str@6f3b16df"], :lower #object[selmer.filters$eval9689$fn__9788 0x30e0b540 "selmer.filters$eval9689$fn__9788@30e0b540"], :count #object[selmer.filters$eval9689$fn__9790 0x2dc5ddcd "selmer.filters$eval9689$fn__9790@2dc5ddcd"], :linenumbers #object[selmer.filters$eval9689$fn__9792 0x63a4b7f1 "selmer.filters$eval9689$fn__9792@63a4b7f1"], :length #object[selmer.filters$eval9689$fn__9796 0x7790261 "selmer.filters$eval9689$fn__9796@7790261"], :default-if-empty #object[selmer.filters$eval9689$fn__9798 0x4d654042 
"selmer.filters$eval9689$fn__9798@4d654042"], :sort-by-reversed #object[selmer.filters$eval9689$fn__9800 0xafd80b9 "selmer.filters$eval9689$fn__9800@afd80b9"], :multiply #object[selmer.filters$eval9689$fn__9802 0x25a20476 "selmer.filters$eval9689$fn__9802@25a20476"], :sort-reversed #object[selmer.filters$eval9689$fn__9804 0x41a59f8d "selmer.filters$eval9689$fn__9804@41a59f8d"], :subs #object[selmer.filters$eval9689$fn__9806 0x27b9e031 "selmer.filters$eval9689$fn__9806@27b9e031"], :sort-by #object[selmer.filters$eval9689$fn__9809 0x349445e0 "selmer.filters$eval9689$fn__9809@349445e0"], :first #object[selmer.filters$eval9689$fn__9811 0x20086dd5 "selmer.filters$eval9689$fn__9811@20086dd5"], :json #object[selmer.filters$eval9689$fn__9813 0x68dead11 "selmer.filters$eval9689$fn__9813@68dead11"], :safe #object[selmer.filters$eval9689$fn__9815 0x6aaefb4 "selmer.filters$eval9689$fn__9815@6aaefb4"], :linebreaks #object[selmer.filters$eval9689$fn__9817 0x7ff57cf8 "selmer.filters$eval9689$fn__9817@7ff57cf8"], :add #object[selmer.filters$eval9689$fn__9819 0x31c3802c "selmer.filters$eval9689$fn__9819@31c3802c"], :abbr-right #object[selmer.filters$eval9689$fn__9821 0x7052251e "selmer.filters$eval9689$fn__9821@7052251e"], :last #object[selmer.filters$eval9689$fn__9823 0x151601f4 "selmer.filters$eval9689$fn__9823@151601f4"], :expand-entities #object[cryogen_core.compiler$expand_entities 0x74679b35 "cryogen_core.compiler$expand_entities@74679b35"], :take #object[selmer.filters$eval9689$fn__9825 0x2072796e "selmer.filters$eval9689$fn__9825@2072796e"], :double-format #object[selmer.filters$eval9689$fn__9828 0x1cbe9a00 "selmer.filters$eval9689$fn__9828@1cbe9a00"], :capitalize #object[selmer.filters$eval9689$fn__9833 0x5c8537c2 "selmer.filters$eval9689$fn__9833@5c8537c2"], :sort #object[selmer.filters$eval9689$fn__9835 0x20338034 "selmer.filters$eval9689$fn__9835@20338034"], :rand-nth #object[selmer.filters$eval9689$fn__9837 0x46b3e9c6 "selmer.filters$eval9689$fn__9837@46b3e9c6"], :join 
#object[selmer.filters$eval9689$fn__9840 0x19868230 "selmer.filters$eval9689$fn__9840@19868230"]}
user=> (render "{{title|expand-entities}}" {:title "This &amp; that"})
"This &amp; that"
user=> (expand-entities "This &amp; that")
"This & that"

Note that the map returned by the call to add-filter! includes the entry :expand-entities #object[cryogen_core.compiler$expand_entities 0x74679b35 ...], which implies that the filter is registered.

I'm clearly getting something very simple wrong; can you tell at a glance what it is?
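
A possible explanation, offered as a sketch rather than a confirmed diagnosis: Selmer HTML-escapes the result of every {{...}} expression by default, so the filter may well run and produce "This & that", only for the renderer to turn the & back into &amp; on output. Selmer's documented ways of opting out are the built-in safe filter and returning [:safe value] from a custom filter. In the sketch below, expand-entities-demo is a hypothetical stand-in that handles only &amp; so that the example does not depend on crouton, and the filter names :expand-entities-demo and :expand-entities-safe are likewise invented for illustration.

```clojure
(require '[clojure.string :as str]
         '[selmer.parser :refer [render]]
         '[selmer.filters :refer [add-filter!]])

;; Hypothetical stand-in for expand-entities: handles `&amp;` only.
(defn expand-entities-demo [s]
  (str/replace s "&amp;" "&"))

(add-filter! :expand-entities-demo expand-entities-demo)

;; Variant that marks its own result as safe, so Selmer skips escaping.
(add-filter! :expand-entities-safe
             (fn [s] [:safe (expand-entities-demo s)]))

(def ctx {:title "This &amp; that"})

;; Default behaviour: the filter's output is escaped again on the way out.
(def escaped (render "{{title|expand-entities-demo}}" ctx))

;; Opt out in the template with the built-in `safe` filter...
(def via-safe (render "{{title|expand-entities-demo|safe}}" ctx))

;; ...or in the filter itself, via the [:safe ...] return value.
(def via-vector (render "{{title|expand-entities-safe}}" ctx))
```

If this is indeed the cause, either variant should leave the expanded text untouched. Bear in mind that unescaped output placed inside an attribute value, such as the content of a meta tag, still needs any double quotes in the text dealt with separately.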