projectEndings / staticSearch

A codebase to support a pure JSON search engine requiring no backend for any XHTML5 document collection
https://endings.uvic.ca/staticSearch/docs/index.html
Mozilla Public License 2.0

Allowing markup in filter labels #248

Status: Closed (martindholmes closed this issue 2 months ago)

martindholmes commented 1 year ago

At the Balisage presentation today, someone (sorry, can't remember who raised it initially) pointed out that we're using attributes for filter labels (meta/@name), and this is potentially problematic for some language contexts. This is something that's bothered me for a bit, and I think the solution would be to allow an optional @data-label-id attribute on the meta tag which would point to the id of an HTML element containing a richer filter label for display. The question is where that element should be; we don't want to stick it in every document in the collection that uses the meta tag, so perhaps it should go in the config file?
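As a rough illustration of the proposal above (not something staticSearch currently implements), the Python sketch below resolves a `data-label-id` pointer on a filter `meta` tag to an element that holds a richer label. The `lblDocType` id, the separate labels fragment, and the helper names are hypothetical, and the `staticSearch_desc` class value is an assumption about how description filters are marked. Where the label element should actually live is exactly the open question.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

XHTML = "http://www.w3.org/1999/xhtml"
NS = {"h": XHTML}

# A document in the collection: the meta tag carries only a pointer.
PAGE = f"""<html xmlns="{XHTML}"><head><title>Poem</title>
<meta name="Document type" class="staticSearch_desc" content="Poem"
      data-label-id="lblDocType"/>
<meta name="Genre" class="staticSearch_desc" content="Lyric"/>
</head><body/></html>"""

# Hypothetical single source of rich labels, kept outside the collection.
LABELS = f"""<div xmlns="{XHTML}">
<span id="lblDocType">Document <em>type</em></span>
</div>"""


def inner_markup(el: ET.Element) -> str:
    """Serialize an element's content, dropping namespaces and attributes."""
    parts = [escape(el.text or "")]
    for child in el:
        tag = child.tag.split("}")[-1]
        parts.append(f"<{tag}>{inner_markup(child)}</{tag}>")
        parts.append(escape(child.tail or ""))
    return "".join(parts)


def filter_labels(page_xml: str, labels_xml: str) -> dict[str, str]:
    """Map each filter name to the label that should be displayed for it."""
    by_id = {el.get("id"): el
             for el in ET.fromstring(labels_xml).iter() if el.get("id")}
    result = {}
    for meta in ET.fromstring(page_xml).iterfind(".//h:meta[@name]", NS):
        name = meta.get("name")
        ref = meta.get("data-label-id")
        # Without a pointer, fall back to the attribute value as the label.
        result[name] = name if ref is None else inner_markup(by_id[ref])
    return result


labels = filter_labels(PAGE, LABELS)
```

Here `labels` maps "Document type" to the marked-up label and "Genre" to its plain attribute value, so documents that never use the new attribute would behave as they do now.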

joeytakeda commented 1 year ago

I wonder if we could re-use the captions file recently created in #186? That is, you could put your own captions in the map, in different languages if you want. If there's a @data-label-id (or, if we wanted to make it "namespaced", @data-ssLabel-id or just @data-ssLabel), then we look up the id in the captions file (and fail if we can't find it?).
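A minimal sketch of that lookup, in Python for illustration only: the nested dictionary stands in for the captions map (whose real format is not shown in this thread), and the ids, languages, and function names are all hypothetical. It takes the "fail if we can't find it" option rather than falling back silently.

```python
class MissingCaptionError(LookupError):
    """Raised when a label pointer has no matching caption."""


# Hypothetical stand-in for the captions map: language -> caption id -> text.
CAPTIONS = {
    "en": {"lblDocType": "Document type"},
    "fr": {"lblDocType": "Type de document"},
}


def resolve_label(meta_name, label_id, lang, captions):
    """Return the display label for a filter.

    With no pointer, the meta/@name value is the label, as at present.
    With a pointer, the caption must exist for the requested language.
    """
    if label_id is None:
        return meta_name
    try:
        return captions[lang][label_id]
    except KeyError:
        raise MissingCaptionError(
            f"No caption '{label_id}' for language '{lang}'"
        ) from None


french = resolve_label("Document type", "lblDocType", "fr", CAPTIONS)
plain = resolve_label("Genre", None, "en", CAPTIONS)
```

Failing loudly at build time would surface a mistyped id immediately; the alternative of falling back to meta/@name would keep the build going but could hide the error.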

We also use @label on <context>: I wonder if we should make a child <label> element?

martindholmes commented 1 year ago

Whatever method we choose, I think it should be consistent between the two usages (HTML meta tag for filters and context element for search-only-in), so my instinct is against a child <label> element in <context>. But I do think it should go in the config file, which is the only file most first-time users will actually edit (we hope, once all the new languages are available). We could have a <labels> section with <label> elements with @xml:id attributes, so you could point to it from <context> and from <meta> tags in the HTML head.
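To make the `<labels>` idea concrete, here is a hedged Python sketch of how pointers from `<context>` might be resolved against labels in the config file. The `labelRef` attribute, the `lblPoems` id, and the un-namespaced config fragment are assumptions made for illustration; only `<context>` with `@label` comes from the discussion above. The label content uses ruby markup as one example of what an attribute value cannot carry.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# ElementTree expands the predefined xml: prefix to this qualified name.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Simplified, hypothetical config fragment (a real config is namespaced).
CONFIG = """<config>
  <labels>
    <label xml:id="lblPoems">Poems (<ruby>詩<rt>し</rt></ruby>)</label>
  </labels>
  <contexts>
    <context match="div[@class='poem']" labelRef="lblPoems"/>
    <context match="div[@class='note']" label="Notes"/>
  </contexts>
</config>"""


def inner_markup(el):
    """Serialize an element's content, dropping namespaces and attributes."""
    parts = [escape(el.text or "")]
    for child in el:
        tag = child.tag.split("}")[-1]
        parts.append(f"<{tag}>{inner_markup(child)}</{tag}>")
        parts.append(escape(child.tail or ""))
    return "".join(parts)


def context_labels(config_xml):
    """Map each context's match pattern to its display label."""
    root = ET.fromstring(config_xml)
    by_id = {lab.get(XML_ID): lab for lab in root.iter("label")}
    result = {}
    for ctx in root.iter("context"):
        ref = ctx.get("labelRef")
        if ref is None:
            # Plain-text label in the attribute; escape it for display.
            result[ctx.get("match")] = escape(ctx.get("label", ""))
        elif ref in by_id:
            result[ctx.get("match")] = inner_markup(by_id[ref])
        else:
            raise ValueError(f"context points to unknown label '{ref}'")
    return result


resolved = context_labels(CONFIG)
```

The same id-to-label table could serve pointers from `<meta>` tags in the HTML head, which is what would keep the two usages consistent.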

But thinking a bit harder about this, it seems that captions might also require embedded markup, so perhaps we should rethink along those lines. If we checked at build time for the existence of a user-captions.xsl file in a specific location, we could include it if it exists, and pointers could point into it.
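The build-time check could look something like the following Python sketch. The real build would presumably do this in its own tooling, so this only shows the logic; the directory layout and function name are hypothetical, and `user-captions.xsl` is the file name floated above.

```python
from pathlib import Path

USER_CAPTIONS = "user-captions.xsl"


def caption_sources(project_dir: Path, default_captions: Path) -> list[Path]:
    """Return the caption files to include, defaults first.

    The user file is optional: it is added only if it exists in the
    agreed location, so projects without one build exactly as before.
    """
    sources = [default_captions]
    user_file = project_dir / USER_CAPTIONS
    if user_file.is_file():
        sources.append(user_file)
    return sources
```

Listing the user file after the defaults would let later entries take precedence if the include mechanism works that way, though that ordering is also an assumption.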

Before we do anything at all, though, I would like to revisit the basic idea that there are languages in which captions must have markup. The person who originally raised this mentioned Japanese, and I presume the intention there was the potential need to supply furigana -- I'm not aware of any other context in which modern Japanese text cannot be represented in straight Unicode. But we're not supporting Japanese anyway, because we have no usable tokenizer for the language, and I don't see any sign of one on the horizon, so this might all be a bit of a red herring.