Captions should be in one easy-to-configure place

joeytakeda commented 2 years ago

Stemming from #185, we should make the captions for the XSLT and the JS easier to adjust. Right now, people can adjust the captions in the JS, but then they also have to adjust some in the XSLT (which is less easy to do), and it'd be much better if they were all listed in a single place.

So I think we should:

Harmonize all of the captions into a flat list of caption key/value pairs. I'm not sure if this should be an XML file or a JSON file, though I'm inclined to say JSON since that's easily readable by the XSLT and by the JS (though maybe slightly harder for some to edit?). Alternatively, we could make this an XML file with the caption ids as elements.
Each language would get its own caption file in a folder called captions (so, right now, we'd have captions/(en|fr|de).???).
We add a new parameter to the config file called <captionsFile> that points to the captions for the project; this would allow people to create their own captions, housed wherever they like on their system, for their language set if they so wish.
That caption file is converted to JSON (or simply copied, if it's already JSON) over to the output staticSearch directory, and we add captions.json to the set of necessary documents to fetch for staticSearch.

Thoughts?

martindholmes commented 2 years ago

I think it would make more sense to keep everything in a single file, so people who might be translating can see other languages easily, but also to encourage them to do as in #185 and contribute to the project, rather than keeping their captions in their own project.

We currently detect the language from the HTML in the search page, and I think that's generally a good practice rather than adding yet another configuration item.

joeytakeda commented 2 years ago

That would mean that any single language is confined to whatever captions are set in the main file; I suppose those would still be overridable via the JS, but not as easily in the search page creation.

martindholmes commented 2 years ago

Yes, if you wanted to use your own captions without contributing them back to the project, you would have to edit the main file manually. But that's a form of encouragement to contribute, no?

If you really wanted to avoid contributing, you could easily switch files around in your build process, of course.

joeytakeda commented 2 years ago

I think we're talking slightly at cross purposes here: my assumption here is that we want a set of boilerplate captions for each language (and want to encourage those contributions), but those captions should still be configurable since individual projects may need/want to change a caption.

You can currently change the captions for the search results in the Javascript—e.g. it's simple enough to change ss.captions.en.strTooManyResults from "Your search returned too many results. Include more filters or more search terms" to something like "Too many results found; try filtering by date or document type"—but it's very difficult to change any of the captions on the search page generated by the XSLT.
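To make the JS side of this concrete, here is a minimal sketch of that kind of override. The `ss` object below is only a stand-in built for illustration, and `overrideCaptions` is a hypothetical helper, not part of staticSearch; only the `ss.captions.en.strTooManyResults` path and the two caption strings come from this thread.

```javascript
// Stand-in for the captions object; the real one is created by staticSearch.
const ss = {
  captions: {
    en: {
      strTooManyResults:
        "Your search returned too many results. Include more filters or more search terms",
    },
  },
};

// Hypothetical helper: apply project-specific overrides to one language's set.
function overrideCaptions(captions, lang, overrides) {
  if (!captions[lang]) {
    throw new Error(`No caption set for language "${lang}"`);
  }
  // Overwrite only the keys that the project supplies; leave the rest alone.
  Object.assign(captions[lang], overrides);
  return captions[lang];
}

overrideCaptions(ss.captions, "en", {
  strTooManyResults: "Too many results found; try filtering by date or document type",
});
```

Nothing comparable exists for the captions baked into the search page by the XSLT, which is the gap this issue is about.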

So my proposal here was a way to harmonize the search page and search results caption sets, provide a way for us to have captions for every language, and allow people to adjust/customize their caption set for the project. Point taken about adding another configuration option, but I think captions make a lot of sense to be configurable in the simplest way possible (since people deciding what the captions ought to be aren't necessarily the same people writing the code).

martindholmes commented 2 years ago

I agree completely that we're looking for the simplest way to enable people to change captions, and that all captions should be centralized; I just think they should be in a single file, for several reasons: 1) it's much easier to confirm at a glance that all languages have all the captions; 2) it's easier to add a translation when you can look at all the other languages at the same time; 3) it's easier for the processing code to substitute an English caption if one is missing from one of the other languages; 4) the easiest way to add your own captions will then be to fork our repo, from which it's a simple additional step to send us a pull request so we benefit from additional translations. I think if someone wants to substitute a caption in a language we already cover because they don't like our caption, that's worth raising an issue about, because we may agree and then everyone will benefit.
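Points 1 and 3 can be sketched in a few lines, assuming the captions end up as one object keyed by language code. The function names and the sample data are hypothetical; the caption ids and strings are the ones quoted elsewhere in this thread, with the French set left incomplete on purpose.

```javascript
// Sample data: "fr" deliberately lacks strTooManyResults.
const captionSets = {
  en: {
    ssScriptRequired: "This page requires JavaScript.",
    strTooManyResults:
      "Your search returned too many results. Include more filters or more search terms",
  },
  fr: {
    ssScriptRequired: "Cette page a besoin de Javascript.",
  },
};

// Report, per language, the caption ids present in the reference set but absent here.
function findMissingCaptions(sets, referenceLang = "en") {
  const referenceKeys = Object.keys(sets[referenceLang]);
  const missing = {};
  for (const [lang, set] of Object.entries(sets)) {
    const absent = referenceKeys.filter((key) => !(key in set));
    if (absent.length > 0) {
      missing[lang] = absent;
    }
  }
  return missing;
}

// Look up a caption, substituting the fallback language when it is missing.
function getCaption(sets, lang, key, fallbackLang = "en") {
  const set = sets[lang] || {};
  return key in set ? set[key] : sets[fallbackLang][key];
}
```

With this data, `findMissingCaptions(captionSets)` reports `strTooManyResults` as missing for `fr`, and `getCaption(captionSets, "fr", "strTooManyResults")` returns the English text.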

joeytakeda commented 2 years ago

Good point re: forking/PR as the best way to contribute captions—I'm convinced that anyone wanting to make custom captions should just fork the repo anyway, so I can see why a config option isn't necessary.

I'm still not convinced, though, that they should be one big captions file rather than individual ones split by language. 1 & 2 seem to me a matter of personal preference when it comes to editing (my personal preference would be smaller files that I could compare side by side in separate windows rather than a long file that I have to scroll through).

While it would be easier to tell that the languages have all the captions, it would also be more difficult to tell which languages currently have captions (versus seeing the language codes listed out in a directory) and it would make contributing a bit more difficult, I think. With a master captions file, in order to contribute a new language, people would have to find a coherent block for another language, copy it, paste it in the right spot, and then work on the language they want to add; with small files, people just copy a file, rename it, and then modify the captions without having to mess with other things.

And in terms of processing, we could glom them together from the outset (i.e. map:merge(collection($captionDir || '?select=*.json') ! parse-json(.)) or something like that), so I don't think individual files are problematic in terms of processing either.
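The same idea can be sketched on the JS side. This is a rough analogue for illustration only, not the XSLT build step: it assumes each file is named after its language code (captions/en.json and so on) and holds a flat object of caption ids. `mergeCaptionFiles` and the `{ name, text }` input shape are hypothetical.

```javascript
// Merge per-language caption files into one object keyed by language code.
// Each entry is { name: "captions/en.json", text: "<file contents>" }.
function mergeCaptionFiles(files) {
  const merged = {};
  for (const { name, text } of files) {
    // Derive the language code from the file name: "captions/fr.json" -> "fr".
    const lang = name.split("/").pop().replace(/\.json$/, "");
    merged[lang] = JSON.parse(text);
  }
  return merged;
}

const mergedCaptions = mergeCaptionFiles([
  { name: "captions/en.json", text: '{"ssScriptRequired": "This page requires JavaScript."}' },
  { name: "captions/fr.json", text: '{"ssScriptRequired": "Cette page a besoin de Javascript."}' },
]);
```

Listing the keys of `mergedCaptions` also answers the question of which languages currently have captions.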

However, this all assumes that we're wedded to having the captions organized by language. Perhaps it makes more sense (and here is where I think a single file would certainly work better) to organize them by caption with the languages as key/values:


"ssScriptRequired": { 
 "en": "This page requires JavaScript.",
 "fr": "Cette page a besoin de Javascript.",
 "de": "Diese Seite benötigt JavaScript."
}

At first glance, this structure (akin to the TEI's) seems the right way forward to me, but you obviously know much more about that than I do, so you'll know whether that actually works in practice or if it's more difficult in the long run.

martindholmes commented 2 years ago

Good points all round. I think you've convinced me that separate files by language are a good idea; we can easily diagnose situations where a given lang is missing a caption. With files organized by language, we could still construct the caption-based map if required.
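Constructing the caption-based map from language-based sets is a small pivot. A minimal sketch, assuming the language sets are already loaded as plain objects; `pivotByCaption` is a hypothetical name, and the sample strings are the ones from the ssScriptRequired example above.

```javascript
// Turn { lang: { captionId: text } } into { captionId: { lang: text } }.
function pivotByCaption(byLanguage) {
  const byCaption = {};
  for (const [lang, set] of Object.entries(byLanguage)) {
    for (const [key, text] of Object.entries(set)) {
      if (!byCaption[key]) {
        byCaption[key] = {};
      }
      byCaption[key][lang] = text;
    }
  }
  return byCaption;
}

const pivoted = pivotByCaption({
  en: { ssScriptRequired: "This page requires JavaScript." },
  fr: { ssScriptRequired: "Cette page a besoin de Javascript." },
  de: { ssScriptRequired: "Diese Seite benötigt JavaScript." },
});
```

Here `pivoted.ssScriptRequired` has the same en/fr/de shape as the caption-keyed example earlier in the thread.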

Right now, we push all the language captions out into the JS, and the JS decides at load time which set to use based on the language attribute of the page. It's arguable that since we're processing the page anyway, in makeSearchPage.xsl, we could a) detect the language at that point and include only the required captions, and b) warn about a missing lang attribute or one for which there's no caption set.
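The selection logic in a) and b) might look roughly like this. It is written in JS purely to illustrate the decision; the proposal is to do it in makeSearchPage.xsl. Falling back to English and reducing a tag like "fr-CA" to "fr" are assumptions of this sketch, and `selectCaptionSet` is a hypothetical name.

```javascript
// Pick the caption set for a page's lang value and collect any warnings.
function selectCaptionSet(pageLang, captionSets, fallbackLang = "en") {
  const warnings = [];
  // Reduce e.g. "fr-CA" to its primary subtag "fr" (assumption of this sketch).
  const primary = (pageLang || "").trim().toLowerCase().split("-")[0];
  let lang = primary;
  if (primary === "") {
    warnings.push("Search page has no lang attribute");
    lang = fallbackLang;
  } else if (!Object.prototype.hasOwnProperty.call(captionSets, primary)) {
    warnings.push(`No caption set for language "${primary}"`);
    lang = fallbackLang;
  }
  return { lang, captions: captionSets[lang], warnings };
}

const availableSets = {
  en: { ssScriptRequired: "This page requires JavaScript." },
  fr: { ssScriptRequired: "Cette page a besoin de Javascript." },
};
```

Only the returned `captions` would then need to be shipped with the page, rather than every language's set.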

projectEndings/staticSearch, issue #186