BetaMasaheft / Documentation

The Written Culture of Christian Ethiopia: A Multimedia Research Environment

Clavis as a scrollable list #2697

Open eu-genia opened 5 days ago

eu-genia commented 5 days ago

Currently we have https://betamasaheft.eu/works/list, which provides a filterable list of all works and Clavis IDs, but users would find it helpful to also have a full scrollable list (discussion with Rafal Zarzeczny). A static HTML list should be generated regularly.

eu-genia commented 5 days ago

The link to Clavis on the main page now leads to https://betamasaheft.eu/clavis-list.html

I hope users find it helpful. @abausi @CarstenHoffmannMarburg @thea-m @Ralph-Lee-UK @DenisNosnitsin1970 @AaronButts @nafisa-valieva @karljonaskarlsson

karljonaskarlsson commented 4 days ago

Thank you very much for this, Zhenia!

eu-genia commented 4 days ago

Since @karljonaskarlsson has volunteered to correct at least some of the titles to get a cleaner list, I have prepared a spreadsheet: https://docs.google.com/spreadsheets/d/10JRleQXUl1t2Q_uvM5QAxOmOk2G5nLE7aAjo-1iS0AQ/edit?usp=sharing

It might be easier to enter the corrected versions there; I will then try to batch-replace the titles from the spreadsheet by XML refactoring.

This might save everyone time (we can also ask someone for help with the corrections).
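
A batch replace of this kind could look roughly like the sketch below. It is only an illustration in Python, under several assumptions: the spreadsheet is exported as CSV with the hypothetical columns `id` and `corrected_title`, each record is a TEI file, and the title to correct is the first `title` inside `titleStmt`. The sample record and its ID are made up.

```python
import csv
import io
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
# Keep TEI as the default (unprefixed) namespace when serializing.
ET.register_namespace("", TEI_NS)


def load_corrections(csv_text):
    """Map work ID -> corrected title; rows without a correction are skipped.

    The column names `id` and `corrected_title` are assumptions.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return {
        row["id"]: row["corrected_title"].strip()
        for row in reader
        if row["corrected_title"].strip()
    }


def replace_main_title(xml_text, new_title):
    """Return the record with the text of its first titleStmt/title replaced."""
    root = ET.fromstring(xml_text)
    title = root.find(f".//{{{TEI_NS}}}titleStmt/{{{TEI_NS}}}title")
    if title is None:
        raise ValueError("no titleStmt/title found")
    title.text = new_title
    return ET.tostring(root, encoding="unicode")


# Made-up record and CSV export, for illustration only.
sample = (
    f'<TEI xmlns="{TEI_NS}" xml:id="LIT0000Example"><teiHeader><fileDesc>'
    "<titleStmt><title>Old titel</title></titleStmt>"
    "</fileDesc></teiHeader></TEI>"
)
corrections = load_corrections(
    "id,corrected_title\nLIT0000Example,Corrected title\nLIT0001Example,\n"
)
updated = replace_main_title(sample, corrections["LIT0000Example"])
```

Note that `xml.etree.ElementTree` drops comments and processing instructions when parsing, so a real refactoring run would probably need a tool that preserves them.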

eu-genia commented 4 days ago

Another idea: since the server is sometimes offline, would it be helpful to keep a copy of some of the static lists like this one (I could create more for other things) on the uni-hamburg page? We could possibly also try to set up a redirect for moments when the server is inaccessible; I could try to think about how to achieve this.

karljonaskarlsson commented 4 days ago

(Oh sorry, I updated the titles in a normal pull request – I didn't see that you created this spreadsheet. If I update more titles later, I will use this method, which is actually easier.)

DenisNosnitsin1970 commented 4 days ago

Thank you very much! A static list would also be helpful.

eu-genia commented 4 days ago

@nafisa Please explain exactly what you mean, what you expect and where, as a separate issue. This one is about a static list of IDs. Thank you.

If you are speaking about @type for title, please consider that we already have:

| type | desc |
| --- | --- |
| supplied | syriaca.org: an existing print catalogue provides a descriptive title |
| uniform | syriaca.org: a title for such work established by the editor or cataloguer |

eu-genia commented 4 days ago

The list is now also available here, where it remains accessible when betamasaheft.eu is offline: https://www.betamasaheft.uni-hamburg.de/texts/clavisaethiopica.html

smaugustine commented 3 days ago

Just to float the idea, this could also be achieved through the GitHub API, which might save some compute resources.
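
As a rough illustration of the API route, the sketch below builds the request for GitHub's Git Trees endpoint and filters the response down to XML files. It is written in Python with the standard library only; the function names are invented for this example, and the sample path in the usage note is made up. The repository `BetaMasaheft/Works` and the `master` branch are the ones mentioned in this thread.

```python
import json
import urllib.request

API_ROOT = "https://api.github.com"


def tree_url(owner, repo, ref):
    """URL of the Git Trees endpoint listing every file under `ref`."""
    return f"{API_ROOT}/repos/{owner}/{repo}/git/trees/{ref}?recursive=1"


def xml_paths(tree_response):
    """Pick the paths of XML files out of a decoded Git Trees response."""
    return sorted(
        entry["path"]
        for entry in tree_response.get("tree", [])
        if entry.get("type") == "blob" and entry["path"].endswith(".xml")
    )


def raw_url(owner, repo, ref, path):
    """URL serving the raw content of one file at `ref`."""
    return f"https://raw.githubusercontent.com/{owner}/{repo}/{ref}/{path}"


def fetch_json(url):
    """GET `url` and decode the JSON body (needs network access)."""
    request = urllib.request.Request(
        url, headers={"Accept": "application/vnd.github+json"}
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

A call such as `xml_paths(fetch_json(tree_url("BetaMasaheft", "Works", "master")))` would then list the record files. Unauthenticated requests are rate-limited, and very large trees can come back with `"truncated": true`, so this is better suited to occasional rebuilds than to per-visitor requests.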

It's also possible to use static site generators (with some XML parsing). I developed a proof of concept that uses GitHub Pages with submodules, a Jekyll site built with GitHub Actions, and some custom Ruby code: https://cae.ethiopicist.com/ (https://github.com/ethiopicist/Clavis). This could also be run entirely without internet if someone maintains a local clone of the repository (e.g. for fieldwork in Ethiopia).

eu-genia commented 3 days ago

@smaugustine wow, thank you, can you teach me/us more about it?

eu-genia commented 16 hours ago

@smaugustine maybe a stupid question: is there a reason why, for the source data (XML), you point to the version at https://github.com/BetaMasaheft/Works/tree/62821d3d9cabbcec1c49b220bbd64d7f1194d899 ? Is it possible to point to https://github.com/BetaMasaheft/Works/tree/master instead?

smaugustine commented 13 hours ago

Submodules by design point to specific commits and do not allow general linking to branches. It's inconvenient, but in the case of code libraries, dependencies, etc. it prevents updates to the submodule from breaking whatever depends on it. So updates need to be pulled and synced as in a normal git clone, but if the data does not need to be the most up-to-date, one could schedule the pulls daily or weekly.
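
A scheduled pull of this kind could be scripted roughly as follows. This is a sketch in Python, not the proof of concept's actual setup; the submodule path `data/Works` and the commit message are hypothetical. `git submodule update --remote` moves the submodule to the tip of its tracked branch (the one configured in `.gitmodules`, or the remote's default otherwise), after which the new pinned commit still has to be committed in the parent repository.

```python
import subprocess


def submodule_refresh_commands(path):
    """Git commands that fetch the submodule's latest commit and stage it."""
    return [
        ["git", "submodule", "update", "--init", "--remote", "--", path],
        ["git", "add", "--", path],
    ]


def refresh_submodule(path, repo_dir=".", message="Update data submodule"):
    """Run the refresh and commit only if the pinned commit changed.

    Returns True when a new commit was created in the parent repository.
    """
    for command in submodule_refresh_commands(path):
        subprocess.run(command, cwd=repo_dir, check=True)
    # `git diff --cached --quiet` exits with 1 when something is staged.
    staged = subprocess.run(
        ["git", "diff", "--cached", "--quiet"], cwd=repo_dir
    ).returncode != 0
    if staged:
        subprocess.run(["git", "commit", "-m", message], cwd=repo_dir, check=True)
    return staged
```

Calling `refresh_submodule("data/Works")` from a daily or weekly job, followed by a push and a site rebuild, would keep the pinned data reasonably current.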

The underlying idea here is basically to take advantage of GitHub's free static hosting and long-standing support for static site generators, particularly Jekyll (which is built in Ruby but easy enough to learn on the go). Jekyll ingests Markdown/YAML/JSON, applies templates, and outputs static HTML files, which are ideal and the most efficient option for data that does not need frequent updates. We could access the XML data through GitHub's API, but for this setup it's more effective to use a submodule and avoid API usage limits. Then, using the Nokogiri gem, we can apply XPath queries to get the data we need and make it available as a Ruby object for Jekyll to apply templates to. The site would need scheduled rebuilds, but this could be combined with updates to the XML data submodule.
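
To make the extract-then-template step concrete, here is a minimal sketch that uses Python's standard library in place of Nokogiri and Jekyll (a deliberate swap, not the proof of concept's actual code). It assumes each record is a TEI file whose root element carries the work ID in `xml:id` and whose first `titleStmt/title` holds the title; the sample record and its ID are made up.

```python
import html
import xml.etree.ElementTree as ET

TEI = {"tei": "http://www.tei-c.org/ns/1.0"}
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Made-up record, for illustration only.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0" xml:id="LIT0000Example">
  <teiHeader><fileDesc><titleStmt>
    <title>Acts &amp;
      Miracles</title>
  </titleStmt></fileDesc></teiHeader>
</TEI>"""


def extract_entry(xml_text):
    """Pull the work ID and main title out of one TEI record."""
    root = ET.fromstring(xml_text)
    title = root.find(".//tei:titleStmt/tei:title", TEI)
    text = "".join(title.itertext()) if title is not None else ""
    return {
        "id": root.get(XML_ID, ""),
        "title": " ".join(text.split()),  # collapse line breaks and indentation
    }


def render_list(entries):
    """Render the entries as a static HTML list, sorted by ID."""
    items = "\n".join(
        f'  <li id="{html.escape(e["id"])}">'
        f'{html.escape(e["id"])}: {html.escape(e["title"])}</li>'
        for e in sorted(entries, key=lambda e: e["id"])
    )
    return f"<ul>\n{items}\n</ul>"


entry = extract_entry(SAMPLE)
page = render_list([entry])
```

In the Jekyll setup the equivalent of `extract_entry` would run once per build and the templating would be handled by Jekyll itself; the point here is only that the output is plain HTML with no server-side work per request.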

Aside from having static pages accessible online, there is also the possibility for offline use. With a local install of Ruby (and Bundler + Jekyll), anyone could clone the code repo with the data submodule and then use it locally without the need for an internet connection.