erc-dharma / project-documentation

DHARMA Project Documentation
Creative Commons Attribution 4.0 International

Static pages and (relatively) permanent temporary URL / URI for digital editions #263

Open danbalogh opened 5 months ago

danbalogh commented 5 months ago

At least for some of us, the static pages on https://erc-dharma.github.io/ no longer update. I noticed this a while ago and heard yesterday that they do update for Samana, then heard today that they also do not update for Amandine. So this may depend on some other factor that needs looking into. (It may simply be the case that modifications in previously existing files do update, but newly pushed xml files fail to appear. I have not checked this specifically.)

It also seems that some project members do not even know about the existence of the dharman.in website, so perhaps a circular informing everyone about what this is and what is on it may be useful.

More importantly, we need to know where our digital editions are (and will for some time remain) accessible on the internet until the DHARMA base becomes functional. I am sure I am not the only one with publications in press who wants to tell readers where they can look up an edition (or sixty) online. I personally would be much happier if I could continue referring to static pages on github.io for this purpose, because, to my perception at least, that is a more professional and detached publication platform than dharman.in, which is essentially a private website. But if the decision has already been made to deprecate the static pages on github and replace them with pages on dharman.in, I can accept that. I would, however, like to be informed (and I think every project member should be likewise informed) that this is indeed a decision that the project leaders have made after serious consideration, and that we should all discontinue referring to github.io in printed publications, and instead use dharman.in URLs. We would then also need to be told exactly how such a URL can be generated from the file name (or other basic data) of a digital edition, and be assured that such a URL will remain functional for at least some time after the debut of the DHARMA Base. At present, it seems possible to access digital editions simply at https://dharman.in/display/, which is much simpler than the messy per-repository URLs on github.io, but can I rely on these simple URLs to remain workable for some time? And would it not be possible to implement a similarly simple display URL on github itself?

manufrancis commented 5 months ago

Hi Dan, I understand your concern. I will check with Michaël next week (when he is back from leave) what can be done, and then seriously consider the matter with the project leaders. And certainly a circular will be sent in due time.

danbalogh commented 5 months ago

I wonder if the non-updating of the static page is related to the error messages I keep receiving every time I push. Or this may be completely unrelated. It has been going on for a long time, and it has never bothered me too much, but I gather that not everyone is getting the same messages. I have now looked at the logs and see that I have been receiving these for about a year, since 18 January 2023 (or longer if the error messages auto-delete after some time). The message I receive a few minutes after pushing is "Editorial updates: All jobs have failed" or, I think for the first time today, "Editorial updates, Attempt #2: All jobs have failed".

michaelnmmeyer commented 4 months ago

The static website indeed often fails to update itself. This is because Zotero's servers are often unavailable. (I believe they have a rate limit, but this is not documented.) If there is a single error while processing a repository, the whole update is discarded, so larger repositories are more prone to problems. Likewise, if a repository contains an XML file that is not well-formed (culprits are here: https://dharman.in/texts?severity=fatal), the whole update process is aborted.
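A minimal sketch of how a contributor might pre-check a local clone for XML files that are not well-formed before pushing. It is not the project's actual build code: the function names `is_well_formed` and `malformed_files` are hypothetical, and it assumes editions are `.xml` files somewhere under the repository root. It uses only Python's standard `xml.etree.ElementTree` parser and checks well-formedness only, not schema validity or Zotero availability.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def is_well_formed(xml_text: str) -> bool:
    """Return True if xml_text parses as well-formed XML."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


def malformed_files(repo_root: Path) -> list[Path]:
    """List the .xml files under repo_root that fail to parse.

    Assumption: a single such file is enough to abort the whole
    update, so every offending file is reported, not just the first.
    """
    bad = []
    for path in sorted(repo_root.rglob("*.xml")):
        try:
            ET.parse(path)
        except ET.ParseError:
            bad.append(path)
    return bad
```

Calling `malformed_files(Path("."))` from the root of a clone returns an empty list when every XML file parses, which would rule out well-formedness as the cause of a failed update.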

There is no easy way to address this. The most robust solution I can think of is to reproduce the same functionality on the DHARMA server and to query this one instead.

It must also be noted that our three or so newest text repositories (Tamil stuff only) do not have a display on the static website.

danbalogh commented 4 months ago

Thanks for this info. There are no fatal errors in my repository (nor errors; there was one warning that I corrected today), but there may well be some problematic Zotero calls in some of my files. This still leaves the primary question though: Are the static HTML editions on GitHub becoming deprecated? Are they being replaced by those on dharman.in? What semi-permanent URLs can and should we use to refer to our electronic editions?

michaelnmmeyer commented 4 months ago

For now, the only future-proof way to refer to a digital edition is to allocate URLs on purl.org and to redirect them to either erc-dharma.github.io or to dharman.in. Both websites will be deprecated by the end of the project, thus referring directly to them, without the indirection permanent URLs provide, should be avoided in publications.

The definitive address of the website is dharmalekha.info. It does not work yet, and I cannot set it up for now, because our server provided by HumaNum is down and has been so for weeks. (The application is currently running on my personal server.) I plan to use URLs of the form https://dharmalekha.info/texts/DHARMA_INSSomething for referring to texts.
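As an illustration only, the planned URL form could be derived from an edition's file name roughly as follows. This is a sketch under stated assumptions, not the site's implementation: the helper `edition_url` is hypothetical, it assumes a file name is the text identifier plus an `.xml` extension, and it assumes identifiers start with `DHARMA_`. The `/texts/` path is the form planned above and is not live yet.

```python
# Planned form quoted in the thread; the address does not work yet.
BASE_URL = "https://dharmalekha.info/texts/"


def edition_url(filename: str, base_url: str = BASE_URL) -> str:
    """Build a text URL from an edition file name (hypothetical helper)."""
    # Drop any leading directories, keeping only the file name.
    name = filename.rsplit("/", 1)[-1]
    # Assumption: the identifier is the file name minus its .xml extension.
    if name.lower().endswith(".xml"):
        name = name[:-4]
    # Assumption: every text identifier carries the DHARMA_ prefix.
    if not name.startswith("DHARMA_"):
        raise ValueError(f"not a DHARMA text identifier: {name!r}")
    return base_url + name


print(edition_url("DHARMA_INSVengiCalukya00003.xml"))
# prints https://dharmalekha.info/texts/DHARMA_INSVengiCalukya00003
```

Keeping the base address as a parameter means the same identifier can be pointed at another host, which is the kind of indirection a permanent URL on purl.org would provide.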

arlogriffiths commented 4 months ago

I think the decision to abandon the .io site needs to be reconsidered, and we need Michaël to invest some time in making it fully operational again (and then keeping it operational), unless an equally convenient alternative can be furnished. One aspect of the .io site not discussed so far is the fact that it was a general access point to metadata spreadsheets. At the moment, I find that I can no longer access the metadata for tfc-campa-epigraphy through this URL https://erc-dharma.github.io/tfc-campa-epigraphy/index-mdt.html and I don't know how else to find the file.

danbalogh commented 4 months ago

I strongly agree that github.io should not be abandoned. If anyone looks back on my earlier posts, I was not asking for a future-proof method of reference, but for something that can be expected to work until a formal DHARMA website can become operational. What we need is a public face for the in-progress DHARMA editions, and in spite of all its merits, dharman.in does not work well as a public face, since the documentation it contains is obviously behind-the-scenes stuff, meant for project members and not for the audience at large. Conversely, the editions displayed on github.io have a fair amount of public documentation, and even if some of that is half-baked, it can do pretty well for the interim.

And whatever the PIs decide in this connection, I don't think I'm the only one who wants to refer to e-editions in a printed publication. Even if this is not a priority for anyone, the book with the Deutscher Orientalistentag DHARMA panel papers, that Annette and I are editing, is soon to go to press. There are well over a hundred DHARMA e-editions referred to in that book, from various corpora, and we need to be able to tell the readers how to derive a workable URL from a DHARMA filename. I'm assuming (and telling those readers) that the definitive DHARMA site will have a search function where you can enter the filename and get the edition - so I repeat, the question is just what URL to refer to until then.

manufrancis commented 4 months ago

I had a first discussion with Michaël and will get back with options and solutions. But please allow us time to do so.

BUT I do not want to invest too much time in github.io as this will become obsolete. My delivery point for online editions is the end of the project.

@ Daniel, about "Editorial updates: All jobs have failed": this is an example of things set up (editorialisation of quotation marks acc. to language; space around colon acc. to language), at cost of time, that, in the end, will not be necessary, since we will deal with this otherwise.

danbalogh commented 4 months ago

Simply disabling things like printer's quotes and language-based colon spacing in the github.io display sounds like a perfect solution to the problem then: no more failed jobs and no more time wasted on development that will not eventually be needed.

michaelnmmeyer commented 4 months ago

I modified a few things to make the build process more reliable. Errors are still possible, but should happen less often.

danbalogh commented 4 months ago

Sounds good, thank you.

manufrancis commented 4 months ago

Thanks, Michaël! It seems to work fine for my repos too. No need to create display on the static website for the newest Tamil text repositories.

danbalogh commented 4 months ago

And no error message since my push yesterday :)

danbalogh commented 4 months ago

My main question for now is: can we expect that URLs analogous to https://dharmalekha.info/display/DHARMA_INSVengiCalukya00003 will continue to work at least until the final and definitive DHARMA website takes over? If yes, then this is enough for our purposes; and if not, then this is a strong desideratum. I have also opened the new issue #265 with some further suggestions for the dharmalekha website.