BetaMasaheft / Dillmann

Dillmann Lexicon

Cross-references #430

Open eu-genia opened 2 years ago

eu-genia commented 2 years ago

While introducing line breaks for compounds to avoid errors, I realized that the cross-references marked as sense=C are also a problem, as there may be multiple occurrences of the same ID. Senses (<<..>>) are all supposed to have unique IDs, so one may not assign C more than once per entry. We have to find a fix for cross-references too. I guess originally cross-references were only meant to appear once, at the end of the block in Dillmann, but with TraCES and especially with Compounds the cross-reference blocks multiply. I have attempted to introduce xr (https://betamasaheft.eu/Dillmann/lemma/L153553e0e5e0448f9ba1221d2cf70045) but cannot yet get markup within the element to work (the markup disappears on conversion from web form to XML). I will try to find a solution, but for the moment we need a list of all records that would need fixing, i.e. those that have several cross-reference sections, if there are any in addition to the ones already listed in the Compounds issue (https://github.com/BetaMasaheft/Dillmann/issues/420). @MagdaKrzyz do you have an overview?

All in all, improvisations should be kept to a minimum.

The web-based workflow with conversions to and from xml poses a lot of challenges.

MagdaKrzyz commented 2 years ago

I don't have a separate list of entries with multiple cross-references. I can prepare a list that would include many of them, but it is unlikely that I will find all of them unless I go through all the entries. If a complete list is needed now, I can do it.

eu-genia commented 2 years ago

The problem is that invalid XML may contribute to system instability, as we were told... This only concerns multiple blocks of cross-references within one meaning (e.g. TraCES); if there is one in Dillmann and one in TraCES, it is OK. I will try to look for them too at some point, after I have found a good solution.

(It would all be much easier if the lexicon had an xml-based workflow, without the conversion... maybe at some point we need to change that)
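A list of affected records could perhaps be produced automatically rather than by reading every entry. The following is only a minimal sketch in Python, under assumptions that are not confirmed in this thread: that senses in the web-form markup are delimited as `<LABEL< ... >LABEL>` (inferred from the `<C< ...>C>` form), that numbered senses look like `<1< ... >1>`, and that the entry text contains no other angle brackets. The function name `repeated_labels` and the sample strings are hypothetical.

```python
import re
from collections import Counter

# Assumed token syntax: "<LABEL<" opens a sense, ">LABEL>" closes it.
TOKEN = re.compile(r"<(?P<open>\w+)<|>(?P<close>\w+)>")


def repeated_labels(markup):
    """Return (parent path, label) pairs where a label is reused under one parent."""
    stack = []            # labels of the senses that are currently open
    seen = [Counter()]    # per nesting level: how often each child label occurred
    problems = []
    for match in TOKEN.finditer(markup):
        label = match.group("open")
        if label:
            seen[-1][label] += 1
            if seen[-1][label] == 2:  # report each repeated label once
                problems.append(("/".join(stack) or "(top level)", label))
            stack.append(label)
            seen.append(Counter())
        elif stack and stack[-1] == match.group("close"):
            stack.pop()
            seen.pop()
        # An unmatched closing token is ignored in this sketch.
    return problems


# Hypothetical entries: C twice inside sense 1, versus C once in each sense.
BAD = "<1< meaning one <C< see X >C> <C< see Y >C> >1> <2< meaning two <C< see Z >C> >2>"
GOOD = "<1< meaning one <C< see X >C> >1> <2< meaning two <C< see Z >C> >2>"

print(repeated_labels(BAD))
print(repeated_labels(GOOD))
```

Run over an export of all entries, a check of this kind would flag only the records where C is repeated within the same meaning, and leave alone entries that have one block in Dillmann and one in TraCES, provided those sit under different parents in the real markup.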

MagdaKrzyz commented 2 years ago

There are three instantiations of cross-references:

1) There is the label CROSS REFERENCES and underneath it a few individual CRs separated by bullet points, e.g. https://betamasaheft.eu/Dillmann/lemma/Le210848af4284f798b8d8bd16d5bd84c

2) There are a few cross-references assigned to different senses and subsenses within one entry, usually one CR per sense/subsense, e.g. https://betamasaheft.eu/Dillmann/lemma/L968149f7fc1748479a9f8482bc3f49c0

3) A combination of both. Here is an extreme case with many different CRs: https://betamasaheft.eu/Dillmann/lemma/Ldc8c69eda00b4756a7a533d684abcdd3

I understand that cases 1 and 3 make xml invalid. How about 2?

eu-genia commented 2 years ago

Not quite. The problem arises when you use <C< ...>C> more than once nested within the same upper-level sense.

1) In this example, <C< ...>C> is only used once, so there is no problem. (Or rather, there was an error in the XML that was not visible on the surface, probably caused by some deletion and insertion that left traces; it is now cleaned.)

2) Here I do not see <C< ...>C> at all.

3) Again, this is all valid, as all cases of <C< ...>C> belong to differently numbered senses.

Probably, after all, the problem is only with Compounds, as there you have the unnumbered list, which is what causes the problem. Actually the easiest solution would be to have the Compounds as a numbered list (not dashes or line breaks); then we have no issues with Compounds and no issues with cross-references.

We should simply take a-b-c-d etc for Compounds.

I actually do not like bullets. As I have already written, the markup should be meaningful; the display is another matter. Bullets are not invalid, just not good practice.

MagdaKrzyz commented 2 years ago

Bullet points divide different cross-references as in entry 1. But you say it is ok, so why shouldn't they stay? What could be an alternative? Ok, I will take a-b-c for Compounds. Line breaks for Grebaut are ok? Just to be on the safe side...

eu-genia commented 2 years ago

Line breaks for Grebaut can stay.

But you should try to understand that mark up should be meaningful. So instead of inserting bullet points the list should be structured with list items, and then on the visualization level one can visualize items with bullet points, with dashes, with whatever you want.

At the moment there is just display, no content behind, which is very bad practice.

If we ever update the workflow we will absolutely have to remove this. Digital is not the same as printed. You can print it, but primarily it is meant to structure and filter and search metadata. I had tried to explain all this in my XML introduction...
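The distinction between structure and display can be illustrated with a small sketch. It is written in Python purely for illustration and rests on assumptions: the element names `list` and `item`, the helper names `to_list_element` and `render`, and the sample text are all hypothetical and do not describe the lexicon's actual schema or application code.

```python
import xml.etree.ElementTree as ET


def to_list_element(text, separator="•"):
    """Turn a bullet-separated string into a list element with one item per part."""
    lst = ET.Element("list")
    for part in text.split(separator):
        part = part.strip()
        if part:
            ET.SubElement(lst, "item").text = part
    return lst


def render(lst, marker="•"):
    """Display concern only: prefix every item with whatever marker is wanted."""
    return "\n".join(f"{marker} {item.text}" for item in lst.iter("item"))


# Hypothetical cross-reference text as it might be typed with bullet points.
lst = to_list_element("cf. lemma A • cf. lemma B • cf. lemma C")

print(ET.tostring(lst, encoding="unicode"))  # the stored structure
print(render(lst, marker="•"))               # displayed with bullets
print(render(lst, marker="-"))               # same content, displayed with dashes
```

The stored data holds only the items; whether they appear with bullets, dashes or letters is decided at display time, and the items remain available for searching and filtering.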

eu-genia commented 2 years ago

(For the moment I have removed the line break after the subheading Cross-references in display, I think it is not needed there)

MagdaKrzyz commented 2 years ago

I can only work with what is there on the visualisation level. Since there is a label Cross-References, I can turn the list of bullet points into an a-b-c list.

eu-genia commented 2 years ago

leave it, it is not so important, we will think about it later. there are no subheadings below cross-references so it does not disturb.

turning compound lists to numbered lists is more important.

MagdaKrzyz commented 2 years ago

ok the app does not work. I get a message :

/db/apps/gez-en/edit/edit.xq exerr:ERROR Could not send message(s)java.net.ConnectException: Connection refused (Connection refused) [at line 225, column 6]

eu-genia commented 2 years ago

I see... I don't know what got broken... Or why would it want to send the message...

eu-genia commented 2 years ago

I have disabled for now the "sending email to the editors" option, which, as it seems, suddenly stopped working. I don't have the time to explore why... If we need it, I will look into it if I ever have the time. Otherwise the app should be back in function.

eu-genia commented 2 years ago

I have tried with the list in https://betamasaheft.eu/Dillmann/lemma/L153553e0e5e0448f9ba1221d2cf70045 but I still have the problem with the cross-references. Here both <C< are actually on the same level; they are not nested in the submeanings, but I am not really sure one can get them nested...

MagdaKrzyz commented 2 years ago

A propos meaningfulness, what happens to a piece of text that is not specifically marked up within an entry? Is it valid? For instance, if there is no translation I write simply "meaning unknown", or, if there is no translational equivalent but a more elaborate description of the meaning it is written down after the pronunciation. Look here at the beginning of Grebaut's entry: "présente dans la langue théologique souvent les mêmes sens que les noms abstraits" https://betamasaheft.eu/Dillmann/lemma/L6eb57b4154f4410c98848053aaaec9ff Can it stay like this?

And one more thing concerning visualisation: to my mind it is not good in terms of the fonts and their sizes. I don't want to burden you with work on DL since this is not your task. Maybe if you have a more relaxed period we could do something about it. For instance, the transliteration font is too big in comparison to the others. The Ethiopic font, I reckon Nyala, is not very readable. Ludolfus or a similar font would be better. But this is for the future.

eu-genia commented 2 years ago

1) Let us not confuse meaningful and valid. Of course it is valid.

2) Font size is easy (the size is the same, it is only bold to make it better visible; of course, if you have a better idea, open an issue and write exactly what you want), but of course we cannot use Ludolfus unless you can guarantee that it is on everyone's computer all over the world.

you must try to understand the difference between an analogue printed book and a digital product, they have absolutely different scopes, tasks and structures

eu-genia commented 2 years ago

OK, nesting works now. Upper case letters were not available on lower levels; they have now been added. https://betamasaheft.eu/Dillmann/lemma/L153553e0e5e0448f9ba1221d2cf70045 is fixed.