Open danbalogh opened 10 months ago
Some relevant bits from #250:
Michaël said it is not a problem to generate a smart display of the language information extracted from the XML, such as "Text in Tamil, with parts in Sanskrit. Translations in English and in French"
Arlo said, "I am fairly sure the matter of redundant representation of language and script metadata was discussed by Adeline, manu and myself when we were working on the template and guide for the metadata spreadsheet, but I don't remember why we accepted/required the redundancy."
Dan said, "I also think that the redundancy was discussed and also have no clear recollection of the details. But I think that back then the idea was to record something slightly different in the metadata, perhaps by allowing a freetext description of the language of the inscription (e.g. "non-Standard Sanskrit" or "Sanskrit with boundary descriptions in Telugu"). That way, the redundancy is only partial. But what we have in the sheets now can be matched 100% to the data encoded in the XML, so the redundancy is a bad thing."
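As a rough illustration of the kind of "smart display" Michaël mentions, here is a minimal Python sketch. It assumes the files are TEI-style XML in which the edition and each translation sit in `div` elements with `type="edition"` / `type="translation"` and carry `xml:lang`; the sample document, the `LANGUAGE_NAMES` table and the function names are all made up for the example and are not taken from the project's code.

```python
import xml.etree.ElementTree as ET

# Assumed namespaces: the standard TEI namespace and the built-in xml: namespace.
TEI = "{http://www.tei-c.org/ns/1.0}"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical lookup table from language codes to display names.
LANGUAGE_NAMES = {"tam": "Tamil", "san": "Sanskrit", "eng": "English", "fra": "French"}


def base_code(tag):
    """Drop any script subtag, e.g. 'tam-Latn' -> 'tam'."""
    return tag.split("-")[0]


def display_name(tag):
    """Return a readable name, falling back to the raw tag if it is unknown."""
    return LANGUAGE_NAMES.get(base_code(tag), tag)


def describe_languages(xml_text):
    """Build a one-line summary of edition and translation languages."""
    root = ET.fromstring(xml_text)
    main, parts, translations = None, [], []
    for div in root.iter(TEI + "div"):
        lang = div.get(XML_LANG)
        if lang is None:
            continue
        if div.get("type") == "edition":
            main = lang
            # Any descendant tagged with a different language counts as a "part".
            for el in div.iter():
                sub = el.get(XML_LANG)
                if el is not div and sub and base_code(sub) != base_code(main):
                    if sub not in parts:
                        parts.append(sub)
        elif div.get("type") == "translation" and lang not in translations:
            translations.append(lang)

    sentences = []
    if main:
        sentence = "Text in " + display_name(main)
        if parts:
            sentence += ", with parts in " + " and ".join(display_name(p) for p in parts)
        sentences.append(sentence)
    if translations:
        names = [display_name(t) for t in translations]
        sentences.append("Translations in " + " and in ".join(names))
    return ". ".join(sentences)


# Made-up sample document, for illustration only.
SAMPLE = """
<TEI xmlns="http://www.tei-c.org/ns/1.0"><text><body>
  <div type="edition" xml:lang="tam-Latn">
    <p>text <foreign xml:lang="san-Latn">svasti</foreign> text</p>
  </div>
  <div type="translation" xml:lang="eng"><p>text</p></div>
  <div type="translation" xml:lang="fra"><p>text</p></div>
</body></text></TEI>
"""

print(describe_languages(SAMPLE))
```

With the sample above this prints `Text in Tamil, with parts in Sanskrit. Translations in English and in French`, i.e. the summary is derived entirely from the XML, with no help from the spreadsheet.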
@danbalogh Here is the latest version of the metadata guide: https://docs.google.com/document/d/1RqePCIm7SOBGl0M_V_q95TU87ogJR-NsZsdRrS0YnGE/edit?usp=sharing It was written to match the latest version of the template.
About the question of redundancy of languages and scripts in the metadata: I don't remember why we chose to keep the mention of languages and scripts both in the metadata and in the edition. If you feel this information is redundant, of course it can be deleted from one of the two. But we need to be sure that this information remains searchable in the database. I don't know how the database will be searched (by fields or by free-text search).
Thanks for sharing the MDT Guide. Are the people who have already recorded metadata aware of the existence of this guide? I'll read and add comments.
On the redundancy issue: I'm not in a position to decide, but the redundancy is definitely there and so far, nobody has provided an explanation why it may be necessary or even useful. If no explanation occurs to anyone, then script and language should be removed from the metadata table. (The XML files may also have script/language information associated with specific parts of the inscription, so they contain more information than the metadata table. It is only the data in the mdt table that can be deleted.) Search with language/script filtering will of course have to be implemented.
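To make the search concern concrete: if the languages are extracted from the XML, filtering does not need the spreadsheet columns. The sketch below is only one possible approach, assuming the extraction step yields a mapping from inscription identifier to a set of language codes; the identifiers, codes and function names are invented for the example.

```python
from collections import defaultdict


def build_language_index(languages_by_inscription):
    """Invert {inscription id: {language codes}} into {language code: {ids}}."""
    index = defaultdict(set)
    for ident, langs in languages_by_inscription.items():
        for lang in langs:
            index[lang].add(ident)
    return index


def filter_by_language(index, *wanted):
    """Return the ids of inscriptions that contain ALL of the wanted languages."""
    matches = [index.get(lang, set()) for lang in wanted]
    return set.intersection(*matches) if matches else set()


# Invented data standing in for what would be pulled out of the XML files.
extracted = {
    "INS0001": {"tam", "san"},
    "INS0002": {"san"},
    "INS0003": {"san", "tel"},
}

index = build_language_index(extracted)
print(sorted(filter_by_language(index, "san")))
print(sorted(filter_by_language(index, "san", "tel")))
```

The same index could be built for scripts; a fielded search and a free-text search could both sit on top of it.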
@alevivier: I understand that you are or will soon be away on fieldwork. This is not urgent and can be addressed once you are back and have time. The matter came up in another discussion (see #250 for precedents) and we wanted to keep it in sight. Also, please assign anyone else who should be involved in this. I also have a number of other concerns about our metadata sheets, which I believe more and more people are using, yet I know of no instructions for the proper way of filling in many of the cells, and several details don't work out as well as they could. I think a guide should be drafted for the metadata sheets (this one seems very much out of date), and the fields and vocabularies need to be finalised before too many people start filling them out. I'm happy to be involved in that discussion and just as happy to be left out of it so long as it takes place.
The question now is: what purpose does it serve to have entries for language and script in the metadata spreadsheets? In their current state, the fields only contain information that is also encoded in the XML files and can be pulled from there (#250), so recording the same information redundantly in the metadata seems to offer no advantage while creating space for human error.
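To gauge how much "space for human error" the duplication creates before anything is deleted, the two sources could be compared. This is a hedged sketch only: it assumes each sheet row has an `id` and a free-text `language` cell, and that the XML-derived languages are available as a mapping keyed by the same id. The column names, ids and values are hypothetical.

```python
def normalise(cell):
    """Split a cell such as 'Sanskrit; Telugu' into a set of lowercase names."""
    pieces = cell.replace(",", ";").split(";")
    return {piece.strip().lower() for piece in pieces if piece.strip()}


def find_mismatches(sheet_rows, xml_languages):
    """Report, per inscription id, languages found in only one of the two sources."""
    mismatches = {}
    for row in sheet_rows:
        ident = row["id"]
        in_sheet = normalise(row.get("language", ""))
        in_xml = {name.lower() for name in xml_languages.get(ident, set())}
        if in_sheet != in_xml:
            mismatches[ident] = {
                "sheet_only": sorted(in_sheet - in_xml),
                "xml_only": sorted(in_xml - in_sheet),
            }
    return mismatches


# Invented rows and XML-derived values, for illustration only.
sheet_rows = [
    {"id": "INS0001", "language": "Tamil; Sanskrit"},
    {"id": "INS0002", "language": "Sanskrit"},
    {"id": "INS0003", "language": "Sanskrit"},
]
xml_languages = {
    "INS0001": {"Tamil", "Sanskrit"},
    "INS0002": {"Sanskrit"},
    "INS0003": {"Sanskrit", "Telugu"},
}

print(find_mismatches(sheet_rows, xml_languages))
```

Here only `INS0003` is reported, because the XML records Telugu while the sheet does not. An empty result across the real data would support the claim that the sheet columns add nothing beyond the XML.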