BetaMasaheft / Documentation

Die Schriftkultur des christlichen Äthiopiens: Eine multimediale Forschungsumgebung

Missing foliation in MDA003 #2390

Closed DenisNosnitsin1970 closed 3 days ago

DenisNosnitsin1970 commented 1 year ago

I have noticed that the folio numbers are missing in MDA003, am I right? This has three immediate consequences: 1) orientation in the ms is only possible through reference to "frames"; 2) no professional description can be done without folio numbers, and further use of this material will be very difficult; 3) if the person who digitized the manuscript does not foliate it, it is unlikely that anyone else will. For those who digitize manuscripts now, in our days, and submit their images to BM for description and uploading, I suggest that they foliate them. If this could not be done in the field, the numbers can be inserted into the images manually, using simple image-processing software. This is not difficult, it is for our common benefit, and it will be very helpful.

eu-genia commented 1 year ago

Unfortunately EAP and Abennat manuscripts were not foliated. I tried to insist with @MershaMengistie on the necessity of foliation, but unfortunately this could not be implemented. One could possibly draw numbers manually in the images, but this is a task someone has to do; I cannot do it.

DenisNosnitsin1970 commented 1 year ago

I understand. If I open the description https://betamasaheft.eu/manuscripts/MDA001/main I see the folio numbers there, but they are not in the images. The user has to work out the folio number from the frame number, and towards the end they do not coincide, or the description is not complete; it is difficult to tell without the folio numbers. And this is a very simple manuscript. Multiplying such descriptions is not the best practice; I think we should discuss this.

CarstenHoffmannMarburg commented 1 year ago

I agree that it is a significant burden to work with a manuscript that is not foliated. It happens again and again that the correct folio number must be counted or reckoned anew, and errors occur very easily. I do not know how much time it takes to insert numbers into the scan images, but I guess it is similar to the time one loses by confusing the foliation when encoding such a manuscript.

CarstenHoffmannMarburg commented 1 year ago

We will talk in an informal web meeting on Friday about our agenda for the upcoming weeks. We will probably find a solution then.

MershaMengistie commented 1 year ago

Sorry. It was not actually a requirement of the BL to foliate mss and, as Zhenia said, they were not foliated. As to the Abennat project: there are of course mss which were foliated by the team following the direction given by Zhenia. For those which were not foliated (images of the Abennat project), let us devise a mechanism to add page numbers to the images. Tsehay and Senkoris can work on that. For the rest, it would be great if we could have digitization guidelines for the house :)

eu-genia commented 1 year ago

From the COMSt Handbook: [two images, not preserved]

MershaMengistie commented 1 year ago

Got it, Zhenia! I didn't know that we apply what is in COMSt. We pencilled page numbers on those mss which were owned by cooperative scholars. The majority do not allow anything to be done with their manuscripts. Anyway, we did try and will continue to do so. Foliation should be one of the points to be dealt with and agreed upon.

Addisie commented 1 year ago

As you all said, it is very good to foliate the manuscript before we digitize it. But we should still consider that there are people and organisations who do not allow us to foliate their manuscripts, and as researchers we should respect their wishes. In this case, we can foliate the manuscript digitally on the image files; that is what I did, in my limited experience.

eu-genia commented 1 year ago

@MershaMengistie the ones that I saw are not foliated digitally. You renamed the image files, which, as I had told you, is quite useless (and actually disruptive: you should leave the filenames as they are, as we get problems with sorting when I do the processing; we already had an exchange about this problem).

Digital foliation means inserting visible numbers into the images, so that on opening an image you can see which page you are on. I do not know if @Addisie did it in her image copy (?); digital foliation is not virtual, it is physical. @Addisie, if you have sets of images where you inserted numbers, please share them and I can use them on the server.

When pencil numbers are not allowed, one can either

(1) put a small note with a pencilled number next to the colour ruler (the best and easiest option, as I had tried to explain before, since it requires no post-processing), or

(2) process the images manually, inserting a visible number into each (we should also decide, however, whether we start foliation from 1 as Ethio-SPaRe did, including the protective quire in the numbering, or start from the main text; this decision has never been taken AFAIK).

Unfortunately this cannot be done fully automatically, as one runs the risk that possible double shots get continuous numbering or that skipped shots are ignored, so foliation must always be done manually for each page.
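To illustrate why a purely automatic count goes wrong, here is a minimal Python sketch. It rests on assumptions that are not stated in this thread: each frame shows exactly one page, the sequence starts on a recto, and the function name `assign_folios`, its `skip` list and the frame names are all hypothetical. The point is only that the list of double shots and non-page frames has to come from a person looking at the images.

```python
def assign_folios(frames, skip=(), start=1):
    """Map each frame to a folio label such as '1r' or '1v'.

    Assumes one page per frame, beginning with a recto. `skip` holds the
    frames a person has identified as double shots or non-page images
    (binding, colour chart); they receive no label.
    """
    skipped = set(skip)
    labels = {}
    page = 0  # number of real pages labelled so far
    for frame in frames:
        if frame in skipped:
            labels[frame] = None
            continue
        folio = start + page // 2
        side = "r" if page % 2 == 0 else "v"
        labels[frame] = f"{folio}{side}"
        page += 1
    return labels


# Hypothetical frame names; 'IMG_0003' is a second shot of the same page.
frames = ["IMG_0001", "IMG_0002", "IMG_0003", "IMG_0004"]

naive = assign_folios(frames)
checked = assign_folios(frames, skip=["IMG_0003"])

print(naive["IMG_0004"])    # label if the double shot goes unnoticed
print(checked["IMG_0004"])  # label once the double shot is excluded
```

Without the manually supplied `skip` list, every frame after the double shot is shifted by one page, which is the kind of error described above. The `start` parameter is where the still open decision (counting from the protective quire or from the main text) would enter.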

eu-genia commented 1 year ago

I have now added small numbers to the MDA003 images (you might need to clear your cache to see the change in the viewer), but these are automatic, so there is no guarantee that they are correct.

MershaMengistie commented 1 year ago

Zhenia: I guess we should not continue talking about a project which was done 13 years ago and in another program framework. Since there was no direction from the BL that I should foliate the andemta mss before documentation, I documented them all as they are. I think we sorted out all the problems pertinent to it, except a couple of issues that should be examined [but ONLY] when I can access my other external drive. And please be informed that I didn't foliate a single ms digitally. I am more focused on the ongoing project though! And your notes on pencilling were welcome, and again welcome.

MershaMengistie commented 1 year ago

Again, if it helps: the EAP asked me to include a checksum manifest for each digital folder. That I did, and it is found in each folder of EAP 336. I guess no error was detected either by myself or by them. As far as I remember, no one from the EAP contacted me about an issue of that kind.

eu-genia commented 1 year ago

Dear Mersha, this unfortunately is not helpful at all for cataloguing. Just as each image set must be associated with a unique shelfmark/metadata set, it is important that everyone looking at a page knows which page number they are looking at, and when a manuscript description refers to a folio it must be clear which folio is meant. A checksum manifest confirms that the images delivered are the images declared in the folder, complete and unaltered. It is not a replacement for foliation and no guarantee that there are no missed shots, as we have no description to rely on; the number of TIFF images is clearly not the same as the number of pages, as many of the images are e.g. of the binding. A checksum is a purely technical parameter that has nothing to do with scholarly digitization for cataloguing. Let us try to do it better in the future. For now I can offer the following to all cataloguers who encounter a manuscript that is not foliated at all: ask me for the image set and insert the numbers manually; I will then reconvert the images and put them back on the server (even if in some cases we might get some quality loss).
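A minimal Python sketch of what a checksum check actually establishes. The manifest format is an assumption (one `<hex digest>  <filename>` pair per line, as written by tools such as `sha256sum`); the manifests delivered to the EAP may use a different algorithm or layout, and the names `verify_manifest` and `manifest.sha256` are hypothetical.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path, algorithm: str = "sha256") -> str:
    """Return the hex digest of a file, read in chunks to spare memory."""
    h = hashlib.new(algorithm)
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def verify_manifest(folder: Path, manifest_name: str = "manifest.sha256"):
    """Compare the files in `folder` with an assumed '<digest>  <name>' manifest."""
    listed = {}
    text = (folder / manifest_name).read_text(encoding="utf-8")
    for line in text.splitlines():
        if not line.strip():
            continue
        digest, name = line.split(maxsplit=1)
        listed[name.strip().lstrip("*")] = digest.lower()

    present = {
        p.name for p in folder.iterdir()
        if p.is_file() and p.name != manifest_name
    }
    return {
        "missing": sorted(n for n in listed if n not in present),
        "unlisted": sorted(present - set(listed)),
        "corrupted": sorted(
            n for n in listed
            if n in present and file_digest(folder / n) != listed[n]
        ),
    }
```

A clean report from such a check means only that the listed files arrived unchanged. It cannot tell whether a page was never photographed, was photographed twice, or which folio a given image shows.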

MershaMengistie commented 1 year ago

So you have got a solution now. Excellent. ጀግኒት።

eu-genia commented 1 year ago

This is just an emergency measure for now, which requires a lot of additional work time (for the cataloguer: open each image, insert the number, save, close; and then reprocess the whole batch: transforming a set of images for the IIIF server, even automatically, takes between 30 minutes and 1 hour for each single manuscript).

To give an example, so that it is clear to all:

In Abennat we now have e.g. the folder GBM001. Let us look at the file called GBM001 (9).JPG.

When we open the image, what we see is: [image]

There is clearly no indication that this is f. 9r. It could be any folio.

So at this point one should manually insert the number somewhere, like this: [image]

This would not be necessary if, when digitizing, even when pencil marks are not allowed, the digitizer simply placed a small note with the number next to the book and photographed them together.

(And again a side note on the filename: GBM001 (9).JPG is a file name that should never be assigned, as (1) it contains a blank space, which many scripts do not handle well, and (2) the number "9" is a one-digit number. For any ms with more than 9 folia, plain alphabetical ordering by filename gives the order 1, 10, 100, 101, ..., 109, 11, 110, 111, ..., 119, 12, 120 etc., with 9 finally coming after 89. If one does want for some reason to rename the files, one should always use as many digits as the highest number in the sequence has, so in this case 001, 002, 003, 004 etc.; only then will 009 be in the right place. The best is not to rename files at all and to keep the names assigned by the camera, provided they were photographed sequentially. I rename them automatically anyway when converting, but I can only do that easily if the sequence is right; otherwise I must spend hours of additional work renaming manually to get the sequence right.)
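The sorting problem is easy to reproduce. A short Python sketch, in which the helper `padded_name` and the underscore naming pattern are only illustrative and not the convention used on the server:

```python
def padded_name(prefix: str, number: int, highest: int, ext: str = ".JPG") -> str:
    """Build a filename without spaces, zero-padded to the width of `highest`."""
    width = len(str(highest))
    return f"{prefix}_{number:0{width}d}{ext}"


numbers = [1, 2, 9, 10, 11, 89, 90, 100, 110]

# Names as in the example above: a blank space and no zero-padding.
unpadded = sorted(f"GBM001 ({n}).JPG" for n in numbers)

# The same sequence with three digits throughout.
padded = sorted(padded_name("GBM001", n, max(numbers)) for n in numbers)

for name in unpadded:
    print(name)  # 1, 10, 100, 11, 110, 2, 89, 9, 90
for name in padded:
    print(name)  # 001, 002, 009, 010, 011, 089, 090, 100, 110
```

Plain string sorting compares the names character by character, so "10" sorts before "2"; with a fixed number of digits the alphabetical order and the numerical order coincide.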

I hope that with all this we shall get better.

Addisie commented 1 year ago

  1. Yes, I digitally foliated my PDF manuscripts, which I will send to Zhenia. I can only send you MDA001, MDA002 and MDA005, because I don't have the PDF file for MDA003.
  2. I recorded all this information in the XML description, in the foliation element.
  3. But I still had to deal with many problems, because after my foliation I found that some pages had been photographed twice. I had to correct the foliation. The foliation and correction took a lot of time.

MershaMengistie commented 1 year ago

Addisie, good that you have finally submitted those three mss. ጎበዝ። I also noticed that you have dealt with so many problems. The important thing is that you learned through the process. That was the purpose when I gave you those randomly selected PDF texts: for you to start exercising. I don't know which ones you did. I look forward to seeing them. So here you are now, after almost two years of exercising.

eu-genia commented 1 year ago

Thank you @Addisie, perfect. This is exactly why automatic foliation is impossible. I have now inserted numbers in MDA003; please shout if these must be corrected! I have also added a short subpage in the Guidelines (https://betamasaheft.eu/Guidelines/?id=images, scroll down to Digitization); I hope these are easy to follow.

eu-genia commented 1 year ago

@DenisNosnitsin1970 @SophiaD-M @thea-m please do feel free to correct/adjust the Digitization section of the Guidelines.

thea-m commented 1 year ago

@SophiaD-M, @Addisie and I are in the process of compiling a detailed handout on digitization for the summer school (based on a previous presentation by @SusanneHummel), under the supervision of @DenisNosnitsin1970. When we are done and have also tested it in the classes, we can see how best to integrate it into the guidelines; it is of course great to make this information available.

eu-genia commented 1 year ago

Thank you! We can put the detailed handout under Training materials and link from the short Guidelines section?

CarstenHoffmannMarburg commented 1 year ago

Contrary to what I said in the latest web meeting, all manuscripts from the Florentine Biblioteca Laurentiana do have foliation marks. I have just checked them.

CarstenHoffmannMarburg commented 1 year ago

I have turned to encoding Ṭānāsee 1, because this manuscript is very relevant to my research at the moment. I have seen that there are foliation marks, inserted on small pieces of paper placed on the manuscript during the digitization process. Alas, only every fifth folio is marked like this, and for the last folios (after f. 200) the inserted foliation marks are incorrect (mark 205 is in fact on folio 204, and so on till the end). This confused the cataloguer Ernst Hammerschmidt, who did not give correct folio numbers for these folios, and it will confuse very many users even if I describe it in the foliation record. Do you think it is worth the time to insert folio numbers manually into the pictures, or does it take too much time and energy to do this for all the manuscripts with distorted foliation marks?