Closed pstroe closed 2 years ago
Hello, thank you again for your enthusiasm and these proposals. The "htr-united.yml" file in htr-united/ should only be modified via GitHub Actions. To add your metadata, you instead need to create one file per dataset within a folder dedicated to the project(s), inside htr-united/catalog/.
For example, the description you added for "Gwalther HTR GT" should go in a file named gwalther-htr-gt.yml inside a folder named bullinger-digital/ (I got this info from the Zenodo repo, maybe you don't want to mention the project?) inside catalog/.
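To make the expected layout concrete, here is a minimal Python sketch that derives the catalog path for a dataset. The `slugify` rule (lowercase, hyphen-separated) and the project name "Bullinger Digital" are assumptions inferred from the example names above, not an official HTR-United convention, and both helper functions are hypothetical.

```python
import re
from pathlib import PurePosixPath


def slugify(name: str) -> str:
    """Lowercase the name and join its alphanumeric runs with hyphens.

    Assumed naming rule, inferred from "gwalther-htr-gt.yml".
    """
    return "-".join(re.findall(r"[a-z0-9]+", name.lower()))


def catalog_path(project: str, dataset: str) -> PurePosixPath:
    """Build htr-united/catalog/<project>/<dataset>.yml (hypothetical helper)."""
    return PurePosixPath(
        "htr-united", "catalog", slugify(project), slugify(dataset) + ".yml"
    )


print(catalog_path("Bullinger Digital", "Gwalther HTR GT"))
```

The point is only that each dataset gets its own file under a per-project folder; the htr-united.yml file itself is left untouched.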
If you are not sure how to do this, you can:
Let me know if you need more explanation, and of course if you found it was not clear enough, tell us so we can improve the instructions! :)
I don't know if this is normal but the files seem to be empty. If you need some help, we can apply some changes ourselves :) I just recorded a demo video (no sound) if that can be helpful (we are still learning!)
Thank you all for replying to the pull request. I'm now correcting the issue. I think I just pasted the metadata into the wrong file. 10 minutes and we should be good to go. By the way, the download feature does not work for me: it redirects me to the start of the page (this is in the Google Chrome browser).
i think now it should work. could you please try again?
There is a small formatting issue: the description should look like this (we are fixing this in a current pull request to the form):
description: >
  This is ground truth for Rudolph Gwalther’s (1519-1586) handwriting taken from his book "Lateinische Gedichte", where he accumulated writings between 1540 and 1580.

  Data collection and ground truth creation:

  At the time we collected the data, we found 150 images with corresponding transcriptions by Peter Stotz on e-manuscripta (reference: Gwalther, Rudolf: Lateinische Gedichte. Zürich, 1540-1580. Zentralbibliothek Zürich, Ms D 152, https://doi.org/10.7891/e-manuscripta-26750 / Public Domain Mark). We removed 8 images with too many corrections or vertical texts. Next, we uploaded the images into the Transkribus platform, applied the line recognition tool and manually copied the transcribed text lines into the recognised line boxes. During this process, we made some corrections, which were mainly due to inconsistencies in punctuation and capitalised letters.
The volume key is required. I took the liberty to run HUMGenerator on your data to get the numbers :)
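For reference, the description formatting mentioned above (a YAML folded block scalar, `description: >`, with every content line indented) can be sketched in Python. This is only an illustration under assumptions: `folded_description` is a hypothetical helper, not part of the HTR-United form or HUMGenerator, and the two-space indent and 78-character width are arbitrary choices.

```python
import textwrap


def folded_description(text: str, width: int = 78, indent: str = "  ") -> str:
    """Render text as a YAML folded block scalar under the `description` key.

    Paragraphs are separated by blank lines in the input and stay separated
    by blank lines in the output; every content line is indented.
    """
    # Normalise whitespace inside each paragraph, dropping empty paragraphs.
    paragraphs = [" ".join(p.split()) for p in text.split("\n\n") if p.strip()]
    blocks = [
        textwrap.fill(
            p,
            width=width,
            initial_indent=indent,
            subsequent_indent=indent,
            break_long_words=False,   # keep long URLs on one line
            break_on_hyphens=False,   # do not split DOIs or date ranges
        )
        for p in paragraphs
    ]
    return "description: >\n" + "\n\n".join(blocks) + "\n"


sample = (
    "Data collection and ground truth creation:\n\n"
    "We removed 8 images with too many corrections or vertical texts."
)
print(folded_description(sample))
```

The output can be pasted into the dataset's .yml file; the other required keys, such as volume, still have to be filled in separately.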
Thanks and sorry for the multiple bugs :) We are gonna fix the form accordingly !
first: Gwalther HTR GT second: NZZ GT for printed newspapers