nathangibson closed this 1 month ago
Thanks so much, and apologies that it took me a while before I saw your reply!
Thank you for this submission. Is it a work in progress or are you trying to submit it as is?
We would like to do more work on it but think it is already useful.
There are several problems which need to be fixed before the entry can be added to the catalog.
* you need to provide a list of authors
Done (see the above merges).
* it would be helpful to provide a more precise description of the dataset so that potential re-users can understand what it contains, especially since the images are not freely available for part of the dataset (if I understand your documentation correctly)
Will work on this -- basically explaining the image rights?
* are the transcriptions really spanning from 900 to 1900?
Yes, although there are only a few pages of the later material.
* I think the rules listed in your transcription convention could be pasted in the "transcription guidelines" field (but this is something I can fix).
Done.
That being said, my main issue is that I am not able to load the dataset into eScriptorium. I get the errors shown below when I try, which might be because the value in "fileName" does not match the names of the image files. I tried on two instances of eScriptorium (v0.13.8b and v0.13.4b) with the same result. Did you try to import them into eScriptorium? Which version of eScriptorium did you use to export them? Did you generate them all on the command line with Kraken? If so, with which version?
Sorry, I think the issue was that I changed the filenames after download without realizing this would break the METS import. I've corrected this now (e.g. https://github.com/biblia-arabica/academies/tree/main/htr/ground-truth).
Another main issue was that it wasn't clear where the ground truth was. I've restructured the repository to make this clearer. If you think https://github.com/biblia-arabica/academies/tree/main/htr/ground-truth is in order, I will do the same for the other manuscripts. Thanks for your input!
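For anyone hitting the same import problem, here is a minimal sketch of a pre-import check for the "fileName" mismatch discussed above. It assumes ALTO-style XML files that reference their image in a `fileName` element and that the images sit in the same folder as the XML. The function name `find_filename_mismatches` and the folder layout are hypothetical, not part of eScriptorium or Kraken.

```python
from pathlib import Path
import xml.etree.ElementTree as ET


def find_filename_mismatches(export_dir):
    """Return (xml_name, referenced_image) pairs whose image file is missing.

    Assumption: each XML file names its image in a <fileName> element and
    the image is expected in the same directory as the XML file.
    """
    export_dir = Path(export_dir)
    present = {p.name for p in export_dir.iterdir() if p.is_file()}
    mismatches = []
    for xml_path in sorted(export_dir.glob("*.xml")):
        root = ET.parse(xml_path).getroot()
        for elem in root.iter():
            # Compare local names only, so any ALTO namespace version matches.
            if elem.tag.rsplit("}", 1)[-1] == "fileName":
                referenced = (elem.text or "").strip()
                # Only the base name is compared, in case a path was stored.
                if referenced and Path(referenced).name not in present:
                    mismatches.append((xml_path.name, referenced))
    return mismatches
```

Calling `find_filename_mismatches("htr/ground-truth/some-manuscript")` (a placeholder path) on a local checkout returns an empty list when every referenced image is present under the name recorded in the XML.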
@alix-tz Can you have a look?
Ok, we are good now I believe!
I'm sorry @nathangibson that this took so long. I hadn't realized that you had updated the metadata, because the changes weren't reported in the file attached to this PR. That shouldn't have blocked us from adding the description to the catalog, though.
Thank you very much for your contribution!
@alix-tz Thanks for this! And apologies, I'm not familiar with the process so I didn't realize about the file attached to the PR. I appreciate your including us!
Hello,
Thank you for this submission. Is it a work in progress or are you trying to submit it as is?
There are several problems which need to be fixed before the entry can be added to the catalog.
That being said, my main issue is that I am not able to load the dataset into eScriptorium. I get the errors shown below when I try, which might be because the value in "fileName" does not match the names of the image files. I tried on two instances of eScriptorium (v0.13.8b and v0.13.4b) with the same result. Did you try to import them into eScriptorium? Which version of eScriptorium did you use to export them? Did you generate them all on the command line with Kraken? If so, with which version?
Note that some of the errors are expected, since I didn't load all the images.