Transkribus / TranskribusCore

Note: the repo has been moved to https://gitlab.com/readcoop/Transkribus/TranskribusCore
GNU General Public License v3.0

TrpPage.getTranscripts() does not return old versions of a page #12

Status: Closed (shomauno closed this issue 7 years ago)

shomauno commented 7 years ago

There is no obvious button in the Transkribus GUI to compute accuracy between two versions of a page.

While attempting to gather two different transcription versions of my page and run the comparison using the Transkribus code directly, I found that the getTranscripts() method did not return more than one transcript per page, even though I had multiple versions.

If this is an error in my understanding of the program, please advise.

jkloe commented 7 years ago

Regarding the UI: first choose the reference transcript version using the "Choose..." button, then choose the hypothesis version the same way. Then click the "Compare" button below, and the CER / WER (character / word error rate) will be shown in a new dialog.
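For readers who want to compare two transcript versions in their own code rather than in the dialog, here is a minimal sketch of how CER and WER are commonly defined: the edit distance between reference and hypothesis, divided by the reference length. This is not necessarily how Transkribus computes its figures; the class name ErrorRateSketch and its methods are hypothetical, and details such as text normalization and tokenization are simplifying assumptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of TranskribusCore.
public class ErrorRateSketch {

    // Levenshtein distance between two token sequences, using two DP rows.
    static <T> int editDistance(List<T> ref, List<T> hyp) {
        int[] prev = new int[hyp.size() + 1];
        int[] curr = new int[hyp.size() + 1];
        for (int j = 0; j <= hyp.size(); j++) {
            prev[j] = j; // turning an empty reference into j tokens costs j insertions
        }
        for (int i = 1; i <= ref.size(); i++) {
            curr[0] = i; // deleting i reference tokens
            for (int j = 1; j <= hyp.size(); j++) {
                int cost = ref.get(i - 1).equals(hyp.get(j - 1)) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[hyp.size()];
    }

    // Character error rate: character-level edit distance / reference length.
    static double cer(String reference, String hypothesis) {
        return rate(chars(reference), chars(hypothesis));
    }

    // Word error rate: the same ratio over whitespace-separated words (an assumption).
    static double wer(String reference, String hypothesis) {
        return rate(words(reference), words(hypothesis));
    }

    private static <T> double rate(List<T> ref, List<T> hyp) {
        if (ref.isEmpty()) {
            throw new IllegalArgumentException("reference must not be empty");
        }
        return (double) editDistance(ref, hyp) / ref.size();
    }

    private static List<Character> chars(String s) {
        List<Character> out = new ArrayList<>();
        for (char c : s.toCharArray()) {
            out.add(c);
        }
        return out;
    }

    private static List<String> words(String s) {
        String trimmed = s.trim();
        if (trimmed.isEmpty()) {
            return new ArrayList<>();
        }
        return Arrays.asList(trimmed.split("\\s+"));
    }

    public static void main(String[] args) {
        System.out.println(cer("kitten", "sitting"));                          // prints 0.5
        System.out.println(wer("the quick brown fox", "the quick brown dog")); // prints 0.25
    }
}
```

Note that both rates can exceed 1.0 when the hypothesis is much longer than the reference, since insertions count as errors.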

Regarding the TrpPage.getTranscripts() method: which code did you use to get the TrpPage object?

shomauno commented 7 years ago

Thank you! I was using a small computer screen and the icons did not resize, so the problem was that I did not zoom out and open a full screen view. Perhaps icon resizing could be a desirable future goal to ease usability.

I wrote my own code, which opens a TrpServerConn connection and gathers the TrpCollection I want. It then loops through the TrpDocMetadata documents in that collection (using the TrpServerConn's getAllDocs(columnID) method), finds the appropriate TrpDoc using the columnID and documentID, and loops through the TrpPages of that document. When I call getTranscripts() on a page, only the latest version is returned in the List, so I can only access one transcript version.

jkloe commented 7 years ago

Ok, so if you query for the TrpDoc using collection-id and doc-id, you should use the method TrpServerConn.getTrpDoc, whose third parameter is "nrOfTranscriptsPerPage". If you set this to -1, you should get all transcripts. Hope that helps.
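To make the effect of that parameter concrete, here is a small self-contained sketch that models the behaviour described in this thread with stand-in types. It does not call the real Transkribus client: the class TranscriptLimitSketch, its Page type, and loadPage are hypothetical, and the "newest first" ordering and the handling of positive limits are assumptions. Only the two cases discussed here (1 returns the latest version, -1 returns all versions) come from the thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model of the nrOfTranscriptsPerPage semantics; not TranskribusCore code.
public class TranscriptLimitSketch {

    // Stand-in for a page holding the transcript versions that were loaded for it.
    static final class Page {
        private final List<String> transcripts;

        Page(List<String> transcripts) {
            this.transcripts = transcripts;
        }

        List<String> getTranscripts() {
            return transcripts;
        }
    }

    // Mimics loading a page with a per-page transcript limit:
    // a negative value means "all versions", a positive n means "at most the n newest".
    // Assumes the stored versions are ordered newest first.
    static Page loadPage(List<String> storedNewestFirst, int nrOfTranscriptsPerPage) {
        if (nrOfTranscriptsPerPage < 0) {
            return new Page(new ArrayList<>(storedNewestFirst));
        }
        int n = Math.min(nrOfTranscriptsPerPage, storedNewestFirst.size());
        return new Page(new ArrayList<>(storedNewestFirst.subList(0, n)));
    }

    public static void main(String[] args) {
        List<String> stored = Arrays.asList("v3 (latest)", "v2", "v1");

        // Limit 1: only the latest version is visible, as in the original report.
        System.out.println(loadPage(stored, 1).getTranscripts().size());  // prints 1

        // Limit -1: every stored version is visible, so two can be compared.
        System.out.println(loadPage(stored, -1).getTranscripts().size()); // prints 3
    }
}
```

In real client code, the corresponding change would be passing -1 as the third argument when requesting the document, so that getTranscripts() on each page has more than one version to return.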

shomauno commented 7 years ago

Thanks a lot! I had set the argument to 1, which explains my problem.