myrmoteras closed 1 year ago
Unless you specify otherwise on upload (and the webhook does not), the PDF decoder assumes a PDF to be "generic", i.e., its nature has to be determined ... it then looks at the contents, decides between OCRed and born-digital, and decodes it accordingly.
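For illustration only, the dispatch described in the comment above could be sketched roughly like this. This is not the decoder's actual code: `PdfPage`, `detect_pdf_type`, and the majority-vote content check are all hypothetical stand-ins, since the thread does not say how the contents are actually inspected.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PdfPage:
    # Hypothetical per-page summary; the real decoder's inputs are not described in the thread.
    has_full_page_image: bool  # page is essentially one scanned image
    has_text_layer: bool       # page carries embedded text


def detect_pdf_type(pages: List[PdfPage], declared_type: str = "generic") -> str:
    """Return the PDF type to decode as: the declared one, or a guess for 'generic'."""
    # An explicit type given on upload wins; the webhook never sets one.
    if declared_type != "generic":
        return declared_type
    # Assumed heuristic: a scan image with a text layer on most pages suggests OCR.
    ocr_like = sum(1 for p in pages if p.has_full_page_image and p.has_text_layer)
    return "ocred" if ocr_like * 2 > len(pages) else "born-digital"
```

Under these assumptions, a webhook upload always takes the "generic" path, so the outcome depends entirely on what the content check finds.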
That said, it's well worth a shot to try enqueuing OCRed PDFs for decoding via Zenodo and the webhook ... whether or not they get processed any further depends on whether we have a template, though.
@flsimoes let's do some experiments using sister publications of those that the South Africans used, upload them to Biodiv, and see what happens?!
We can get the UUID of the article to be opened and checked from https://tb.plazi.org/GgServer/pdsStats
@myrmoteras what exactly would be sister-publications?
@gsautter what would happen if we tell the SA to upload the files to the Zenodo community https://zenodo.org/communities/biosyslitcontrib/?page=1&size=20 ? For example, does this discover what format the file is and then choose the right way to decode? Would a scanned file be processed properly?