plazi / arcadia-project


webhook use for processing on Frankfurt #225

Closed myrmoteras closed 1 year ago

myrmoteras commented 1 year ago

@gsautter what would happen if we told the SA to upload the files to the Zenodo community https://zenodo.org/communities/biosyslitcontrib/?page=1&size=20 ? For example, does it detect what format the file is in and then choose the right way to decode it? Would a scanned file be processed properly?

gsautter commented 1 year ago

Unless you specify otherwise on upload (and the webhook does not), the PDF decoder assumes a PDF to be "generic", i.e., its nature still has to be determined. It then looks at the contents, decides between OCRed and born-digital, and decodes it accordingly.
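
As a rough illustration of that dispatch, here is a minimal Python sketch. It is not the actual decoder code: `PageSummary`, `choose_decoding`, the type labels, and the "majority of pages are full-page images" heuristic are all hypothetical stand-ins for whatever the decoder really inspects.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class PageSummary:
    # Hypothetical per-page signal: True if the page is essentially one
    # full-page image (typical of a scan with an OCR text overlay).
    has_full_page_image: bool


def choose_decoding(pages: Iterable[PageSummary],
                    declared: Optional[str] = None) -> str:
    """Pick a decoding mode for a PDF.

    An explicitly declared type wins. Otherwise the PDF is treated as
    "generic" and the mode is guessed from the page contents.
    """
    if declared is not None and declared != "generic":
        return declared

    page_list = list(pages)
    scanned = sum(1 for p in page_list if p.has_full_page_image)

    # Assumed heuristic: a strict majority of image-only pages means OCRed.
    if scanned * 2 > len(page_list):
        return "ocred"
    return "born-digital"
```

With no declared type, as is the case for webhook uploads, only the content-based branch would apply.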

That said, it's well worth a shot to try enqueuing OCRed PDFs for decoding via Zenodo and the webhook. Whether or not they get processed any further depends on whether we have a template, though.

myrmoteras commented 1 year ago

@flsimoes let's run some experiments: take sister-publications of those that the South Africans used, upload them to Biodiv, and see what happens.

We can get the UUID of the article to be opened and checked from https://tb.plazi.org/GgServer/pdsStats

flsimoes commented 1 year ago

@myrmoteras what exactly would sister-publications be?