OCR-D / zenhub

Repo for developing zenhub integration
Apache License 2.0

Get started / implement workspace-part #103

Closed joschrew closed 2 years ago

joschrew commented 2 years ago

My user story for the next sprint (21.6 - 1.7.): As a developer, I want to get started implementing the webapi. I want to use my previously created Python project to set up a web service. By the end of the sprint I want to have a usable service for the workspace part of the API. It doesn't have to be production-ready, but it has to be usable for uploading, updating and fetching workspaces as bagits.

AC: A service which can be run and which is able to accept, update and deliver workspaces according to the webapi-spec.
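To make the acceptance criterion more concrete, here is a minimal sketch of the accept/update/deliver cycle, not the actual implementation. The class `WorkspaceStore`, the helper `make_bag`, the in-memory storage and the check for a top-level `bagit.txt` are all assumptions made for illustration; the real service would follow the webapi spec and the OCRD-ZIP rules.

```python
import io
import uuid
import zipfile


class WorkspaceStore:
    """Hypothetical in-memory store for zipped workspaces (illustration only)."""

    def __init__(self):
        self._workspaces = {}  # workspace id -> raw zip bytes

    @staticmethod
    def _check_bag(data):
        # Assumption: a workspace arrives as a zip with bagit.txt at its root.
        if not zipfile.is_zipfile(io.BytesIO(data)):
            raise ValueError("not a zip archive")
        with zipfile.ZipFile(io.BytesIO(data)) as archive:
            if "bagit.txt" not in archive.namelist():
                raise ValueError("missing bagit.txt")

    def upload(self, data):
        """Accept a new workspace and return its generated id."""
        self._check_bag(data)
        workspace_id = str(uuid.uuid4())
        self._workspaces[workspace_id] = data
        return workspace_id

    def update(self, workspace_id, data):
        """Replace the content of an existing workspace."""
        if workspace_id not in self._workspaces:
            raise KeyError(workspace_id)
        self._check_bag(data)
        self._workspaces[workspace_id] = data

    def fetch(self, workspace_id):
        """Deliver the stored zip bytes; raises KeyError for unknown ids."""
        return self._workspaces[workspace_id]


def make_bag(files):
    """Build a tiny zip with a bagit.txt, only for trying out the store."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr(
            "bagit.txt",
            "BagIt-Version: 1.0\nTag-File-Character-Encoding: UTF-8\n",
        )
        for name, content in files.items():
            archive.writestr(name, content)
    return buffer.getvalue()
```

A web framework would map the three methods to the upload, update and fetch endpoints of the workspace part; persistence and full bag validation are left out here on purpose.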

Links:
Repo: https://github.com/joschrew/ocrd-webapi-implementation
API spec (Swagger): https://app.swaggerhub.com/apis/kba/ocr-d_web_api/0.0.1
API spec (raw): https://github.com/OCR-D/spec/blob/master/openapi.yml

joschrew commented 2 years ago

I noticed that the webapi did not define an endpoint to download a workspace, i.e. to download all files as zip or OCRD-Zip after it has been processed. Together with Triet we came to the conclusion that it can stay like this. As a result, it remains in the hands of the implementation projects whether and how a workspace can be downloaded. An endpoint could be added for this if needed, but it is not mandatory.

kba commented 2 years ago

I noticed that the webapi did not define an endpoint to download a workspace

There is one, though: GET /workspace/{workspace-id} will return a previously uploaded/registered workspace. Or do you mean something different?

joschrew commented 2 years ago

GET /workspace/{workspace-id} (link: https://app.swaggerhub.com/apis/kba/ocr-d_web_api/0.0.1#/workspace/getWorkspaces) "only" returns the id and a static description, no data (unless I'm misunderstanding). So users can see whether a workspace and its id are available, but they cannot fetch the workspace's data such as images, METS etc.

kba commented 2 years ago

Oh, right, there's the content negotiation missing. I would expect

curl -H Accept:application/json /workspace/{workspace-id}

to return the description but

curl -H Accept:application/vnd.ocrd+zip /workspace/{workspace-id}

to return the workspace itself. I'll have a look at how that is modelled in OpenAPI.
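The content negotiation described above could be sketched roughly like this. It is a simplified illustration, not how the spec or any implementation does it: the function names `parse_accept` and `choose_representation` are hypothetical, JSON is assumed to be the default when no `Accept` header is sent, and the full precedence rules for media ranges are not implemented.

```python
JSON = "application/json"
OCRD_ZIP = "application/vnd.ocrd+zip"


def parse_accept(header):
    """Split an Accept header into (media_type, q) pairs, highest q first."""
    entries = []
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        media_type, *params = [p.strip() for p in part.split(";")]
        q = 1.0  # a missing q parameter means full preference
        for param in params:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q as "not acceptable"
        entries.append((media_type.lower(), q))
    # sorted() is stable, so entries with equal q keep their header order
    return sorted(entries, key=lambda entry: entry[1], reverse=True)


def choose_representation(accept_header, available=(JSON, OCRD_ZIP)):
    """Return the media type to serve, or None if nothing matches."""
    if not accept_header:
        return available[0]  # assumption: default to the JSON description
    for media_type, q in parse_accept(accept_header):
        if q <= 0:
            continue
        if media_type == "*/*":
            return available[0]
        if media_type.endswith("/*"):
            prefix = media_type[:-1]  # e.g. "application/"
            for candidate in available:
                if candidate.startswith(prefix):
                    return candidate
            continue
        if media_type in available:
            return media_type
    return None  # the caller would answer with 406 Not Acceptable
```

With this sketch, a request sending `Accept: application/json` would get the description, one sending `Accept: application/vnd.ocrd+zip` would get the workspace itself, and a request asking only for an unsupported type would be rejected by the caller.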

joschrew commented 2 years ago

I finished this story for now. The webapi (the part concerning the workspace) can be run in Docker, and I added the file webapi-tests.postman_collection.json to simplify testing it with Postman. See the already provided link to the GitHub repo: https://github.com/joschrew/ocrd-webapi-implementation As the next step for further development I would suggest dealing with the processing/processor part. To do that, I would first try to understand the changes/additions that Triet made for the processors. Afterwards I would think about whether and how they could be integrated into the webapi implementation I did for the workspace part.

joschrew commented 2 years ago

I deployed the service to a test server for now: 141.5.99.53:5050. See this link for an overview of what is implemented: http://141.5.99.53:5050/docs. I added Mehmed's and Triet's public keys, so you should be able to log in via ssh in case something unexpected happens (user is cloud).