ODRC: implement Document creation/upload

Acceptance criteria

Copied/moved from #2

The format is validated when the file is uploaded.
A document (data object) can be created, retrieved, updated and deleted through an API.
Exactly one file (for example a PDF) must be stored with each document.
The creation, modification or deletion of a document is logged (see #16).
The API specification has been updated (ReDoc, Swagger).
The documentation has been updated (Read the Docs).

Document creation + uploads

We always use the file parts (bestandsdelen) mechanism of the Documenten API.

ODRC will proxy to the underlying Documenten API.

Preparing a document upload

The client must first submit the necessary metadata and then, based on the response, cut the file into parts that can be uploaded individually.

The request body schema of a Document POST would look something like:

Document:
  type: object
  required:
    - identifier
  properties:
    publicatie:  # UUID of the publication it belongs to
      type: string
      format: uuid
    identifier:  # the 'primary' identifier
      type: string
    creatiedatum:
      type: string
      format: date
    officieleTitel:  # DiWoo doesn't seem to apply a max length
      type: string
    verkorteTitel:
      type: string
    omschrijving:
      type: string
    # from waardelijst -> expose options in separate endpoint (!)
    # POST (write) operations should be able to just provide the identifier IRI instead of this complex object if ICATT desires this
    bestandsformaat:  
      type: object
      properties:
        identifier:  # IRI from waardelijst
          type: string
          format: uri
        mimeType:  # e.g. application/pdf
          type: string
        naam:
          type: string  # e.g. "PDF"

This translates to a request of ODRC -> Documenten API with schema:

EnkelvoudigInformatieobject:
  type: object
  properties:
    identificatie:  # primary identifier of Document OR let the Documenten API generate one?
      type: string
    bronorganisatie:  # fixed, global configuration parameter in ODRC initially, could become 'smart' in the future
      type: string
    creatiedatum:  # taken from Document.creatiedatum
      type: string
      format: date
    titel:
      type: string
      minLength: 1
      maxLength: 200
    auteur:  # taken from Document.publicatie.organization
      type: string
    status:
      const: definitief  # archiving will move this to gearchiveerd, later
    formaat:  # derived from Document.bestandsformaat
      type: string
    taal:  # derived from Document.taal -> convert to/from ISO 639-2/B
      type: string
      enum:
        - dut
        - eng
    bestandsnaam:  # taken from Document.bestandsnaam
      type: string
    bestandsomvang:  # taken from Document.bestandsomvang, prepares the file parts
      type: number
    indicatieGebruiksrecht:
      const: false
    informatieobjecttype:  # points to /catalogi/api/v1/informatieobjecttypen/:uuid for the Document.publicatie.informatiecategorie
      type: string
      format: uri
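The mapping between the two schemas can be sketched as a small function. This is only an illustration under several assumptions that the issue leaves open: `titel` is taken from `officieleTitel` and truncated to the 200 character limit, `formaat` is taken from `bestandsformaat.mimeType`, `identificatie` reuses the primary identifier, and `Document.taal` arrives as an ISO 639-1 code. The function name and its keyword arguments are hypothetical.

```python
# Assumption: Document.taal is an ISO 639-1 code. The issue only states that
# it must be converted to/from ISO 639-2/B.
ISO_639_1_TO_2B = {"nl": "dut", "en": "eng"}

TITEL_MAX_LENGTH = 200  # maxLength of titel in the target schema


def to_enkelvoudig_informatieobject(
    document: dict,
    *,
    bronorganisatie: str,  # fixed, global configuration parameter
    auteur: str,  # resolved from Document.publicatie by the caller
    informatieobjecttype: str,  # Catalogi API URL resolved by the caller
) -> dict:
    """Build the Documenten API request body from ODRC document metadata."""
    titel = document["officieleTitel"]
    if not titel:
        raise ValueError("titel requires at least one character")
    return {
        # Assumption: reuse the primary identifier instead of letting the
        # Documenten API generate one.
        "identificatie": document["identifier"],
        "bronorganisatie": bronorganisatie,
        "creatiedatum": document["creatiedatum"],
        # Assumption: truncate, because officieleTitel has no max length.
        "titel": titel[:TITEL_MAX_LENGTH],
        "auteur": auteur,
        "status": "definitief",
        # Assumption: formaat is the MIME type of the bestandsformaat.
        "formaat": document["bestandsformaat"]["mimeType"],
        "taal": ISO_639_1_TO_2B[document["taal"]],
        "bestandsnaam": document["bestandsnaam"],
        "bestandsomvang": document["bestandsomvang"],
        "indicatieGebruiksrecht": False,
        "informatieobjecttype": informatieobjecttype,
    }
```

Whether truncating the title is acceptable, or whether the request should be rejected instead, is a design decision this sketch does not settle.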

The Documenten API will return a lock and a list of bestandsdelen for upload. Each bestandsdeel has the shape:

BestandsDeel:
  type: object
  properties:
    url:  # URL to PUT to
      type: string
      format: uri
    volgnummer:
      type: integer
    omvang:
      type: integer
    voltooid:
      const: false
    lock: # ??
      type: string

The ODRC will then expose endpoints for these part uploads so the publication component can upload the parts:

URL: /api/v1/documenten/:uuid/bestandsdelen/:index

A bestandsdeel upload will simply be a multipart/form-data request, with the API key in the auth header.

The request will be transformed by the ODRC, which adds the lock ID & JWT for the Documenten API, and streams the file part down to the Documenten API.

Once all parts are received, we unlock the created document.
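The bookkeeping for "unlock once all parts are received" can be sketched as follows. The `UploadState` class, the `mark_part_uploaded` function and the injected `unlock` callable are hypothetical names for illustration; the sketch also ignores concurrency, which a real implementation has to handle because parts may arrive in parallel.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class UploadState:
    lock: str  # lock ID returned by the Documenten API
    completed: dict[int, bool]  # part index -> fully uploaded?
    unlocked: bool = False


def mark_part_uploaded(
    state: UploadState, index: int, unlock: Callable[[str], None]
) -> dict:
    """Record a completed part and unlock the document after the last one."""
    if index not in state.completed:
        raise KeyError(f"unknown bestandsdeel index: {index}")
    state.completed[index] = True

    finalized = all(state.completed.values())
    if finalized and not state.unlocked:
        # `unlock` stands in for the call to the Documenten API.
        unlock(state.lock)
        state.unlocked = True
    return {"documentFinalized": finalized}
```

Guarding on `state.unlocked` keeps a repeated upload of the last part from unlocking twice.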

Tasks

[x] Implement API endpoint/serializer for POST /api/v1/documenten
- Part of the metadata is stored in our database
- Metadata that the Documenten API can hold is stored there
- Record the requested bestandsdelen from the Documenten API
- Record the lock ID to finalize the upload/creation
- Resolve the internal "Catalogi API" endpoint for the informatieobjecttype
[ ] Implement the API endpoint/serializer for PUT /api/v1/documenten/:uuid/bestandsdelen/:index
- Use the stored lock ID & other configuration/metadata to call the Documenten API /api/v1/bestandsdelen/:uuid
- If all parts are completely uploaded, unlock the document
- Emit documentFinalized: true|false in the response body so the client knows when it can refresh the document resource; file uploads will likely happen in parallel, so any part may be the one that completes the document
- Nice to have: check if we can make use of proxying/streaming (timebox: 1 day)
[ ] The first part of the document contains the magic bytes that can be used to validate against the format in the metadata. Check the upload validation in Open Forms for inspiration (and edge cases).
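A minimal sketch of the magic bytes check on the first part, assuming the declared format is available as a MIME type. The signature table is a small illustrative subset, and the function name is hypothetical; this is not how Open Forms implements it.

```python
from typing import Optional

# Well-known file signatures; illustrative subset only.
MAGIC_BYTES = {
    "application/pdf": (b"%PDF-",),
    "image/png": (b"\x89PNG\r\n\x1a\n",),
    "application/zip": (b"PK\x03\x04",),
}


def matches_declared_format(first_part: bytes, mime_type: str) -> Optional[bool]:
    """Compare the start of the first bestandsdeel with the declared format.

    Returns None when no signature is known for the MIME type, so the caller
    can decide whether to accept or reject unknown formats.
    """
    signatures = MAGIC_BYTES.get(mime_type)
    if signatures is None:
        return None
    # bytes.startswith accepts a tuple of candidate prefixes.
    return first_part.startswith(signatures)
```

One edge case to keep in mind: container formats such as the Office Open XML family are ZIP archives, so the leading bytes alone cannot tell them apart from a plain ZIP file.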

GeneriekPublicatiePlatformWoo / registratie-component

ODRC: implement Document creation/upload #19
