digdir / roadmap

Shared roadmap for Digdir's products.
https://github.com/orgs/digdir/projects/8/

Analyse options for sharing data across instances and applications #5267 #375

Open nkylstad opened 4 months ago

nkylstad commented 4 months ago

Overall description

Description copied from the altinn-studio issue. See the comments and discussions on that issue.

Sometimes the same file attachments are relevant across multiple instances and users. Currently an attachment is tightly bound to a single instance, so identical files must be uploaded again every time the attachment is relevant.

The service owner should be able to upload a file that can be referenced across multiple instances and users. A specific use case:

In scope

Considerations

Out of scope

What's out of scope for this analysis?

Constraints

Constraints or requirements (technical or functional) that affect this analysis.

Analysis

Where to store the files

Alternative A

Within the application owner's storage account [org]altinn[env]strg01 and the container used for app data, [org]-[env]-appsdata-blob-db, a new section is created for shared files (fileShare, or maybe a more suitable name without other connotations). Adding this section above the category folders makes it easier to tell the instance data apart from the shared data. It also avoids a collision if a category name happens to match an appId.

Alternative B

A new storage account, dedicated solely to shared files, is created for each application owner.

Blob container structure for the fileshare:

fileShare
    |-- category
    |   `-- dataGuid
    |       |-- dataBlob
    |       `-- fileInfo.json
    |-- category
    |   `-- dataGuid
    |       |-- dataBlob
    |       `-- fileInfo.json
    `-- nabovarsel
        `-- dataGuid
            |-- dataBlob
            `-- fileInfo.json

We need to have some metadata about the blob as well such as

Where to store metadata about the files

Possible options here:

Storing in storage account

Storing in Cosmos

PostgreSQL

What would separating this logic into a new platform component look like?

With regard to performance and maintainability, introducing a new platform component rather than using Platform Storage wouldn't have any large effect, and the end user will not notice the difference.

A new platform component is introduced: Platform Data / Platform Fileshare / Platform [insert descriptive component name]. The purpose of this component would be to expose endpoints for storing and managing data not directly related to an instance (i.e. not form data or attachments for a single instance).

The platform component would require a link to authentication (well known endpoint + redirect for missing auth) and authorization (PDP).

To keep this platform component open for further extension, we should spend some time figuring out how to create the link to the storage account in a generic way, so that any storage account can be used in the future. For retrieving data, the blob storage path should be helpful. When storing data, we would need to determine the link to a storage account based on something else.

My largest concern about using a new platform component is making sure we don't design it in a way that limits which future cases it can support.

What process is required for an app owner to store the file?

A new endpoint must be exposed in the platform component: POST %/api/v1/data/{org}/{category}

Authorization could entail matching the org claim in the claims principal to the org in the route, or introducing a new scope in Maskinporten. If it should be possible to nest categories, I think the category parameter must be a query param in order to allow "/".
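A minimal sketch of the claim-matching option, written in Python for brevity. The claim key `urn:altinn:org`, the base URL, and all function names are assumptions made for illustration; nothing here is specified by this issue. It also illustrates why a nested category is easier to carry as a query parameter: the client percent-encodes the "/" and the server decodes it again.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

ORG_CLAIM = "urn:altinn:org"  # assumed claim key, not confirmed by this issue


def org_matches_route(claims: dict[str, str], route_org: str) -> bool:
    """Allow the request only when the caller's org claim equals the org in the route."""
    claim_org = claims.get(ORG_CLAIM)
    return claim_org is not None and claim_org.lower() == route_org.lower()


def build_upload_url(base: str, org: str, category: str) -> str:
    """Client side: put the (possibly nested) category in the query string."""
    return f"{base}/api/v1/data/{org}?{urlencode({'category': category})}"


def read_category(url: str) -> str:
    """Server side: decode the category query parameter, restoring any "/"."""
    return parse_qs(urlsplit(url).query)["category"][0]
```

The Maskinporten scope option would replace `org_matches_route` with a scope check; that variant is not sketched here.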

FileInfo is created based on metadata in the request and the blob is stored in the fileshare section of the app owner's storage account. This is a good time to implement a blobService that doesn't hold any logic. The job of composing the storage path should be extracted from the blobClient service.
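The separation described above could look roughly like the following Python sketch. The path layout follows the fileShare structure from Alternative A; the names `compose_blob_path` and `InMemoryBlobService` are hypothetical, and the in-memory dictionary is only a stand-in for a real blob client.

```python
from dataclasses import dataclass, field
from uuid import UUID

FILE_SHARE_ROOT = "fileShare"  # section name taken from Alternative A


def compose_blob_path(category: str, data_guid: UUID, name: str = "dataBlob") -> str:
    """Build fileShare/{category}/{dataGuid}/{name}; reject empty or dot segments."""
    segments = category.split("/")
    if any(s in ("", ".", "..") for s in segments):
        raise ValueError(f"invalid category: {category!r}")
    return "/".join([FILE_SHARE_ROOT, *segments, str(data_guid), name])


@dataclass
class InMemoryBlobService:
    """Logic-free store: it only writes and reads bytes at the path it is given."""

    blobs: dict[str, bytes] = field(default_factory=dict)

    def write(self, path: str, content: bytes) -> None:
        self.blobs[path] = content

    def read(self, path: str) -> bytes:
        return self.blobs[path]
```

Keeping path composition in a plain function means the rules for categories can be tested without touching storage, and the blob service never needs to know about categories or GUIDs.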

Response contains the fileInfo JSON structure with

Managing and querying files in the file share

How to link file to an instance

Authorization on org. The endpoint is exposed through the application:

HTTP POST / HTTP PUT org/app/instances/{instanceId}/data/link?

Query params (required: a + b, or c):

  a) category
  b) dataGuid
  c) blobStoragePath
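The "a + b, or c" rule for the query parameters can be sketched as a small validation step. This is an illustration in Python with hypothetical names (`LinkTarget`, `parse_link_target`). Rejecting requests that mix blobStoragePath with the other two parameters is an assumption; the issue does not say how that case should be handled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LinkTarget:
    category: Optional[str] = None
    data_guid: Optional[str] = None
    blob_storage_path: Optional[str] = None


def parse_link_target(params: dict[str, str]) -> LinkTarget:
    """Accept either category + dataGuid, or blobStoragePath on its own."""
    category = params.get("category")
    data_guid = params.get("dataGuid")
    path = params.get("blobStoragePath")
    if path and not (category or data_guid):
        return LinkTarget(blob_storage_path=path)
    if category and data_guid and not path:
        return LinkTarget(category=category, data_guid=data_guid)
    # Assumption: mixed or incomplete combinations are rejected outright.
    raise ValueError("provide category and dataGuid, or blobStoragePath")
```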

The suggested flow is as follows

  1. Retrieve fileInfo and ensure valid dataType is being linked to the instance.
  2. Check that the upload doesn't break any constraints, e.g. the number of elements of the dataType at the given task.
  3. Generate a dataElement based on known info about the data with a link to the instance, and store it in Platform Storage.
  4. Return info to the client.
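The four steps can be sketched end to end as follows. This is a simplified Python illustration, not the Altinn implementation: the `FileInfo` and `DataElement` shapes, the `max_count_by_type` lookup, and the list standing in for Platform Storage are all assumptions.

```python
from dataclasses import dataclass
from uuid import uuid4


@dataclass
class FileInfo:
    data_type: str
    blob_storage_path: str


@dataclass
class DataElement:
    id: str
    instance_id: str
    data_type: str
    blob_storage_path: str


class LinkError(Exception):
    """Raised when a shared file cannot be linked to the instance."""


def link_shared_file(
    instance_id: str,
    file_info: FileInfo,
    max_count_by_type: dict[str, int],  # data types allowed at the current task
    stored: list[DataElement],          # stand-in for Platform Storage
) -> DataElement:
    # Step 1: the data type must be one the current task accepts.
    if file_info.data_type not in max_count_by_type:
        raise LinkError(f"data type {file_info.data_type!r} is not valid here")
    # Step 2: linking must not exceed the allowed number of elements.
    current = sum(
        1 for e in stored
        if e.instance_id == instance_id and e.data_type == file_info.data_type
    )
    if current >= max_count_by_type[file_info.data_type]:
        raise LinkError(f"too many elements of type {file_info.data_type!r}")
    # Step 3: create a dataElement that points at the shared blob and store it.
    element = DataElement(
        str(uuid4()), instance_id, file_info.data_type, file_info.blob_storage_path
    )
    stored.append(element)
    # Step 4: return info to the client (here: the new dataElement).
    return element
```

No bytes are copied in this flow; the new dataElement only references the blob path that already exists in the file share.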

STEP 1 - Ensure valid data type

Should be handled by the application.

STEP 2 - Check if upload doesn't break constraints

Could be handled at this point before upload is attempted or during validation. As a user I would prefer being notified during upload, but if there are arguments to not stop the upload, this option should also be considered.

STEP 3 - Generate & store new dataElement

This responsibility could lie with the app or with storage. If in the app: the linking endpoint in storage takes a dataElement as input. If in storage: the linking endpoint in storage takes fileInfo / metadata parameters as input.

STEP 4 - Return info to the client

What should be returned? The full instance or the newly created dataElement?

What is the process for unlinking a shared file from an instance?

HTTP DELETE org/app/instances/{instanceId}/data/link?

Query params (required: a + b, or c):

  a) category
  b) dataGuid
  c) blobStoragePath

This deletes the dataElement from Cosmos, but nothing else.

How to retrieve file as an enduser

The existing GET method in the platform component is used, with org, app, instance, and dataGuid as input. Authorization: if the user has access to read the instance and the shared blob is linked to that instance, the user is allowed to read the shared data.
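The authorization rule amounts to two checks, sketched here in Python. The function name and the `instance_elements` mapping (dataGuid to blob storage path for the elements on the instance) are hypothetical; the instance read decision itself would come from the existing authorization (PDP) and is passed in as a boolean.

```python
from typing import Optional


def resolve_readable_blob(
    can_read_instance: bool,
    instance_elements: dict[str, str],
    data_guid: str,
) -> Optional[str]:
    """Return the shared blob's path if the user may read it, otherwise None."""
    if not can_read_instance:
        return None
    # The blob is readable only if a dataElement on this instance links to it.
    return instance_elements.get(data_guid)
```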

How to ensure that shared file is not deleted during cleanup

Check if the file path contains a keyword; if so, do not delete the blob, only delete the dataElement from Cosmos DB.
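A sketch of the keyword check during cleanup, in Python. The keyword `fileShare` is taken from Alternative A, and the two dictionaries are stand-ins for the dataElement store and blob storage. Note that this sketch matches on a whole path segment rather than a plain substring, which is a slightly stricter variant of the check described above.

```python
SHARED_KEYWORD = "fileShare"  # assumed keyword, from the Alternative A layout


def is_shared_blob(path: str) -> bool:
    """True when one of the path segments is the shared-file keyword."""
    return SHARED_KEYWORD in path.split("/")


def cleanup_instance(elements: dict[str, str], blobs: dict[str, bytes]) -> None:
    """Remove every dataElement; remove a blob only if it is not shared."""
    for data_guid, path in list(elements.items()):
        if not is_shared_blob(path):
            blobs.pop(path, None)
        del elements[data_guid]
```

Matching on a segment avoids treating an instance blob as shared just because its file name happens to contain the keyword.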

How to handle in localtest

All of the suggestions above can be supported in localtest. The details are not specified at this time.

Conclusion

Short summary of the proposed solution.

Tasks