Deadwood-ai / deadwood-api

Main FastAPI application for the deadwood backend
GNU General Public License v3.0

Outsource processing to separate processing server #48

Open JesJehle opened 2 months ago

JesJehle commented 2 months ago

We need to think about how to bring the changes we made in the direct-cog branch to the storage server, and how to implement the processing sustainably.

merge

Regarding the merge, I would suggest the following. There are several modifications to the code regarding the data model: https://github.com/Deadwood-ai/deadwood-api/blob/65212d9cd9e61942417e5ae9072057392a220286/src/models.py To be able to use the metadata route, we need to merge these changes. Since we don't want the force-direct-cog route open on the live system, I suggest simply removing it.

implement processing.

Because of the resource constraints of the storage server, it makes a lot of sense to outsource the processing to a dedicated processing server instead of running into all the trouble again. geosense has a local server we could use for this. The server could run all the resource-intensive processes:

All this should be in a separate repo. I would implement it as a package / Docker container scheduled by cron, not a REST API, since the processing server is behind a VPN and the communication needs to be a one-way street from the processing server to the storage server.
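One practical detail of a cron-scheduled job is that a long-running processing run may still be active when cron fires the next one. A minimal sketch of a guard for that, assuming a simple lock file is acceptable; `single_instance`, `run_if_free` and the lock path are hypothetical names, not part of the existing code:

```python
import os
from contextlib import contextmanager


@contextmanager
def single_instance(lock_path: str):
    """Yield True if this run owns the lock, False if another run holds it."""
    try:
        # O_EXCL makes the call fail if the lock file already exists.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        yield False
        return
    try:
        os.write(fd, str(os.getpid()).encode())  # record the owner for debugging
        yield True
    finally:
        os.close(fd)
        os.remove(lock_path)


def run_if_free(lock_path: str, work) -> bool:
    """Run `work()` unless a previous cron invocation is still busy."""
    with single_instance(lock_path) as acquired:
        if not acquired:
            return False
        work()
        return True
```

Note that a crashed run would leave a stale lock file behind, so a real setup would probably also need a timeout or a PID check.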

The storage server would still implement:

The current queuing system could be used to manage the processes. The process would be the following:

  1. Data is uploaded to the storage server via the datasets route.
  2. Metadata is generated via the metadata route.
  3. The storage server adds a process to the current queuing system (Supabase table).
  4. The processing server scans the queue and, if a new process is found, downloads the TIFFs, processes them, and uploads the results again. The logic of the current COG and thumbnail generation can be reused completely, and the existing process states still apply.
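Step 4 could look roughly like the sketch below. It is only an illustration: the status names, the `Task` fields and the `InMemoryQueue` stand-in are assumptions made so the example runs without a database, and `download`, `process` and `upload` are placeholders for the calls against the storage server and the existing COG / thumbnail logic.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical status names; the real queue table may use different states.
PENDING, PROCESSING, DONE, ERRORED = "pending", "processing", "processed", "errored"


@dataclass
class Task:
    id: int
    dataset_file: str
    status: str = PENDING


@dataclass
class InMemoryQueue:
    """Stand-in for the queue table so the sketch runs without a database."""
    tasks: list = field(default_factory=list)

    def claim_next(self) -> Optional[Task]:
        # Take the oldest pending task and mark it so a later scan skips it.
        for task in self.tasks:
            if task.status == PENDING:
                task.status = PROCESSING
                return task
        return None


def run_once(queue, download, process, upload) -> int:
    """Drain the queue once; meant to be called from the cron-scheduled job."""
    handled = 0
    while (task := queue.claim_next()) is not None:
        try:
            local_path = download(task.dataset_file)  # pull the TIFF from the storage server
            results = process(local_path)             # e.g. COG and thumbnail generation
            upload(task.dataset_file, results)        # push the results back
            task.status = DONE
        except Exception:
            task.status = ERRORED                     # keep the failure visible in the queue
        handled += 1
    return handled


if __name__ == "__main__":
    queue = InMemoryQueue([Task(1, "a.tif"), Task(2, "broken.tif")])
    uploaded = {}

    def fake_process(path):
        if "broken" in path:
            raise ValueError("cannot read raster")
        return [path + ".cog.tif", path + ".thumb.jpg"]

    n = run_once(queue, lambda name: "/tmp/" + name, fake_process, uploaded.__setitem__)
    print(n, [t.status for t in queue.tasks])  # 2 ['processed', 'errored']
```

All requests originate from the processing server here, which would keep the communication one-way as described above.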

What do you think? @mmaelicke @cmosig

mmaelicke commented 2 months ago

I think this was laid out pretty well and we can go for it. I have a few comments on point 4, but only minor stuff. I'll call tomorrow.

JesJehle commented 2 months ago

As next steps, I would suggest: