Battery-Intelligence-Lab / galv

An open-source platform for automated storage of battery data with advanced metadata support
https://battery-intelligence-lab.github.io/galv/

EIDF workers #34

Open mjaquiery opened 1 year ago

mjaquiery commented 1 year ago

Parts of Galvanalyser's functionality should be separated out for deployment within the EIDF. A key part of this is monitoring battery data file uploads and processing those files.

This will run on a cluster: each worker checks out a single file, processes it, creates records in the Galvanalyser database, and sends the completed records to the EIDF CKAN service, where they are recorded and labelled as related to the original file records.

This process is the Worker component in the diagram below.

[Diagram: restructure]
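
A minimal sketch of how the check-out step could look for workers sharing a monitored directory. Nothing here comes from the Galvanalyser codebase: `try_checkout`, `run_worker` and the `.lock` naming are hypothetical, and the sketch assumes the filesystem honours exclusive file creation (a database-backed claim may be a better fit for a data lake).

```python
import os
from pathlib import Path
from typing import Callable, List


def try_checkout(data_file: Path, worker_id: str) -> bool:
    """Claim data_file by creating a sibling lock file (hypothetical scheme)."""
    lock = data_file.with_name(data_file.name + ".lock")
    try:
        # O_EXCL makes creation fail if the lock already exists,
        # so at most one worker wins the claim.
        fd = os.open(lock, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as handle:
        handle.write(worker_id)  # record who holds the file
    return True


def run_worker(
    directory: Path, worker_id: str, process: Callable[[Path], None]
) -> List[Path]:
    """Check out and process every unclaimed file; return the files handled."""
    handled = []
    for data_file in sorted(directory.iterdir()):
        # Skip lock files themselves and anything that is not a regular file.
        if data_file.suffix == ".lock" or not data_file.is_file():
            continue
        if try_checkout(data_file, worker_id):
            process(data_file)
            handled.append(data_file)
    return handled
```

Two workers pointed at the same directory would then each process only the files they managed to claim. Releasing stale locks after a crashed worker is left out of this sketch.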

martinjrobins commented 1 year ago

Hopefully there shouldn't be much EIDF-specific stuff. They are still monitoring a directory (it's just that this directory is in the data lake). The "checkout" functionality is new, but I think this would be useful for standard Galvanalyser as well. The "send processed data to CKAN" step is probably the main EIDF-specific part. Can we make this generic somehow (e.g. set the URL and payload in .env)?
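
One way the "URL and payload in .env" idea could be sketched: read both from the environment and fill a payload template per record, so that the CKAN details live in configuration rather than code. The variable names `RESULT_URL` and `RESULT_PAYLOAD` and the function `build_result_request` are assumptions for illustration, not existing Galvanalyser settings, and the request is only built here, not sent.

```python
import json
import os
import urllib.request
from string import Template
from typing import Mapping, Optional


def build_result_request(
    record: Mapping[str, object], env: Mapping[str, str] = os.environ
) -> Optional[urllib.request.Request]:
    """Build a POST request from env settings, or None if no URL is set."""
    url = env.get("RESULT_URL")  # hypothetical variable name
    if not url:
        return None  # nothing configured: plain Galvanalyser behaviour
    # Hypothetical variable holding a JSON template with $placeholders.
    template = Template(env.get("RESULT_PAYLOAD", '{"file": "$file_path"}'))
    # JSON-escape each value so quotes in the data cannot break the payload.
    escaped = {key: json.dumps(str(value))[1:-1] for key, value in record.items()}
    body = template.substitute(escaped)  # raises KeyError on a missing field
    json.loads(body)  # fail early if the template is not valid JSON
    return urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

A deployment inside the EIDF would set both variables to point at its CKAN service; a standard deployment would leave `RESULT_URL` unset and skip the step.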

mjaquiery commented 1 year ago

Yeah, good idea to make this workflow as generic as possible. Currently harvesters send metadata before they send file data (which creates the dataset), so we could move dataset creation earlier as a checkout step and then populate the metadata and data later.
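
A rough sketch of the reordered flow, with dataset creation doubling as the check-out and metadata and data filled in afterwards. `Dataset`, `DatasetStore` and `harvest` are hypothetical stand-ins, and the in-memory dictionary is only a placeholder for the database, where a uniqueness constraint would be needed to make the claim safe across processes.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class Dataset:
    file_path: str
    checked_out_by: str
    metadata: Dict[str, Any] = field(default_factory=dict)
    rows: List[Dict[str, Any]] = field(default_factory=list)


class DatasetStore:
    """In-memory stand-in for the database (hypothetical)."""

    def __init__(self) -> None:
        self._datasets: Dict[str, Dataset] = {}

    def checkout(self, file_path: str, worker_id: str) -> Optional[Dataset]:
        """Create the dataset up front; None means it is already claimed."""
        if file_path in self._datasets:
            return None
        dataset = Dataset(file_path, worker_id)
        self._datasets[file_path] = dataset
        return dataset


def harvest(
    store: DatasetStore,
    file_path: str,
    worker_id: str,
    metadata: Dict[str, Any],
    rows: List[Dict[str, Any]],
) -> bool:
    """Return True if this worker claimed and populated the dataset."""
    dataset = store.checkout(file_path, worker_id)  # 1. create dataset (claim)
    if dataset is None:
        return False
    dataset.metadata.update(metadata)  # 2. populate metadata
    dataset.rows.extend(rows)  # 3. populate data
    return True
```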