populse / capsul

Collaborative Analysis Platform : Simple, Unifying, Lean

Processing database persistence #292

Closed sapetnioc closed 10 months ago

sapetnioc commented 10 months ago

To date, the Capsul processing database (Redis) is held in memory and is completely erased when the Capsul client and workers are stopped. The default configuration should have a persistence strategy so that a user can check workflow status even after all processes have terminated. The idea would be to store the database in $HOME/.config/{app_name}/default.rdb, where app_name is the name given during Capsul object creation (the value is capsul by default).
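To make the proposed location concrete, here is a minimal sketch of how the default path could be built. The function name default_database_path is hypothetical and is not part of Capsul; only the path layout comes from the proposal above.

```python
from pathlib import Path


def default_database_path(app_name: str = "capsul") -> Path:
    """Hypothetical helper: return $HOME/.config/{app_name}/default.rdb.

    The layout follows the proposal in this issue; the helper itself is
    only an illustration and does not exist in Capsul.
    """
    return Path.home() / ".config" / app_name / "default.rdb"
```

With the default app_name, this resolves to a default.rdb file inside the capsul subdirectory of the user's .config directory.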

sapetnioc commented 10 months ago

After making the database persistent by default, it is necessary to have a concise way to create a Capsul instance with a non-persistent database. I propose to add a database_path option to Capsul() that would be a shortcut for:

capsul = Capsul()
capsul.config.databases['builtin']['path'] = database_path

Using an empty path (i.e. Capsul(database_path='')) would mean that no database file must be left on the system (useful for tests, for instance). The first implementation will keep database snapshots but put them in an already existing temporary directory that is linked to the server and destroyed when the server is shut down.
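The empty-path behaviour described above could be sketched roughly as follows. This is an illustration under stated assumptions, not the actual Capsul implementation: the DatabaseLocation class and its shutdown() method are hypothetical, and the sketch creates its own temporary directory, whereas the proposal reuses one that already exists for the server.

```python
import tempfile
from pathlib import Path


class DatabaseLocation:
    """Hypothetical holder for the snapshot file location (not Capsul API)."""

    def __init__(self, database_path=None, app_name="capsul"):
        self._tmp = None
        if database_path is None:
            # Default: persistent file in the user's configuration directory.
            self.path = Path.home() / ".config" / app_name / "default.rdb"
        elif database_path == "":
            # Empty path: keep snapshots in a temporary directory that is
            # removed on shutdown, so nothing is left on the system.
            self._tmp = tempfile.TemporaryDirectory(prefix=f"{app_name}_")
            self.path = Path(self._tmp.name) / "default.rdb"
        else:
            # Explicit path chosen by the user.
            self.path = Path(database_path)

    def shutdown(self):
        """Remove the temporary directory, if one was created."""
        if self._tmp is not None:
            self._tmp.cleanup()
            self._tmp = None
```

In a test, one would create DatabaseLocation(database_path='') and call shutdown() at the end; the directory holding the snapshot no longer exists afterwards.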

sapetnioc commented 10 months ago

Well, making the database persistent was not enough: it was also necessary to keep the content of the database, i.e. the execution reports. Execution engines now have a new persistence boolean option in their config. It is True by default, which means that all execution reports are kept in the database and can be inspected later. When persistent is False, executions are immediately removed from the database when these two conditions are met: