mountetna / magma

Data server with friendly data loaders
GNU General Public License v2.0

Add /update_model endpoint #131

Open graft opened 4 years ago

graft commented 4 years ago

Introduction

Background

The current model development cycle in Magma goes like this:

First we plan:

  1. Add new models or alter existing models in the project repo.
  2. Load the new models into Magma; use the "plan" command to compute a migration based on the difference between the current database schema and the current model definition.

Then we modify the plan, i.e., add or remove atomic database operations:

  1. Check the planned migration for errors (notably, the plan cannot distinguish between "rename column" and "drop column" + "add column"). Rewrite the plan to fix these errors and to accommodate other data-loading tasks; see the sketch after this list.
  2. Run the migration.rb file to update the database.
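
To make the rename/drop ambiguity concrete, here is a hedged sketch of the kind of correction step 1 involves. The table and column names are hypothetical, and Sequel-style migration syntax is assumed for the migration.rb file:

```ruby
# What a diff-based plan might emit when a column was merely renamed:
# it cannot tell a rename from a drop + add (hypothetical table/columns).
Sequel.migration do
  up do
    alter_table(:monsters) do
      drop_column :species          # destroys the existing data!
      add_column :species_name, String
    end
  end
end

# The hand-corrected migration we actually want:
Sequel.migration do
  up do
    alter_table(:monsters) do
      rename_column :species, :species_name
    end
  end
end
```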

Subsequently there is cleanup and deploy:

  1. Validate the model, perhaps by test-loading some data and building views in Timur.
  2. Push code (models + migration changes) to production

Planning strategy vs. atomic operations strategy

There are two broad strategies we might use to expose model-editing functionality:

1) Describe the new model, compute the required differences, and make the appropriate changes. This "plan" methodology allows us to quickly dump in a large template with perhaps hundreds of attributes. However, the plan is not trustworthy. While it lets us sketch in broad strokes, the actual migrations it generates must be validated and corrected before being run. This makes the process above dangerous: it is not guaranteed to avoid accidentally destroying data. The need to validate migrations also slows down the deployment of new models.

2) Allow atomic operations to be executed by the user. This lets us sculpt in details, adding or removing attributes to reshape our existing models. But it would grow tedious to sketch out new projects this way, adding attributes to a model one at a time.

Examples

Here are some scenarios in which we might imagine using these operations:
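
As one hedged illustration (the specific models and attribute names are hypothetical, borrowed from the 'labors' examples used below):

```ruby
# Hypothetical scenario: sculpt the 'labors' project with atomic operations.
actions = [
  # Add a new child model under 'monster'...
  { action: 'add_model', model_name: 'victim',
    parent_model_name: 'monster', parent_link_type: 'collection' },
  # ...then fix a naming mistake on an existing model, without a
  # hand-written migration and without destroying column data.
  { action: 'rename_attribute', model_name: 'monster',
    attribute_name: 'species', new_attribute_name: 'species_name' }
]
```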

The API

Broadly speaking we wish to do BOTH: plan and apply atomic operations. But since both strategies ultimately reduce to executing a series of atomic operations on the data graph, that is what the /update_model API should provide. Any planning services Magma might offer are worth considering in this context, but they might live elsewhere, e.g. in a /plan API or outside Magma entirely. Here we will focus on an API that provides atomic operations to change the data graph.

The operations

The basic operations we might require are:

1) add a model
`{ action: 'add_model', model_name: 'victim', parent_model_name: 'monster', parent_link_type: 'collection' }`

2) remove a model
`{ action: 'remove_model', model_name: 'victim' }`

3) rename a model
`{ action: 'rename_model', model_name: 'victim', new_model_name: 'casualty' }`

4) add an attribute
`{ action: 'add_attribute', model_name: 'monster', attribute_name: 'species', ...required_attribute_params }`

5) remove an attribute
`{ action: 'remove_attribute', model_name: 'monster', attribute_name: 'species' }`

6) rename an attribute
`{ action: 'rename_attribute', model_name: 'monster', attribute_name: 'species', new_attribute_name: 'species_name' }`

7) update an attribute
`{ action: 'update_attribute', model_name: 'monster', attribute_name: 'species', validation: { type: 'Regexp', value: '/^[A-Z][a-z-]+ [a-z-]+$/' }, format_hint: 'Linnean species name, e.g. "Sus scrofa"' }`

8) cast an attribute
`{ action: 'cast_attribute', model_name: 'monster', attribute_name: 'weight', type: 'float' }`

Each of these actions (or its sub-actions, mostly in the case of #5) can be classified as either non-destructive or destructive.

We may add actions in this order (non-destructive first), so we can start fleshing out the migration API without having to deal with issues of data loss.

The controller

The POST from the user looks like:
`{ project_name: 'labors', actions: [ { action_name: 'add_model', ... }, ... ] }`
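
For instance, a minimal sketch of a full request body, combining operations from the list above (it uses the action_name key shown here, and the 'labors' names are the running examples):

```ruby
# Hypothetical POST body for /update_model on the 'labors' project.
payload = {
  project_name: 'labors',
  actions: [
    { action_name: 'add_model', model_name: 'victim',
      parent_model_name: 'monster', parent_link_type: 'collection' },
    { action_name: 'rename_attribute', model_name: 'monster',
      attribute_name: 'species', new_attribute_name: 'species_name' }
  ]
}
```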

The controller will:

1) Examine each of the actions and validate that its required arguments are well-formed.

2) Simulate the serial application of these actions to the data graph, recording the intermediate states.

3) Look for actions that produce incorrect states (e.g., they attempt to reference an attribute after it has been renamed, or they yield an orphaned subgraph, etc.). Collect the errors and report back to the user.

4) If no invalid states are produced, look for destructive-migrating actions. Compute a confirmation hash for each of these actions, collect them, and report back to the user. The user must then re-post the request with the confirmation hashes attached: `actions: [ ..., { action_name: 'remove_model', confirmation: 'd9d9d9d9d' }, ... ]`

5) If all of the destructive actions have been confirmed, apply the actions in order to the data graph. Any actions requiring migrations will perform them. Any data-destroying actions will be sure to archive the destroyed data (described elsewhere).
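
A minimal sketch of that control flow, with all helpers stubbed and every name hypothetical (this is not Magma's actual internals):

```ruby
require 'digest'
require 'json'

# Hedged sketch of the /update_model controller flow. Helper behavior is
# stubbed; only the control flow follows the five steps described above.
class UpdateModelController
  # Assumption: these actions are the destructive ones needing confirmation.
  DESTRUCTIVE = %w[remove_model remove_attribute cast_attribute].freeze

  def update_model(payload)
    actions = payload[:actions]

    # 1) Validate each action's required arguments are well-formed.
    errors = actions
      .reject { |a| a[:action_name] && a[:model_name] }
      .map { |a| "malformed action: #{a.inspect}" }
    return { errors: errors } unless errors.empty?

    # 2-3) Simulate serial application to the data graph, recording
    # intermediate states and collecting invalid ones. (Stubbed here.)
    state_errors = simulate_all(actions)
    return { errors: state_errors } unless state_errors.empty?

    # 4) Destructive actions must carry a matching confirmation hash;
    # otherwise report the expected hashes back to the user.
    pending = actions.select do |a|
      destructive?(a) && a[:confirmation] != confirmation_hash(a)
    end
    unless pending.empty?
      return { confirmations: pending.map { |a| confirmation_hash(a) } }
    end

    # 5) Apply in order; migrations would run here, and data-destroying
    # actions would archive the destroyed data first.
    actions.each { |a| apply(a) }
    { success: true }
  end

  private

  def destructive?(action)
    DESTRUCTIVE.include?(action[:action_name].to_s)
  end

  # Assumption: a stable digest of the action's contents serves as its
  # confirmation hash, so an unchanged re-post confirms exactly that action.
  def confirmation_hash(action)
    Digest::SHA1.hexdigest(
      action.reject { |k, _| k == :confirmation }.sort.to_json
    )[0, 9]
  end

  def simulate_all(_actions)
    [] # stub: assume all intermediate states are valid
  end

  def apply(_action)
    # stub: would mutate the data graph / run a migration
  end
end
```

Under this sketch, the first post of a remove_model action would come back with its confirmation hash; re-posting the identical action with that hash attached is what authorizes the destructive step.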

alimi commented 4 years ago

When updating an attribute, will the payload include all the attributes or just the attributes that are being changed?