orbeon / orbeon-forms

Orbeon Forms is an open source web forms solution. It includes an XForms engine, the Form Builder web-based form editor, and the Form Runner runtime.
http://www.orbeon.com/
GNU Lesser General Public License v2.1
514 stars 220 forks source link

Use relational tables for form data, instead of XML #1571

Open avernet opened 10 years ago

avernet commented 10 years ago

Rationale

Right now, the implementation of the persistence layer for relational databases stores form data and form definitions in the database as XML. This has the benefit of making the database schema fixed: it doesn't need to change as new forms are deployed, or existing forms modified.

We come to this RFE from the requirement to support the flat view feature in databases other than Oracle, as well as to support repeated grids and sections, which are currently not supported on Oracle. The reasoning has been as follows:

  1. We could extend the technique used currently, based on views, to work on DB2 and support repeated grids/sections. However, the same approach wouldn't work on MySQL. This means that we would end up with a feature only supported on some databases. Past experience has shown us that such features don't get properly maintained for a number of reasons. So instead, we want the flat view feature to be designed in such a way that we can offer it on all supported (relational) databases.
  2. MySQL doesn't support function indexes or indexes on XML data, so using a view won't lead to acceptable results. This means that, at least on MySQL, we need to have tables that contain the flattened data, and populate those tables when form data is saved. To do this this taking repeated grids and sections into account, for every form definition and version thereof, we need to have one or more tables. (The number of tables needed for a given form definition has no upper limit, as Form Builder don't limit the level of nesting for repeated sections.)
  3. It would be a shame to have to implement and support two ways in which the flattened data is maintained: one for MySQL and one for Oracle/DB2 (and maybe SQL Server). It is just better to use the same technique for all databases.
  4. For MySQL, we were thinking of maintaining index tables to speedup the summary page (#649). If we have the data already flattened, we don't need to do that, and can do the search on those tables.
  5. If we are storing all the data in flattened tables, it is a waste to also store it in XML. We can avoid this recreating the XML based on the data in the flattened tables. This also has the benefit of making those tables writable form third party applications.

    Steps

Since this is a fairly large task, we want to do it in steps, each step leading to deliverable software that brings value to customers.

  1. On publish, where we now create the view for Oracle, we will create a table. In this first step, we will take versioning into account, but not repeated grids and sections. On save, we will populate that table. Value for customer: getting the current flat view feature on all databases, with support for versioning.
  2. Support repeated grids and sections – We will create the additional tables for repeats on publish, and populate them on save.
  3. Reconstitute the XML based on data in the tables – After this, the tables orbeon_form_data and orbeon_form_data_attach will only be used by Form Builder.
ebruchez commented 10 years ago

Seems reasonable to me.

Since Form Builder still requires the XML-only scheme, we might consider supporting both schemes: XML and fully relational. It is annoying to have to support two, but that might be inevitable given Form Builder. The benefit is that some users might prefer a system which doesn't create new tables all over the place.

evlist commented 10 years ago

This also raises the question of versioning, drafts and auto save... Common designs in RDBMs do not support drafts and versioning (which would make SQL queries complex and slower). Does that mean that these features should not be available when using "flat tables"?

avernet commented 10 years ago

@evlist The plan would be to also support versioning and drafts with flat tables. Drafts are simple: we just need to keep a draft column, as we do now. For versioning, we need to create one flat table per version, and this on publish. It adds a little bit of complexity, but seems perfectly manageable.

ebruchez commented 10 years ago

2014-06-26 brainstorming:

avernet commented 8 years ago

Going through this again: