Support repeats in relational flat views

avernet commented 11 years ago

Right now, this flat view feature is only available for Oracle, and it doesn't support repeats. The current Oracle view is produced from a query of the form:

select
    document_id metadata_document_id, created metadata_created, last_modified metadata_last_modified, username metadata_username
        , extractValue(xml, '/*/personal/first-name')
        "PERSONAL_FIRST_NAME"
        , extractValue(xml, '/*/personal/last-name')
        "PERSONAL_LAST_NAME"
        , extractValue(xml, '/*/personal/phone')
        "PERSONAL_PHONE"
        , extractValue(xml, '/*/personal/number')
        "PERSONAL_NUMBER"
        , extractValue(xml, '/*/personal/type')
        "PERSONAL_TYPE"
from (
    select d.*, dense_rank() over (partition by document_id order by last_modified desc) as latest
    from orbeon_form_data d
    where
        app = 'datadec'
        and form = 'repeat-true'
    )
where latest = 1 and deleted = 'N'

If we want the same technique to work on all relational databases, and in particular MySQL which doesn't know how to index XML data or support function indices, we most likely have to create other tables where the data

ebruchez commented 10 years ago

Nested sections are not supported either.

ebruchez commented 10 years ago

For 4.5, the flat view code has moved to Scala, see FlatView. We now explicitly exclude controls which are within nested sections and repeats.

ebruchez commented 10 years ago

See also #1800 for DB2 and other databases support.

2014-06-26 brainstorming:

we keep storing data in XML
to solve search, we might use Solr (#1801)
we support repeats by creating new tables
we only write or delete from these tables, never read

ebruchez commented 10 years ago

Also support versioning.

evlist commented 6 years ago

+1 from customer (for Postgresql). Any update on this RFE ?

avernet commented 6 years ago

@evlist It's still on our radar, but there is nothing new on this at this point.

acspike commented 6 years ago

I make views similar to the following for each repeat by hand:

CREATE OR REPLACE VIEW orbeon_default."AppNameFormNameGridName" AS 
 WITH a AS (
         SELECT d.document_id AS metadata_document_id,
            d.created AS metadata_created,
            d.last_modified_time AS metadata_last_modified_time,
            d.last_modified_by AS metadata_last_modified_by,
            generate_subscripts(xpath('/*/section//grid'::text, d.xml), 1) AS idx,
            unnest(xpath('/*/section//grid'::text, d.xml)) AS elt
           FROM orbeon_default.orbeon_form_data d,
            ( SELECT max(d_1.last_modified_time) AS last_modified_time,
                    d_1.app,
                    d_1.form,
                    d_1.document_id
                   FROM orbeon_default.orbeon_form_data d_1
                  WHERE d_1.app::text = 'AppName'::text AND d_1.form::text = 'FormName'::text AND d_1.draft = 'N'::bpchar
                  GROUP BY d_1.app, d_1.form, d_1.document_id) m
          WHERE d.last_modified_time = m.last_modified_time AND d.app::text = m.app::text AND d.form::text = m.form::text AND d.document_id::text = m.document_id::text AND d.deleted = 'N'::bpchar
        )
 SELECT a.metadata_document_id,
    a.metadata_created,
    a.metadata_last_modified_time,
    a.metadata_last_modified_by,
    a.idx AS metadata_idx,
    (xpath('//control1/text()'::text, a.elt))[1]::character varying AS control1,
    (xpath('//control2/text()'::text, a.elt))[1]::character varying AS control2
   FROM a;

evlist commented 6 years ago

@acspike Thanks for the example !

hoila commented 6 years ago

Last year a trainee made a java program to create the views for the repeated sections and the repeated grids for oracle. He used the xmltable function. For the primary key we use the id field of the orbeon_form_data and the rownum. The subsections field contains the xml of the repeated grid, in case the repeated section contains also a repeated grid.

The orbeon form builder doesn't permit duplicate control names. So we use in the form for the control names the field names of the table and use then in the view the control names.

Example sql for a repeated section

CREATE OR REPLACE VIEW test_view_repeated_section AS
SELECT 
    rownum AS SECTION_NUMBER,
    x.id AS orbeon_form_data,
    xt."FIELD1",
    xt."FIELD2",
    xt."SUBSECTIONS"
FROM 
    orbeon.orbeon_form_data x,
    XMLTABLE(
        '/form/section-1/section-1-iteration' PASSING x.xml
        COLUMNS
            FIELD1 VARCHAR2(20) PATH 'field1',
            FIELD2 VARCHAR2(20) PATH 'field2',
            SUBSECTIONS XMLType PATH '[not(count()=0)]'
    ) xt,
    (
        SELECT 
            max(last_modified_time) last_modified_time,
            app,
            form,
            document_id
        FROM 
            orbeon.orbeon_form_data d
        WHERE 
            app = 'appname'
            AND form = 'formname'
            AND draft = 'N'
        GROUP BY 
            app, 
            form, 
            document_id
    ) m
WHERE 
    x.last_modified_time = m.last_modified_time
    AND x.app = m.app
    AND x.form = m.form
    AND x.document_id = m.document_id
    AND x.deleted = 'N';

ebruchez commented 6 years ago

+1 from evaluator

ebruchez commented 5 years ago

+1 from evaluator

ebruchez commented 3 years ago

The Multiple File Attachments control is not supported either.

+1 from customer

avernet commented 1 year ago

+1 from customer

obruchez commented 6 months ago

[x] doc

obruchez commented 6 months ago

[x] understand why we have a problem with PostgreSQL 9.5 (which is unsupported since 2021)

avernet commented 6 days ago

Form to test names generated for the views and their columns:

To get the list of the DROP statements, run:

SELECT 'DROP VIEW ' + QUOTENAME(SCHEMA_NAME(schema_id)) + '.' + QUOTENAME(name) + ';'
FROM sys.views 
WHERE name LIKE 'orbeon_f_a_a_1%';

We can get the list of the views and their columns with the following SQL.

SELECT LOWER(o.name) AS view_name,
       LOWER(c.name) AS column_name,
       t.name AS data_type,
       c.max_length
FROM sys.columns c
JOIN sys.objects o ON c.object_id = o.object_id
JOIN sys.types t ON c.user_type_id = t.user_type_id
WHERE o.type = 'V'
  AND o.name LIKE 'orbeon_f_a_a_1%'
ORDER BY o.name, c.column_id;

A case - On 2024.1 with fully-qualified-names set to false:

      view_name      |         column_name         | data_type | max_length 
---------------------+-----------------------------+-----------+------------
 orbeon_f_a_a_1      | metadata_document_id        | nvarchar  |        510 
 orbeon_f_a_a_1      | metadata_created            | datetime2 |          8 
 orbeon_f_a_a_1      | metadata_last_modified_time | datetime2 |          8 
 orbeon_f_a_a_1      | metadata_last_modified_by   | nvarchar  |        510 
 orbeon_f_a_a_1      | s1g1f1                      | varchar   |       8000 
 orbeon_f_a_a_1_s1g2 | metadata_document_id        | nvarchar  |        510 
 orbeon_f_a_a_1_s1g2 | s1g2_repetition             | bigint    |          8 
 orbeon_f_a_a_1_s1g2 | s1g2f2                      | varchar   |       8000 
 orbeon_f_a_a_1_s2   | metadata_document_id        | nvarchar  |        510 
 orbeon_f_a_a_1_s2   | s2_repetition               | bigint    |          8 
 orbeon_f_a_a_1_s2   | s2g3f3                      | varchar   |       8000

B case - On 2024.1 with fully-qualified-names set to true:

           view_name            |         column_name         | data_type | max_length 
--------------------------------+-----------------------------+-----------+------------
 orbeon_f_a_a_1                 | metadata_document_id        | nvarchar  |        510 
 orbeon_f_a_a_1                 | metadata_created            | datetime2 |          8 
 orbeon_f_a_a_1                 | metadata_last_modified_time | datetime2 |          8 
 orbeon_f_a_a_1                 | metadata_last_modified_by   | nvarchar  |        510 
 orbeon_f_a_a_1                 | s1_s1g1f1                   | varchar   |       8000 
 orbeon_f_a_a_1_s1_s1g2         | metadata_document_id        | nvarchar  |        510 
 orbeon_f_a_a_1_s1_s1g2         | s1g2_repetition             | bigint    |          8 
 orbeon_f_a_a_1_s1_s1g2         | s1g2f2                      | varchar   |       8000 
 orbeon_f_a_a_1_s2_s2_iteration | metadata_document_id        | nvarchar  |        510 
 orbeon_f_a_a_1_s2_s2_iteration | s2_repetition               | bigint    |          8 
 orbeon_f_a_a_1_s2_s2_iteration | s2g3f3                      | varchar   |       8000

C case - On 2023.1.5:

   view_name    |         column_name         | data_type | max_length 
----------------+-----------------------------+-----------+------------
 orbeon_f_a_a_1 | metadata_document_id        | nvarchar  |        510 
 orbeon_f_a_a_1 | metadata_created            | datetime2 |          8 
 orbeon_f_a_a_1 | metadata_last_modified_time | datetime2 |          8 
 orbeon_f_a_a_1 | metadata_last_modified_by   | nvarchar  |        510 
 orbeon_f_a_a_1 | s1_s1g1f1                   | nvarchar  |        510

avernet commented 6 days ago

Suggested changes:

Property name: when fully-qualified-names is enabled, names are prefixed with section names, so I think it would make sense to rename this property to prefix-with-section-name.
Prefix all columns: when prefix-with-section-name is set to true, also prefix the column names of the sub tables with the section name.
- Impact in the above examples
  - B case
    - s/s1g2_repetition/s1_s1g2_repetition/
    - s/s1g2f2/s1_s1g2f2/
    - s/s2g3f3/s2_s2g3f3/
  - No impact on A and C cases
Don't include iteration name: see red note in annotated result below

avernet commented 5 days ago

obruchez commented 5 days ago

[ ] rename property?
[ ] prefix all columns?
[ ] do not include iteration name

obruchez commented 3 days ago

A few months later, I don't know why exactly I insisted on including the iteration name in the view/column names. Maybe this came up in a discussion?

The code seems to be wrong anyway, as the iteration name is built from DefaultIterationSuffix instead of being extracted from the definition/binding.

The following code:

val addIterationNameToPath = section && repeated
val iterationNameOpt       = addIterationNameToPath.flatOption(nameToAddToPathOpt.map(_ + DefaultIterationSuffix))
val currentPath            = path ++ nameToAddToPathOpt.toList ++ iterationNameOpt.toList

can simply become:

val currentPath            = path ++ nameToAddToPathOpt.toList

obruchez commented 3 days ago

I'm a bit less certain about the other point (i.e. prefixing all view/column names with the full name/path if fully-qualified-name=true). On the one hand, consistency is good. On the other hand, we will introduce unnecessary long names that were not present before (since we didn't have flat views for repeated grids/sections). And we already have this rather confusing shortening algorithm that produces names that are hard to parse by humans. In your example, we have short section/grid/control names, but in real cases, the names will be longer and will probably have to be shortened.

If we decide to prefix everything if fully-qualified-name=true, we'll have to conditionally call FlatView.relativePath (two locations, one for view names, one for column names). We'll also have to prefix the repetition number column.

obruchez commented 3 days ago

Also, regarding the property name: the idea is that we include the whole "path"/prefix in the names if fully-qualified-name=true. This works for nested sections/grids. I know that in real case scenarios, we probably don't have many situations with nested sections/grids (i.e. more than 2 levels). but prefix-with-section-name seems to imply that only the closest section will be included in the name, whereas all "ancestor" sections are included. Repeated grid names are also included.

orbeon / orbeon-forms

Support repeats in relational flat views #1069