This bug affects the migration of nested metadata of CRIS entites in DS CRIS 2023.01, especially the handling of optional nested metadata fields, e.g. affiliationstartdate or affiliationenddate.
The step entities nested placeholder in Pentaho transformation entity_migration.ktr is responsible for handling optional nested metadata fields.
The result of the underlying SQL query
SELECT DISTINCT
nested_object.parent_id,
nested_object.positiondef,
properties.parent_id as nested_object_id,
--property_def.id AS pdef_id,
--original pdef from where the value is caluclated. Remove because of DISTINCT clause
--property_def2.id AS pdef_id_missing,
properties.visibility,
property_def2.shortname,
property_def2.value AS textvalue
FROM
(
(SELECT * FROM "public".${NESTED_PROP}) AS properties
Left JOIN
(SELECT * FROM "public".${NESTED_TABLE}) AS nested_object ON nested_object.id = properties.parent_id
Left JOIN
(SELECT * FROM "public".${NESTED_PROP_DEF}) AS property_def ON property_def.id = properties.typo_id
Left JOIN
(SELECT * FROM "public".${NESTED_TABLE}_tp2pdef) AS tp2 ON tp2.${PLACEHOLDER_TP} = nested_object.typo_id AND tp2.${PLACEHOLDER_PDEF} != property_def.id
-- CAN't use variables because they contain old_, thus introducing Placeholde variables
--join with non-existing values where not exist
Left JOIN
(SELECT '#PLACEHOLDER_PARENT_METADATA_VALUE#' AS value, shortname, id FROM "public".${NESTED_PROP_DEF}) AS property_def2 ON property_def2.id = tp2.${PLACEHOLDER_PDEF} AND NOT EXISTS(SELECT * from "public".${NESTED_PROP} AS pdef3 WHERE pdef3.parent_id = properties.parent_id AND pdef3.typo_id = property_def2.id)
)
WHERE property_def2.id IS NOT NULL
ORDER BY parent_id ASC;
contains unexpected rows (too many) which subsequently lead to an incorrect migration of RP's affiliations.
The bug arises if a RP has at least one affiliation which does not contain values in all nested metadata fields.
The example below shows what is going wrong in detail.
3 affiliations of a RP in our DS CRIS 5 instance (two of them do not provide values in all fields)
Expected behaviour
Actual (incorrect) behaviour
Behind the scenes the Pentaho transformation generates two additional rows in imp_metadatavalue which can be found in the "technical view":
As you can see there are 4 fields of type oairecerif.affiliation.startDate and oairecerif.affiliation.endDate although there are 3 affiliations only.
A deeper analysis shows that this bug appears if an affiliation (one row in the UI's affiliations table) has nested metadata fields with different visibilities.
This bug affects the migration of nested metadata of CRIS entites in DS CRIS 2023.01, especially the handling of optional nested metadata fields, e.g.
affiliationstartdate
oraffiliationenddate
.The step
entities nested placeholder
in Pentaho transformationentity_migration.ktr
is responsible for handling optional nested metadata fields.The result of the underlying SQL query
contains unexpected rows (too many) which subsequently lead to an incorrect migration of RP's affiliations.
The bug arises if a RP has at least one affiliation which does not contain values in all nested metadata fields.
The example below shows what is going wrong in detail.
3 affiliations of a RP in our DS CRIS 5 instance (two of them do not provide values in all fields)
Expected behaviour
Actual (incorrect) behaviour
Behind the scenes the Pentaho transformation generates two additional rows in
imp_metadatavalue
which can be found in the "technical view":As you can see there are 4 fields of type
oairecerif.affiliation.startDate
andoairecerif.affiliation.endDate
although there are 3 affiliations only.