time-link / timelink-py

Timelink Python Package
MIT License

Store the group definitions exported by Kleio Server #29

Open joaquimrcarvalho opened 8 months ago

joaquimrcarvalho commented 8 months ago

Up to now there is no representation in Timelink of the Kleio Schema used to parse the original file.

From the export file produced by the Kleio Server, Timelink gets the name of the group that originated each entity, attribute or relation, its order inside the file, its level of nesting, and the mapping to database columns. It does not get other information present in the original "str" file, such as compulsory and optional elements, positional elements, the parent group, and others.

That information is not needed to import the transcription into the database, but knowing at least the positional elements and their order would greatly improve the rendering of database entities in Kleio notation by Entity.to_Kleio().
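To illustrate why the positional elements matter for rendering, here is a minimal sketch of a renderer that writes positional elements first, without their names, and everything else as name=value pairs. It is not the existing Entity.to_Kleio() code: the function name `render_group`, its parameters and the element names in the usage note are hypothetical.

```python
def render_group(group, values, positional):
    """Render one group as a single line of Kleio notation.

    group      -- group name, e.g. "n"
    values     -- dict mapping element name to its value
    positional -- element names that can be written without a name, in order
    (All three parameters are assumptions made for this sketch.)
    """
    parts = []
    # Positional elements come first, in schema order, written as bare values.
    # Stop at the first missing one, so later values are not misread.
    named_from = len(positional)
    for i, name in enumerate(positional):
        if name not in values:
            named_from = i
            break
        parts.append(str(values[name]))
    # Remaining elements are written as explicit name=value pairs.
    rest = [n for n in positional[named_from:] if n in values]
    rest += [n for n in values if n not in positional]
    parts.extend(f"{n}={values[n]}" for n in rest)
    return f"{group}${'/'.join(parts)}"
```

With `positional=["name", "sex"]`, a person with name, sex and id would be rendered with the first two values bare and the id as `id=...`. Without the positional list, every element has to be written in the longer name=value form.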

Since the Kleio server exports the structure information and produces a FILE.files.json for every FILE.cli translated, it is now possible to read the structure definition for each file imported.

(Actually, it may be easier to insert the group definition as a JSON literal in the XML export file.)

Where to store it?

Note that in a database different versions of the same groups might have been used in the translation process of different files. So the association between group definition and database entities is identified by FILE+Group.

Also note that many groups are just aliases for other groups, so an algorithm processing this information must be able to handle the inheritance hierarchy of Kleio group definitions, probably by reading the JSON file, consolidating the definition of each group, and storing the consolidated definitions of the groups used in the file in the database.
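The consolidation step could look roughly like the sketch below. It assumes, purely for illustration, that the JSON file has been loaded into a dict mapping each group name to its own definition, and that an aliased group names its parent under a `source` key. The real layout is the one discussed in the issue linked at the end, so the key names here are hypothetical.

```python
def consolidate(groups):
    """Return a dict in which every group carries its full, inherited definition.

    groups -- dict of group name -> definition dict; a definition may name
              its parent group under the (assumed) key "source".
    """
    resolved = {}

    def resolve(name, seen=()):
        if name in resolved:
            return resolved[name]
        if name in seen:
            chain = " -> ".join(seen + (name,))
            raise ValueError(f"circular group definition: {chain}")
        definition = dict(groups[name])
        parent = definition.pop("source", None)
        if parent is not None:
            # Start from the parent's consolidated definition,
            # then let the keys defined on the child override it.
            merged = dict(resolve(parent, seen + (name,)))
            merged.update(definition)
            definition = merged
        resolved[name] = definition
        return definition

    for name in groups:
        resolve(name)
    return resolved
```

The override rule (child keys replace parent keys wholesale) is the simplest possible choice and may not match how Kleio itself merges definitions.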

As for where to store the group definitions, it could be on each entity or in a table linked to kleio_imported_files:

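     -- one row per group definition, per imported file
     CREATE TABLE group_definitions (
             file_name  VARCHAR,
             group_name VARCHAR,
             group_str  JSON
             );
     -- a file defines each group at most once
     CREATE UNIQUE INDEX group_definitions_key ON group_definitions (file_name, group_name);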

But this would be costly to fetch, since we first need to map a given entity to its file. The alternative is to denormalise and store the group definition together with the entity record.
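A minimal sketch of the linked-table option, using SQLite from the Python standard library, shows the join needed to go from an entity to its group definition. The `entities` table here is a simplified, hypothetical stand-in (its `file_name` column in particular is an assumption), and the helper names are made up for this example.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE group_definitions (
    file_name  VARCHAR,
    group_name VARCHAR,
    group_str  JSON
);
CREATE UNIQUE INDEX group_definitions_key
    ON group_definitions (file_name, group_name);
-- Hypothetical, simplified stand-in for the real entities table.
CREATE TABLE entities (
    id        VARCHAR PRIMARY KEY,
    groupname VARCHAR,
    file_name VARCHAR
);
""")


def store_definitions(conn, file_name, definitions):
    """Store (or replace) the definitions of all groups used in one file."""
    rows = [(file_name, group, json.dumps(d)) for group, d in definitions.items()]
    conn.executemany(
        "INSERT OR REPLACE INTO group_definitions VALUES (?, ?, ?)", rows
    )


def definition_for(conn, entity_id):
    """Fetch the definition of the group that originated an entity, or None."""
    row = conn.execute(
        """
        SELECT d.group_str
        FROM entities e
        JOIN group_definitions d
          ON d.file_name = e.file_name AND d.group_name = e.groupname
        WHERE e.id = ?
        """,
        (entity_id,),
    ).fetchone()
    return json.loads(row[0]) if row else None
```

Because the key is FILE+Group, two files translated with different versions of the same group keep separate rows, and each entity resolves to the version used by its own file.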

On the JSON representation of the Kleio structure, see https://github.com/time-link/timelink-kleio/issues/23