time-link / timelink-py

Timelink Python Package
MIT License

Add TimelinkDatabase.get_group_model(group name). #53

Closed joaquimrcarvalho closed 2 weeks ago

joaquimrcarvalho commented 2 months ago

Description

It is not always possible to get the ORM model corresponding to a group, because a group can extend another one and not have its own mapping. Internally Timelink knows which model to use, but the end user needs to inspect the structure file to figure it out.

In the structure (str) file:

pars name=lugar; source=geoentity

In Python:

db.get_model("lugar")  # returns None

Either implement get_group_model("lugar") or make get_model smarter.
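The lookup requested above could be sketched roughly as follows. This is only an illustration, not the Timelink implementation: the two dictionaries, the stand-in classes and the standalone functions are all hypothetical, and in Timelink the models would be SQLAlchemy ORM classes reached through the database object.

```python
from typing import Optional


# Hypothetical stand-ins for ORM classes (real Timelink models are SQLAlchemy classes).
class Entity:
    pass


class Geoentity(Entity):
    pass


# Assumed registry: ORM class name -> ORM class.
MODELS_BY_CLASS = {"entity": Entity, "geoentity": Geoentity}

# Assumed registry: group name -> ORM class name, as declared by
# "source=geoentity" for the group "lugar" in the structure file.
CLASS_BY_GROUP = {"lugar": "geoentity"}


def get_model(name: str) -> Optional[type]:
    """Look up a model by ORM class name only (the behaviour reported in this issue)."""
    return MODELS_BY_CLASS.get(name)


def get_group_model(group_name: str) -> Optional[type]:
    """Resolve a group to a model, following the group -> class association if needed."""
    model = get_model(group_name)
    if model is None:
        class_name = CLASS_BY_GROUP.get(group_name)
        if class_name is not None:
            model = get_model(class_name)
    return model


print(get_model("lugar"))        # no class is named "lugar"
print(get_group_model("lugar"))  # resolved through the group -> class association
```

Making get_model itself "smarter" would amount to moving the fallback from get_group_model into get_model.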

joaquimrcarvalho commented 2 months ago

Also, allow for a list:

crono_model, lugar_model = db.get_model(['crono', 'lugar'])

The advantage is that the returned models would be aliased, following the approach described at https://docs.sqlalchemy.org/en/20/errors.html#error-xaj2. This avoids the current warning, and the future deprecation, about mixing hierarchical models that generate ambiguous references to top-level tables, in our case the table "entities".

joaquimrcarvalho commented 2 months ago

Making progress in branch issue53, but there is a design problem: builtin mappings are never replaced during import. In the test case, a group "lugar" is mapped to "geoentity":

      <CLASS NAME="geoentity" SUPER="entity" TABLE="geoentities" GROUP="lugar">
            <ATTRIBUTE NAME="id" COLUMN="id" CLASS="id" TYPE="varchar" SIZE="64" PRECISION="0"
                  PKEY="1"></ATTRIBUTE>
            <ATTRIBUTE NAME="type" COLUMN="the_type" CLASS="type" TYPE="varchar" SIZE="32"
                  PRECISION="0" PKEY="0"></ATTRIBUTE>
            <ATTRIBUTE NAME="name" COLUMN="name" CLASS="name" TYPE="varchar" SIZE="64" PRECISION="0"
                  PKEY="0"></ATTRIBUTE>
            <ATTRIBUTE NAME="obs" COLUMN="obs" CLASS="obs" TYPE="varchar" SIZE="16654" PRECISION="0"
                  PKEY="0"></ATTRIBUTE>
      </CLASS>

This information is used to correctly store data from "lugar" groups in the table geoentities. However, since the geoentity class is builtin, the current code does not store this new association between group name ("lugar"), class name ("geoentity") and table ("geoentities").

Note that in this case no new table and no new ORM model are created for "lugar". The import code simply sends the data to the existing "geoentities" table. Only when a new table, and a new ORM model for it, is needed does the system create a new PomSomMapper and persist its information in the database for later retrieval.

But after import the system needs to know where to look for the data associated with "lugar". Currently the mapping is kept as each entity is stored in the entities table, along with the class (ORM) name and the group name. So each "lugar" is stored in Entities with class = "geoentity" and group name = "lugar".

SELECT distinct class, groupname
FROM entities;

Gives all the correspondences.

Solutions:

  1. Entity keeps a map of group to ORM, dynamically. This map is built with the SQL above during startup (ensure_mapping, maybe) and updated at each import (ensure_mapping is called after each CLASS data is read from XML).
  2. A PomSomMapper class is created for each new combination of SOM group and POM Entity. However, no new table is created for the POM Entity if it already exists. Maybe this is the only way to preserve the mapping at the attribute level: SOM elements to POM columns.
  3. At each step where mappings are updated, a select distinct class, groupname from entities is done and the result is stored in the Entity class as a group to ORM dictionary, which is used to find the ORM of a given group when needed.
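Solutions 1 and 3 both come down to caching the result of the query above and refreshing it after each import. A minimal sketch of that idea, using the standard sqlite3 module and an in-memory table in place of the real database, might look like this. The GroupModelRegistry class, the reduced three-column entities table and the group name "terra" are assumptions made for illustration only.

```python
import sqlite3


class GroupModelRegistry:
    """Caches the group name -> ORM class name map read from the entities table."""

    def __init__(self):
        self.class_by_group = {}

    def refresh(self, conn):
        """Rebuild the map; meant to run at startup and again after each import."""
        rows = conn.execute(
            "SELECT DISTINCT class, groupname FROM entities"
        ).fetchall()
        # If a group were stored under several classes, the last row read would win.
        self.class_by_group = {group: cls for cls, group in rows}

    def class_for(self, group_name):
        """Return the ORM class name recorded for a group, or None if unknown."""
        return self.class_by_group.get(group_name)


# Simplified stand-in for the real entities table (assumed columns).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entities (id TEXT PRIMARY KEY, class TEXT, groupname TEXT)")
conn.execute("INSERT INTO entities VALUES ('e1', 'geoentity', 'lugar')")

registry = GroupModelRegistry()
registry.refresh(conn)  # at startup
print(registry.class_for("lugar"))

# Simulate an import that stores a hypothetical new group in the same table.
conn.execute("INSERT INTO entities VALUES ('e2', 'geoentity', 'terra')")
registry.refresh(conn)  # after the import
print(registry.class_for("terra"))
```

One limitation of this approach is that a group only becomes known once at least one entity of that group has been stored, which is part of the case for solution 2 and for storing group definitions in the database.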

Link this to the issue about storing group definitions in the database #29

joaquimrcarvalho commented 2 months ago

Do a code review of pom_som_mapper.