odpi / egeria-database-connectors

Connectors for exchanging metadata
Apache License 2.0

Enhancement: Support selection of schemas to sync/import, exclude system #78

Closed planetf1 closed 3 years ago

planetf1 commented 3 years ago

Currently the Postgres connector imports all metadata from all databases and all schemas, including system tables.

I think it would be useful to

a) Eliminate system objects

There are many tables and other database elements that Postgres uses internally. Some tools, such as the database plugin in IntelliJ, let the user select 'Load sources for: All excl. system schemas'. This is the default, but it can be switched to 'All'.

The same capability would be useful in our Postgres connector, so that we only catalog 'real' databases rather than system detail. That's not to say the internals are never wanted, but perhaps less often.
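
A minimal sketch of what an "excl. system schemas" default could look like, written in Python for brevity (the connector itself is Java). It relies on PostgreSQL reserving the `pg_` schema-name prefix for system use, plus the SQL-standard `information_schema`. The names `is_system_schema` and `schemas_to_catalog` are hypothetical and not part of the connector.

```python
def is_system_schema(name: str) -> bool:
    """Return True for schemas PostgreSQL reserves for its own use."""
    # PostgreSQL reserves the "pg_" prefix (pg_catalog, pg_toast, pg_temp_N, ...);
    # information_schema holds the SQL-standard catalog views.
    return name.startswith("pg_") or name == "information_schema"


def schemas_to_catalog(schemas, include_system=False):
    """Hypothetical helper: drop system schemas unless explicitly requested."""
    if include_system:
        return list(schemas)
    return [s for s in schemas if not is_system_schema(s)]
```

With the default, `schemas_to_catalog(["public", "pg_catalog", "sales"])` keeps only `public` and `sales`. Passing `include_system=True` corresponds to the 'All' setting.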

b) Allow specification of which databases/schemas to include

This is a follow-on from a): in addition to excluding system schemas, we would have some way to include or exclude databases or schemas by pattern.
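
One possible shape for pattern-based selection, again as a Python sketch rather than connector code. It assumes shell-style wildcards (via the standard `fnmatch` module) and that exclude patterns take precedence over include patterns. Both are design assumptions, and `select_names` and its parameters are hypothetical.

```python
from fnmatch import fnmatchcase


def matches_any(name, patterns):
    """True if name matches at least one shell-style pattern (case-sensitive)."""
    return any(fnmatchcase(name, p) for p in patterns)


def select_names(names, include=("*",), exclude=()):
    """Keep names that match an include pattern and no exclude pattern."""
    return [
        n for n in names
        if matches_any(n, include) and not matches_any(n, exclude)
    ]
```

For example, `select_names(["public", "sales_eu", "sales_us"], include=["sales_*", "public"], exclude=["*_us"])` keeps `public` and `sales_eu`. The same function could be applied first to database names and then to schema names within each database.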

planetf1 commented 3 years ago

Related: supporting metadata exchange with multiple Postgres endpoints (i.e. the opposite of filtering!) should already be possible (not tested), since integrationConnectorConfigs is a list of configs. We should be able to have one per endpoint on the same integration server, using the same OMAS.

wbittles commented 3 years ago

All that would happen is that the application developer would filter out the IT department's metadata. What if we just let the people who aren't interested in the data filter it out by not reading it, rather than having Egeria not capture it? This would also spare us the whole philosophical "what is real" decision.

mandy-chessell commented 3 years ago

I agree with Billy - the filtering should be done on the consuming side - not the capturing side.

Today the governance zones provide some level of scoping and filtering. Each OMAS deployment sets the default zones for the metadata it captures; these are the zones assigned to an asset when it is created. The integration connector is responsible for calling publishXXX to move the metadata from the default zones to the published zones (and withdrawXXX to move it from the published zones back to the default zones when editing the metadata).

If we discover that we need more granularity in the zoning, we can add support for the default/published zones on the OMIS and/or the integration connector definition.
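
To illustrate the consuming-side approach, here is a toy Python model of the default/published zone flow. It is not Egeria code: `Asset`, `publish`, `withdraw`, and `visible_assets` are hypothetical stand-ins, the zone names are made up, and the rule that a consumer sees an asset when it shares at least one zone with it is an assumption of this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Asset:
    name: str
    zones: set = field(default_factory=set)


def publish(asset, default_zones, published_zones):
    """Move an asset from the default zones to the published zones."""
    asset.zones = (asset.zones - set(default_zones)) | set(published_zones)


def withdraw(asset, default_zones, published_zones):
    """Move an asset from the published zones back to the default zones."""
    asset.zones = (asset.zones - set(published_zones)) | set(default_zones)


def visible_assets(assets, supported_zones):
    """Consumer-side filter: keep assets sharing at least one zone."""
    supported = set(supported_zones)
    return [a for a in assets if a.zones & supported]
```

Everything is still captured. A consumer whose zones include only the published zone never sees assets left in the default zone, such as system tables that nobody chose to publish.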

planetf1 commented 3 years ago

ok. closing