Open lomky opened 6 years ago
The main wrench in the works here is that not all distinct tables should be considered connections. To give a list from clearest "internal separate table" to "maybe?":
Seem likely to be more internal than connection:
Less Certain Internal vs Connection:
Maybe we should consider Connections only between certain high level types? i.e. Publication to Publication, Contributor to Contributor, and Contributor to Publication?
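One way to picture that restriction is an allow-list of type pairs. This is only a rough sketch: the lowercase type labels and the `is_connection` helper are made up for illustration and are not taken from the actual schema or codebase.

```python
# Hypothetical allow-list of high-level type pairs that count as connections.
# A frozenset makes each pair unordered; a one-element set is a same-type pair.
CONNECTION_PAIRS = {
    frozenset({"publication"}),                 # Publication to Publication
    frozenset({"contributor"}),                 # Contributor to Contributor
    frozenset({"contributor", "publication"}),  # Contributor to Publication
}


def is_connection(type_a: str, type_b: str) -> bool:
    """Return True if a link between the two types would be scored as a connection."""
    return frozenset({type_a.lower(), type_b.lower()}) in CONNECTION_PAIRS
```

Under this sketch, any link involving a type outside the list (a file, a country, and so on) would fall through to the internal score instead.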
We concur that the types and Country are internal. Same for org alt names.
Files
Since these are relatively flat objects with an exists / doesn't exist state, we think these should go in internal.
For example, a URL and a file are about equivalent provenance-wise, so we want to compare them apples to apples.
Array
Should be connection, not internal to Table, as Arrays have semantic connections available to them.
GCMD Keywords & Region
These help "connect" things, but not in a provenance way.
We think these should be scored both internally and as connections.
Contributor Object
This shouldn't have an internal score, as it is the embodiment of the connection score for a person-org-publication.
Publication Object
This shouldn't have an internal score, as it is the embodiment of the connection score between the publication entities and the things that can be connected to them.
Organization Relationship Object
This shouldn't have an internal score, as it is the embodiment of the connection score between two organizations.
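To summarize the proposals above in one place, here is a minimal sketch of how each object type could be tagged with the kind of score it feeds. The type labels, the `Scoring` enum, and the helper functions are all hypothetical names chosen for illustration; none of them come from the existing code.

```python
from enum import Flag, auto


class Scoring(Flag):
    INTERNAL = auto()    # contributes to the owning object's internal score
    CONNECTION = auto()  # contributes to the connection score


# Hypothetical mapping that mirrors the discussion above.
SCORING_BY_TYPE = {
    "country": Scoring.INTERNAL,
    "organization_alternate_name": Scoring.INTERNAL,
    "file": Scoring.INTERNAL,
    "array": Scoring.CONNECTION,
    "gcmd_keyword": Scoring.INTERNAL | Scoring.CONNECTION,
    "region": Scoring.INTERNAL | Scoring.CONNECTION,
    # Linking objects: they embody a connection, so no internal score.
    "contributor": Scoring.CONNECTION,
    "publication": Scoring.CONNECTION,
    "organization_relationship": Scoring.CONNECTION,
}


def has_internal_score(object_type: str) -> bool:
    return bool(SCORING_BY_TYPE[object_type] & Scoring.INTERNAL)


def has_connection_score(object_type: str) -> bool:
    return bool(SCORING_BY_TYPE[object_type] & Scoring.CONNECTION)
```

Keeping the decision in a single table like this would make the exceptions (contributor, publication, organization relationship) easy to see and easy to change once the weights are settled.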
NB: Should we have the Figure store its number of panels?
Still to do: come up with the weights for the various connections
For 7/11.
At a high level, what do we want to consider for the internal score of an object? For the connection score? What exceptions do we want to make for strange objects (e.g., contributor, publication)?