USGCRP / gcis-conventions

Repository for the collection, management, and versioning of the GCIS data management conventions.
https://usgcrp.github.io/gcis-conventions/

Metrics: Connection Score vs Internal Score High Level Discussion #29

Open lomky opened 6 years ago

lomky commented 6 years ago

For 7/11.

At a high level, what do we want to consider for the internal score of an object? For the connection score? What exceptions do we want to make for strange objects (i.e., contributor, publication)?

lomky commented 6 years ago

The main wrench in the works here is that not all distinct tables should be considered connections. To give a list from clearest "internal separate table" to "maybe?":

Seem likely to be more internal than connection:

Less Certain Internal vs Connection:

Maybe we should consider Connections only between certain high-level types, i.e. Publication to Publication, Contributor to Contributor, and Contributor to Publication?
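
One way to picture this proposal is as an allow-list of type pairs. The sketch below is only illustrative: the names `CONNECTION_PAIRS` and `is_connection` are hypothetical and not part of GCIS, and the three pairs are simply the ones suggested in this comment.

```python
# Hypothetical allow-list of high-level type pairs that would count as connections.
# A one-element frozenset represents a link between two objects of the same type.
CONNECTION_PAIRS = {
    frozenset({"publication"}),                 # Publication to Publication
    frozenset({"contributor"}),                 # Contributor to Contributor
    frozenset({"contributor", "publication"}),  # Contributor to Publication
}


def is_connection(type_a: str, type_b: str) -> bool:
    """Return True if a link between the two high-level types is on the allow-list."""
    return frozenset({type_a.lower(), type_b.lower()}) in CONNECTION_PAIRS
```

Under this sketch, any link to a type outside the list (for example a lookup table) would fall through to internal scoring by default.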

lomky commented 6 years ago

Discussion notes

We concur the types and Country are internal. Same for org alt names.

Files
Since these are relatively flat objects that either exist or don't exist, we think these should go in internal.
For example, a URL and a file are about equivalent provenance-wise, so we want to compare them apples to apples.

Array
Should be connection, not internal to Table, as Arrays have semantic connections available to them.

GCMD Keywords & Region
These help "connect" things, but not in a provenance way.
We think these should be scored both internally and as connections.

Contributor Object
This shouldn't have an internal score, as it is the embodiment of the connection score for a person-org-publication.

Publication Object
This shouldn't have an internal score, as it is the embodiment of the connection score between the publication entities and the things that can be connected to them.

Organization Relationship Object
This shouldn't have an internal score, as it is the embodiment of the connection score between two organizations.
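
The decisions in these notes could be recorded as a simple lookup, sketched below. This is an assumption-laden illustration, not an agreed design: the `Scope` flag, the `SCORING_SCOPE` table, its key names, and `has_internal_score` are all hypothetical, and no weights are included because those are still undecided.

```python
from enum import Flag, auto


class Scope(Flag):
    """Which score(s) an object kind contributes to (hypothetical)."""
    INTERNAL = auto()
    CONNECTION = auto()


# Hypothetical key names; the assignments follow the discussion notes.
SCORING_SCOPE = {
    "type": Scope.INTERNAL,
    "country": Scope.INTERNAL,
    "organization_alternate_name": Scope.INTERNAL,
    "file": Scope.INTERNAL,
    "array": Scope.CONNECTION,
    "gcmd_keyword": Scope.INTERNAL | Scope.CONNECTION,
    "region": Scope.INTERNAL | Scope.CONNECTION,
    # The linking objects below carry no internal score of their own.
    "contributor": Scope.CONNECTION,
    "publication": Scope.CONNECTION,
    "organization_relationship": Scope.CONNECTION,
}


def has_internal_score(kind: str) -> bool:
    """Return True if objects of this kind would contribute to the internal score."""
    return Scope.INTERNAL in SCORING_SCOPE[kind]
```

Using a flag rather than a single label lets GCMD Keywords and Region count toward both scores without special-casing them.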

What do we mean by connection?

NB: Should we have the Figure store its number of panels?

Still to do: come up with the weights for the various connections