SEMICeu / Core-Business-Vocabulary

This is the issue tracker for the maintenance of Core Business Vocabulary

Site vs Address #1

Closed VladimirAlexiev closed 2 years ago

VladimirAlexiev commented 6 years ago

@philarcher, @makxdekkers Not sure this is the proper place to post this, but hopefully someone reacts.

The https://github.com/euBusinessGraph project is producing tons of company data (@openc is one of the partners). We use Org, RegOrg, Locn and a custom EBG ontology. If you know other projects that are making large-scale org data, please let us know so we can coordinate.

The SEMIC vs W3C versions of basic company data have this difference:

I had a bit of trouble grasping the difference between Site and Address. Eg Ontotext's headquarters is at Polygraphia Office Center fl.4, 47A Tsarigradsko Shosse: is this a site or an address? So in the EBG data model we use a node that's both org:Site, locn:Address and has a self-link org:siteAddress:

https://raw.githubusercontent.com/euBusinessGraph/eubg-data/master/model/company.png

(You could also see the full euBusinessGraph Semantic Data Model).
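As a rough illustration of the conflated node described above (not the actual EBG data), here is a minimal Python sketch that writes the triples out as plain tuples. The `conflated_site` helper and the `example.org` IRI are hypothetical. Only the `org:` and `locn:` namespaces and the properties `org:siteAddress` and `locn:fullAddress` come from the published vocabularies.

```python
# Namespaces of the W3C Organization Ontology and the Location Core Vocabulary.
ORG = "http://www.w3.org/ns/org#"
LOCN = "http://www.w3.org/ns/locn#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def conflated_site(node, full_address):
    """Return triples for ONE node that is both an org:Site and a locn:Address.

    Hypothetical helper, written only to illustrate the pattern in this post.
    """
    return [
        (node, RDF_TYPE, ORG + "Site"),
        (node, RDF_TYPE, LOCN + "Address"),
        # The self-link: the site's address is the very same node.
        (node, ORG + "siteAddress", node),
        (node, LOCN + "fullAddress", full_address),
    ]


hq = "http://example.org/site/ontotext-hq"  # hypothetical IRI
triples = conflated_site(
    hq, "Polygraphia Office Center fl.4, 47A Tsarigradsko Shosse"
)
for s, p, o in triples:
    print(s, p, o)
```

The point of the sketch is only that subject and object of `org:siteAddress` are the same resource, so the "site" and the "address" can never vary independently.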

Is there a better/more correct way to model this?

philarcher commented 6 years ago

Hi Vladimir,

I'll try and dig into my long term memory and answer these. Pls see inline below.

On 04/01/2018 13:55, Vladimir Alexiev wrote:

> @philarcher, @makxdekkers Not sure this is the proper place to post this, but hopefully someone reacts.
>
> The https://github.com/euBusinessGraph project is producing tons of company data (@openc is one of the partners). We use Org, RegOrg, Locn and a custom EBG ontology. If you know other projects that are making large-scale org data, please let us know so we can coordinate.

The obvious place to point you to - but I imagine you are already well aware of it - is opencorporates.com. I'm pretty sure that the ODI is also working in this space and would certainly be interested in what you're doing.

> The SEMIC vs W3C versions of basic company data have this difference:
>
> I had a bit of trouble grasping the difference between Site and Address. Eg Ontotext's headquarters is at Polygraphia Office Center fl.4, 47A Tsarigradsko Shosse: is this a site or an address? So in the EBG data model we use a node that's both org:Site, locn:Address and has a self-link org:siteAddress:

The difference is that a site might have multiple addresses, or an address may not be associated with a physical location. The two are different concepts. If the name Tsarigradsko Shosse were changed (something local authorities just love to do), your location wouldn't change.

In some jurisdictions (the UK is an example), the registered address is often not the head office; typically it's the address of their lawyer, the founder's home address or whatever. It's a legal address that does not have to bear any relation to the trading premises.

> https://raw.githubusercontent.com/euBusinessGraph/eubg-data/master/model/company.png
>
> (You could also see the full euBusinessGraph Semantic Data Model).
>
> Is there a better/more correct way to model this?

Oh there's always a better way, but the best way is typically the one that works best for you and others who may want to leverage your work in some way, and that can be completed on time ;-)

I'm no longer working in this field so I'm afraid I'm not up to date with latest developments. Although if you come across GLNs as location identifiers, or want to work with product IDs (GTINs/barcodes) let me know.

Cheers

Phil

-- Phil Archer http://philarcher.org +44 7887 767755 @philarcher1

makxdekkers commented 6 years ago

(Phil's response came in while I was typing up mine, so some of this is already covered in his reaction.)

The Organization Ontology models an organisation as a location/address-less entity that has Sites that have addresses. In my view, Ontotext's headquarters would be an org:Site (e.g. with properties like label="HQ", surface=100m², energyLabel="A+" etc). That Site has an org:siteAddress.

The address information could be modelled as properties of the Site -- like EBG does -- rather than as a relationship with an Address class, but in that case, you could not reuse the address information if there are two organisations on the same address, or if the registered address is also a place where the organisation does its business. In your diagram, you have the text "Same data as registered site/address" so you're duplicating data in the model which would be unnecessary if you linked a Site to an Address class.

The Core Business Vocabulary indeed links the Address class directly to the Legal Entity as it treats the registered address as a special case, while the Organization Ontology treats the registered address as just one site among others.
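To make the reuse argument concrete, here is a small Python sketch of the alternative with separate Site and Address nodes, where two organisations have sites at one shared address. All IRIs and the `site_with_address` helper are hypothetical. The sketch only assumes the published properties `org:hasSite`, `org:siteAddress` and `locn:fullAddress`.

```python
ORG = "http://www.w3.org/ns/org#"
LOCN = "http://www.w3.org/ns/locn#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
EX = "http://example.org/"  # hypothetical base IRI


def site_with_address(org, site, address):
    """Triples linking an organisation to a Site, and the Site to an Address."""
    return [
        (org, ORG + "hasSite", site),
        (site, RDF_TYPE, ORG + "Site"),
        (site, ORG + "siteAddress", address),
        (address, RDF_TYPE, LOCN + "Address"),
    ]


shared = EX + "address/1"
graph = set()  # a set, so repeated triples about the shared address collapse
graph.update(site_with_address(EX + "org/a", EX + "site/a-hq", shared))
graph.update(site_with_address(EX + "org/b", EX + "site/b-office", shared))
# The address payload is stated once, however many sites point at it.
graph.add((shared, LOCN + "fullAddress", "47A Tsarigradsko Shosse"))

address_nodes = {s for s, p, o in graph if p == RDF_TYPE and o == LOCN + "Address"}
sites_at_shared = {s for s, p, o in graph if p == ORG + "siteAddress" and o == shared}
print(len(address_nodes), "address node,", len(sites_at_shared), "sites")
```

Whether the extra indirection is worth it depends on whether Sites carry any properties of their own, which is exactly what the rest of the thread debates.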

VladimirAlexiev commented 6 years ago

@openc is a partner in the project and we know of their early attempts with RegOrg but we're mapping significantly more data.

> a site might have multiple addresses

Do you mean only historic (changed)? Or can you give an example of several current addresses for a site?

> If the name Tsarigradsko Shosse were changed (something local authorities just love to do)

Then the address (as an object) would remain the same. E.g. consider an apartment building described in OpenStreetMap, in the BG national address database, or as an INSPIRE location object. If the street name changes, that building will keep its identity. Otherwise all features on that street would break, leaving dangling references in all client data that refers to them.

> the registered address is often not the head office, typically it's the address of their lawyer or founder's home address or whatever.

Yes. So see https://github.com/SEMICeu/Core-Business-Vocabulary/issues/2

> org:Site (e.g. with properties like label="HQ", surface=100m², energyLabel="A+" etc)

Yes, that's a different object from Address. But company registers don't have such info, so we don't need it.

> The address information could be modelled as properties of the Site -- like EBG does -- rather than as a relationship with an Address class

I think it cannot, because the payload props (e.g. locn:thoroughfare) have locn:Address as domain. EBG conflates the two classes in one node (with the self-link as a sort of stupid wrinkle).
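The domain argument can be illustrated with a tiny RDFS-style inference step in Python. This is only a sketch: the `DOMAINS` table encodes the domain declaration as stated in this comment, and `infer_domain_types` and the `example.org` IRI are hypothetical names, not part of any library.

```python
ORG = "http://www.w3.org/ns/org#"
LOCN = "http://www.w3.org/ns/locn#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Domain declaration as described in this comment (assumed, not fetched).
DOMAINS = {LOCN + "thoroughfare": LOCN + "Address"}


def infer_domain_types(triples, domains):
    """RDFS domain rule: if (s, p, o) holds and p has domain C, then s is a C."""
    return {(s, RDF_TYPE, domains[p]) for s, p, o in triples if p in domains}


site = "http://example.org/site/hq"  # hypothetical IRI
data = [
    (site, RDF_TYPE, ORG + "Site"),
    # Address payload attached directly to the Site node.
    (site, LOCN + "thoroughfare", "Tsarigradsko Shosse"),
]
inferred = infer_domain_types(data, DOMAINS)
for triple in inferred:
    print(triple)
```

Under that rule, a Site carrying `locn:thoroughfare` is inferred to be a `locn:Address` anyway, which is why putting the address props on the Site amounts to the same conflation.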

> you could not reuse the address information if there are two organisations on the same address

If two Sites can point to the same Address, certainly two RegOrgs can point to the same Address. Whether an Address can contain several Sites is a moot question, unless you have real payload props of Sites (e.g. floor area) and the companies don't share the same physical space.

> or if the registered address is also a place where the organisation does its business

If hasRegisteredSite and hasPrimarySite can point to the same Site, then the same is possible for Addresses. And we do that for the BG trade register: we use an MD5 hash to detect when address value-objects are the same, and reuse them.
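For readers who want to try the hash-based reuse, here is a minimal Python sketch, not the actual EBG pipeline. The field names, the normalisation (whitespace collapsing and case folding) and the `example.org` base IRI are all assumptions made for illustration. Only the idea of keying address value-objects by an MD5 hash comes from this comment.

```python
import hashlib


def address_key(address):
    """MD5 hex digest over the normalised, sorted fields of an address dict.

    Normalisation here (collapse whitespace, casefold) is an assumption.
    """
    parts = []
    for field in sorted(address):
        value = " ".join(str(address[field]).split()).casefold()
        if value:
            parts.append(f"{field}={value}")
    return hashlib.md5("|".join(parts).encode("utf-8")).hexdigest()


def address_iri(address, base="http://example.org/address/"):
    """Equal address value-objects map to the same IRI and so get reused."""
    return base + address_key(address)


registered = {"thoroughfare": "Tsarigradsko Shosse", "locatorDesignator": "47A"}
primary = {"locatorDesignator": " 47a ", "thoroughfare": "tsarigradsko  shosse"}
other = {"thoroughfare": "Tsarigradsko Shosse", "locatorDesignator": "49"}

print(address_iri(registered) == address_iri(primary))  # same value-object
print(address_iri(registered) == address_iri(other))    # different address
```

MD5 is used here only as a cheap equality key, not for security. How aggressively to normalise before hashing is a data-quality decision that the thread does not specify.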

> In your diagram, you have the text "Same data as registered site/address" so you're duplicating data

That illustrates a case where a company has several addresses. The diagram omits the props of the other node purely for brevity. It certainly doesn't advocate data duplication.

> the best way is typically the one that works best for you and others who may want to leverage your work in some way

Thanks for the reassurance! :-)

makxdekkers commented 6 years ago

Resolution of this issue requires revision of the Core Vocabulary to align the Core Business Vocabulary with the Registered Organization Vocabulary as part of the next major semantic release.

EmidioStani commented 2 years ago

In the webinar of 09/11/2021, the working group agreed to align with the Core Location Vocabulary. Discussion can continue at https://github.com/SEMICeu/Core-Location-Vocabulary/issues/18