american-art / PUAM

Princeton University Art Museum

apicongeography #24

Closed steads closed 7 years ago

steads commented 8 years ago

There is no data to validate the mapping against. Is it really E39 Actor, or is it a mix of E21 Person and E74 Group? Mapping to E39 Actor may be correct if the data contains both individuals (E21 Person) and groups (E74 Group). However, this makes the mapping rather weak, because using the superclass restricts the utility of the resulting data. P98 may not be used with E39, and neither may P100. The mapping looks like it is replicating the structure of TGN; if so, that makes little sense in a LOD environment. Using P3 has note to link the lat-lon to the E47 Spatial Coordinates is unnecessary, as the lat-lon value is itself the instance of E47 Spatial Coordinates.

VladimirAlexiev commented 8 years ago

There is now data.

12:30:52 WARN  riot                 :: [line: 428, col: 1 ] Bad IRI: <http://americanartcollaborative/puam/thesauri/place/Marylebone-[London]_none> Code: 0/ILLEGAL_CHARACTER in PATH: The character violates the grammar rules for URIs/IRIs.
12:30:52 WARN  riot                 :: [line: 428, col: 138] Bad IRI: <http://americanartcollaborative/puam/thesauri/place/Marylebone-[London]_none/appellation> Code: 0/ILLEGAL_CHARACTER in PATH: The character violates the grammar rules for URIs/IRIs.
12:30:52 WARN  riot                 :: [line: 2021, col: 1 ] Bad IRI: <http://americanartcollaborative/puam/thesauri/place/East-Frisia-[now-Germany]_Europe> Code: 
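The warnings above are triggered by the square brackets in the path part of the IRI. One possible repair, sketched here in Python, is to percent-encode the path when the URI is minted. The helper name `encode_iri_path` is hypothetical and this is not necessarily how the mapping generates URIs; dropping the brackets from the slug would work just as well.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def encode_iri_path(iri: str) -> str:
    """Percent-encode characters that are illegal in the path of an IRI.

    Hypothetical helper for illustration; assumes the path has not
    already been percent-encoded (otherwise '%' would be encoded twice).
    """
    parts = urlsplit(iri)
    # Keep '/' as the path separator; letters, digits and "-_.~" are always safe.
    safe_path = quote(parts.path, safe="/")
    return urlunsplit(
        (parts.scheme, parts.netloc, safe_path, parts.query, parts.fragment)
    )


bad = "http://americanartcollaborative/puam/thesauri/place/Marylebone-[London]_none"
fixed = encode_iri_path(bad)
print(fixed)
```

With this sketch, `[` and `]` become `%5B` and `%5D`, which are legal in an IRI path.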
<http://americanartcollaborative/puam/thesauri/place/Prague_none_none_none_none>
<http://americanartcollaborative/puam/thesauri/place/San-Diego-CA_none>
        a                         crm:E53_Place ;
        crm:P2_has_type           <http://americanartcollaborative/puam/thesauri/placetype/Country> ;
<http://americanartcollaborative/puam/thesauri/place/Norman-OK_none>
        a                         crm:E53_Place ;
        crm:P2_has_type           <http://americanartcollaborative/puam/thesauri/placetype/Country> ;
<http://americanartcollaborative/puam/thesauri/place/San-Francisco_CA_none_USA_North-American>
        a                         crm:E53_Place ;
        crm:P2_has_type           <http://americanartcollaborative/puam/thesauri/placetype/City> ;
        crm:P89_falls_within      <http://americanartcollaborative/puam/thesauri/place/CA_none_USA_North-American> .

<http://americanartcollaborative/puam/thesauri/place/San-Francisco_CA_none_none_none>
        a                         crm:E53_Place ;
        crm:P2_has_type           <http://americanartcollaborative/puam/thesauri/placetype/City> ;
        crm:P89_falls_within      <http://americanartcollaborative/puam/thesauri/place/CA_none_none_none> .

<http://americanartcollaborative/puam/thesauri/place/San-Francisco_California_none_United-States_North-America>
        a                         crm:E53_Place ;
        crm:P2_has_type           <http://americanartcollaborative/puam/thesauri/placetype/City> ;
        crm:P89_falls_within      <http://americanartcollaborative/puam/thesauri/place/California_none_United-States_North-America> .

<http://americanartcollaborative/puam/thesauri/place/San-Francisco_California_none_United-States_none>
        a                         crm:E53_Place ;
        crm:P2_has_type           <http://americanartcollaborative/puam/thesauri/placetype/City> ;
        crm:P89_falls_within      <http://americanartcollaborative/puam/thesauri/place/California_none_United-States_none> .
<http://americanartcollaborative/puam/thesauri/place/Norht-America>
        a                         crm:E53_Place ;
        crm:P2_has_type           <http://americanartcollaborative/puam/thesauri/placetype/Continent> ;
        crm:P87_is_identified_by  [ a           crm:E48_Place_Name ;
                                    rdfs:label  "Norht America"
                                  ] .

(Note: even "North American" is arguably a misspelling; that's not the continent's name.)

<http://americanartcollaborative/puam/thesauri/place/North-American>
        a                         crm:E53_Place ;
        crm:P2_has_type           <http://americanartcollaborative/puam/thesauri/placetype/Continent> ;
        crm:P87_is_identified_by  [ a           crm:E48_Place_Name ;
                                    rdfs:label  "North American"
                                  ] ;
        crm:P87_is_identified_by  [ a           crm:E48_Place_Name ;
                                    rdfs:label  "North American"
                                  ] ;
  ### 100s more                                  
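Labels like "Norht America" and "North American" could be flagged automatically by fuzzy-matching them against a list of expected names. The following is a minimal Python sketch using the standard library's `difflib`; the `CONTINENTS` list and the `suggest_continent` helper are assumptions made for illustration, not part of the PUAM data or mapping.

```python
from difflib import get_close_matches

# Assumed reference list of continent names (not taken from the PUAM data).
CONTINENTS = [
    "Africa", "Antarctica", "Asia", "Europe",
    "North America", "Oceania", "South America",
]


def suggest_continent(label: str):
    """Return the closest known continent name, or None if nothing is close."""
    if label in CONTINENTS:
        return label
    matches = get_close_matches(label, CONTINENTS, n=1, cutoff=0.8)
    return matches[0] if matches else None


for label in ["Norht America", "North American", "Europe"]:
    print(label, "->", suggest_continent(label))
```

A check like this only suggests candidates; a person still needs to confirm each correction.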

There are so many mistakes that I think it's better to invest the time to coreference this to ULAN, rather than trying to repair it.

MOST importantly:

<http://americanartcollaborative/puam/con-geography/2805>
        a       crm:E53_Place .

<http://americanartcollaborative/puam/person-institution/12773>
        a                  crm:E39_Actor ;
        crm:P98i_was_born  <http://americanartcollaborative/puam/person-institution/12773/birth> .

<http://americanartcollaborative/puam/person-institution/12773/birth>
        a                     crm:E67_Birth ;
        crm:P7_took_place_at  <http://americanartcollaborative/puam/con-geography/2805> .
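The snippet above uses `crm:P98i_was_born` on a resource typed only as `crm:E39_Actor`, although in CIDOC CRM that property applies to `E21 Person`. A rough Python sketch of a check for this kind of domain violation follows. It works on plain (subject, predicate, object) tuples with prefixed names; the `puam:` abbreviation and the function name are assumptions for illustration, and a real check would run over the parsed RDF.

```python
# Properties whose subject must be an E21 Person in CIDOC CRM.
PERSON_ONLY = {"crm:P98i_was_born", "crm:P100i_died_in"}


def find_domain_violations(triples):
    """List (subject, property) pairs where a person-only property
    is used on a subject that is not typed as crm:E21_Person."""
    types = {}
    for s, p, o in triples:
        if p == "rdf:type":
            types.setdefault(s, set()).add(o)
    return [
        (s, p)
        for s, p, o in triples
        if p in PERSON_ONLY and "crm:E21_Person" not in types.get(s, set())
    ]


# The triples from the example, with "puam:" as an assumed abbreviation.
triples = [
    ("puam:person-institution/12773", "rdf:type", "crm:E39_Actor"),
    ("puam:person-institution/12773", "crm:P98i_was_born",
     "puam:person-institution/12773/birth"),
    ("puam:person-institution/12773/birth", "rdf:type", "crm:E67_Birth"),
]
violations = find_domain_violations(triples)
print(violations)
```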

VladimirAlexiev commented 8 years ago

@caknoblock if you give me write access, I'll push apicongeography.ttl here. (Sorry, haven't used pull requests much...)

cathryng commented 8 years ago

Is the problem mistakes in the data or mistakes in the mapping? I can deliver a cleaned JSON file if that's the issue here. It's also important to note that there are URIs to GeoNames in this data set, and that geocode holds 'placetype' data (place collected, place made, place depicted, etc.), which lends different meaning to the terms themselves.

VladimirAlexiev commented 8 years ago

Three of the bullets above describe problems in the data: "wrong place types" (e.g. San Diego and Norman are cities, not countries), "coreferencing", and "misspellings".
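For the coreferencing problem, one quick way to surface candidates is to group place URIs by the leading name in their last path segment. The Python sketch below assumes the slug convention seen in the examples (parts joined by `_`, place name first); the function name is hypothetical, and the heuristic will miss variants such as "San-Diego-CA" where the state is fused into the name.

```python
from collections import defaultdict

BASE = "http://americanartcollaborative/puam/thesauri/place/"


def group_by_leading_name(uris):
    """Group URIs whose last path segment starts with the same place name.

    Returns only the groups with more than one URI, i.e. likely duplicates.
    """
    groups = defaultdict(list)
    for uri in uris:
        slug = uri.rstrip("/").rsplit("/", 1)[-1]
        key = slug.split("_", 1)[0].lower()
        groups[key].append(uri)
    return {key: members for key, members in groups.items() if len(members) > 1}


uris = [
    BASE + "San-Francisco_CA_none_USA_North-American",
    BASE + "San-Francisco_CA_none_none_none",
    BASE + "San-Francisco_California_none_United-States_North-America",
    BASE + "San-Francisco_California_none_United-States_none",
    BASE + "Norman-OK_none",
]
candidates = group_by_leading_name(uris)
for name, members in candidates.items():
    print(name, len(members))
```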

cathryng commented 7 years ago

this should be corrected in the refreshed data