uwlib-cams / uwlswd_vocabs

RDF vocabularies created by University of Washington Libraries Cataloging and Metadata Services for local needs
https://uwlib-cams.github.io/uwlswd/
Creative Commons Zero v1.0 Universal

move linked_data_platforms to directory #22

Closed briesenberg07 closed 1 year ago

briesenberg07 commented 1 year ago

* See "Run main.py to serialize data"
  * Only checked number of triples in the graph

briesenberg07 commented 1 year ago

20230919_main.py_error.txt

@cspayne sharing terminal error output. I suspect the problem is the use of both / and \ in the filepath to the directory for processing?

briesenberg07 commented 1 year ago

Oh wait, I'm looking at the following:

https://uwlib-cams.github.io/gitrepos_uwlswd/uwlswd_vocabs/linked_data_platforms\linked_data_platforms.rdf does not look like a valid URI, trying to serialize this will break.

That seems like a URL string which combines some of the path to the target directory in my computer's file structure with a base URL portion?
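The wording of that message appears to match the warning rdflib logs when a URI contains a character it rejects, and a backslash is one such character. A minimal sketch of that kind of check follows, assuming this is where the message comes from; `INVALID_URI_CHARS` and `looks_like_valid_uri` are hypothetical names for illustration, not taken from main.py.

```python
# Characters treated as invalid inside a URI (assumption: this mirrors
# the check behind the "does not look like a valid URI" warning).
INVALID_URI_CHARS = set('<>" {}|\\^`')


def looks_like_valid_uri(uri: str) -> bool:
    """Return False if the string contains any character from the invalid set."""
    return not any(ch in INVALID_URI_CHARS for ch in uri)


# The URL from the error output, with the Windows-style backslash intact.
bad = (
    "https://uwlib-cams.github.io/gitrepos_uwlswd/uwlswd_vocabs/"
    "linked_data_platforms\\linked_data_platforms.rdf"
)
# The same string with the backslash normalized to a forward slash.
good = bad.replace("\\", "/")

print(looks_like_valid_uri(bad), looks_like_valid_uri(good))  # False True
```

This only shows why the backslash trips the check; the stray gitrepos_uwlswd/ path segment is a separate problem.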

cspayne commented 1 year ago

Looking at this now!

cspayne commented 1 year ago

@briesenberg07 - will you try replacing line 36 in main.py (where uri_path is set) with:

uri_path = "https://uwlib-cams.github.io/" + file_path_noext.replace("../", "").replace("\\", "/")
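To see what that replacement does in isolation, here is a small sketch. The function name `to_uri_path` and the sample value of `file_path_noext` are hypothetical; the real value depends on how main.py builds the path on a given machine.

```python
def to_uri_path(file_path_noext: str,
                base: str = "https://uwlib-cams.github.io/") -> str:
    """Build a URI from a relative file path (extension already stripped)."""
    # Drop relative "../" segments, then turn Windows backslashes
    # into forward slashes so the result contains only URI-safe separators.
    return base + file_path_noext.replace("../", "").replace("\\", "/")


# Hypothetical mixed-separator path, as os.path.join might produce on Windows.
windows_style = "../uwlswd_vocabs/linked_data_platforms\\linked_data_platforms"

print(to_uri_path(windows_style))
# https://uwlib-cams.github.io/uwlswd_vocabs/linked_data_platforms/linked_data_platforms
```

Note that this only strips "../" segments; any other local directory name that ends up in the path would still appear in the URI.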

@cspayne I'll try it now!

briesenberg07 commented 1 year ago

Hmmm, stymied! I did two things:

So I'm mystified by the new error message...

20230919_main.py_error_02.txt

...where it says:

https://uwlib-cams.github.io/gitrepos_uwlswd/uwlswd_vocabs/linked_data_platforms\linked_data_platforms.ttl does not look like a valid URI, trying to serialize this will break.

I don't know where gitrepos_uwlswd/ is coming from at this point, as that directory no longer exists on my machine...


* BMR directory structure now/following change

gitrepos/
   uwlswd/
   uwlswd_vocabs/

cspayne commented 1 year ago

@briesenberg07 Did this uri potentially end up in the rdf/xml during your first serialization attempt?

briesenberg07 commented 1 year ago

Thanks for the above, Cypress. This had indeed happened. After fixing it, I ran the script successfully! The expected files were output in the expected place!

But checking dct:hasFormat values:

Input (excerpt):

...
    <void:feature rdf:resource="http://www.w3.org/ns/formats/Turtle"/>
    <dct:hasFormat rdf:resource="https://uwlib-cams.github.io/uwlswd_vocabs/linked_data_platforms.nt"/>
    <dct:hasFormat rdf:resource="https://uwlib-cams.github.io/uwlswd_vocabs/linked_data_platforms.ttl"/>
    <dct:hasFormat rdf:resource="https://uwlib-cams.github.io/uwlswd_vocabs/linked_data_platforms.jsonld"/>
    <dct:hasFormat rdf:resource="https://uwlib-cams.github.io/uwlswd_vocabs/linked_data_platforms.html"/>
    <ldproc:containsEntity rdf:resource="https://doi.org/10.6069/uwlib.55.b.1#Platform"/>
...

Output (excerpt):

...
    <void:feature rdf:resource="http://www.w3.org/ns/formats/Turtle"/>
    <dcterms:hasFormat rdf:resource="https://uwlib-cams.github.io/uwlswd_vocabs/linked_data_platforms.nt"/>
    <dcterms:hasFormat rdf:resource="https://uwlib-cams.github.io/uwlswd_vocabs/linked_data_platforms.ttl"/>
    <dcterms:hasFormat rdf:resource="https://uwlib-cams.github.io/uwlswd_vocabs/linked_data_platforms.jsonld"/>
    <dcterms:hasFormat rdf:resource="https://uwlib-cams.github.io/uwlswd_vocabs/linked_data_platforms.html"/>
    <dcterms:hasFormat rdf:resource="https://uwlib-cams.github.io/uwlswd_vocabs/linked_data_platforms/linked_data_platforms.ttl"/>
    <dcterms:hasFormat rdf:resource="https://uwlib-cams.github.io/uwlswd_vocabs/linked_data_platforms/linked_data_platforms.nt"/>
    <dcterms:hasFormat rdf:resource="https://uwlib-cams.github.io/uwlswd_vocabs/linked_data_platforms/linked_data_platforms.jsonld"/>
    <dcterms:hasFormat rdf:resource="https://uwlib-cams.github.io/uwlswd_vocabs/linked_data_platforms/linked_data_platforms.html"/>
    <ldproc:containsEntity rdf:resource="https://doi.org/10.6069/uwlib.55.b.1#Platform"/>
...

cspayne commented 1 year ago

Yay!

The repeated dct:hasFormat output is because of the location change. If you remove or update the existing dct:hasFormat triples before serialization, the script will add the correct ones.

The script is not set to remove any dct:hasFormat triples that are already in the RDF/XML, so that in circumstances where a resource does have additional formats, those are not lost. This does mean that when the URI is changing, there is the additional step of removing the old dct:hasFormat triples.
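That cleanup step could be sketched roughly as follows, using only the standard library to drop dct:hasFormat elements whose URL starts with the old location before running main.py. This is an illustration, not part of the script: `drop_stale_has_format`, `OLD_PREFIX`, and the example.org subject are assumptions, and it only handles dct:hasFormat written as a property element with an rdf:resource attribute.

```python
import xml.etree.ElementTree as ET

DCT_HAS_FORMAT = "{http://purl.org/dc/terms/}hasFormat"
RDF_RESOURCE = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}resource"

# Old location: files sat directly under uwlswd_vocabs/ (note the trailing dot,
# which keeps the new linked_data_platforms/ directory URLs from matching).
OLD_PREFIX = "https://uwlib-cams.github.io/uwlswd_vocabs/linked_data_platforms."


def drop_stale_has_format(rdfxml: str, stale_prefix: str) -> ET.Element:
    """Remove dct:hasFormat elements whose rdf:resource starts with stale_prefix."""
    root = ET.fromstring(rdfxml)
    # Snapshot the element list first so removal does not disturb iteration.
    for parent in list(root.iter()):
        for child in list(parent):
            if (child.tag == DCT_HAS_FORMAT
                    and child.get(RDF_RESOURCE, "").startswith(stale_prefix)):
                parent.remove(child)
    return root


# Hypothetical excerpt; the subject URI is a placeholder.
sample = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dct="http://purl.org/dc/terms/">
  <rdf:Description rdf:about="https://example.org/vocab">
    <dct:hasFormat rdf:resource="https://uwlib-cams.github.io/uwlswd_vocabs/linked_data_platforms.ttl"/>
    <dct:hasFormat rdf:resource="https://uwlib-cams.github.io/uwlswd_vocabs/linked_data_platforms/linked_data_platforms.ttl"/>
  </rdf:Description>
</rdf:RDF>"""

cleaned = drop_stale_has_format(sample, OLD_PREFIX)
remaining = [el.get(RDF_RESOURCE) for el in cleaned.iter(DCT_HAS_FORMAT)]
print(remaining)
```

Matching on a prefix rather than deleting every dct:hasFormat keeps the behavior described above: formats that live elsewhere are left in place.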

briesenberg07 commented 1 year ago

@cspayne did your DOI URL update take effect yet? I checked this vocab this morning and DOI dereferenced to new URL (after making DataCite Fabrica updates on Tuesday). So, it takes... something less than 2 days to take effect?

cspayne commented 1 year ago

Yes! I had to clear the cache in my browser and since then it's been pretty instantaneous.