nheist / CaLiGraph

A Large Semantic Knowledge Graph from Wikipedia Categories and Listings
http://caligraph.org
GNU General Public License v3.0

Blank spaces in IRIs in caligraph-ontology.nt #12

Closed kathrinrin closed 1 year ago

kathrinrin commented 1 year ago

We would love to use the CaLiGraph data for research purposes over here in Amsterdam (VU / Triply), but unfortunately bumped into reusability issues due to invalid IRIs that contain blank spaces.

We don't have a complete overview, but we did look extensively into caligraph-ontology.nt downloaded and extracted from here.

riot --validate caligraph-ontology.nt results in: ERROR riot :: [line: 4510843, col: 77] Bad character in IRI (space): <http://caligraph.org/ontology/RestrictionHasValue_birthPlace_Trentino-Alto[space]...>

The complete subject of the triple in this line is: <http://caligraph.org/ontology/RestrictionHasValue_birthPlace_Trentino-Alto Adige/S%C3%BCdtirol>.

Blank spaces in IRIs seem to be the only issue in this file.
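For anyone who wants to list every affected IRI rather than stop at the first validator error, a minimal Python sketch could look like the following. It assumes that `<` and `>` only ever delimit IRIs in the file, which is a simplification, since a literal containing angle brackets would be a false positive. The function name `find_iris_with_spaces` and the predicate and object in the sample data are hypothetical.

```python
import re

# An angle-bracketed term containing at least one space.
# Assumption: '<' and '>' only delimit IRIs, never appear inside literals.
IRI_WITH_SPACE = re.compile(r"<[^<> ]* [^<>]*>")


def find_iris_with_spaces(lines):
    """Yield (line_number, iri) for every IRI that contains a space."""
    for lineno, line in enumerate(lines, start=1):
        for match in IRI_WITH_SPACE.finditer(line):
            yield lineno, match.group(0)


# Small in-memory sample; the predicate and object IRIs are made up.
sample = [
    "<http://example.org/a> <http://example.org/p> <http://example.org/b> .",
    "<http://caligraph.org/ontology/RestrictionHasValue_birthPlace_Trentino-Alto Adige/S%C3%BCdtirol> "
    "<http://example.org/p> <http://example.org/b> .",
]

hits = list(find_iris_with_spaces(sample))
for lineno, iri in hits:
    print(lineno, iri)
```

To scan the real file, pass an open file object (for example `open("caligraph-ontology.nt", encoding="utf-8")`) instead of `sample`; the file is then read line by line and never loaded fully into memory.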

nheist commented 1 year ago

Hi Kathrin,

it's really cool to see that you consider using CaLiGraph for your research! Coincidentally, I discovered this same issue yesterday as well and I'm already working on a fix. The problem is that we reuse the entity URIs of DBpedia and they do not seem to care about spaces in URIs. I'll publish an updated version presumably next week. As a quick fix, you could simply replace the spaces in URIs e.g. with underscores. I'll let you know as soon as the new release is published. Don't hesitate to let me know if there is anything else that I can help you with :-)

Cheers, Nico

kathrinrin commented 1 year ago

Hi Nico,

Wow, thank you for the fast response!

I already fixed the file with this command: perl -pe 's/(<[^<> ]* [^<>]*>)/$1=~s| |%20|gr/ge' caligraph-ontology.nt > caligraph-ontology-cleaned.nt.
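A rough Python equivalent of this kind of cleanup, for readers without Perl at hand, might look like the sketch below. It percent-encodes spaces inside angle-bracketed IRIs by default and can substitute underscores instead, as suggested earlier in the thread. The function name `fix_iri_spaces` and its `replacement` parameter are hypothetical, and the sketch again assumes that `<` and `>` only delimit IRIs.

```python
import re

# An angle-bracketed term containing at least one space.
# Assumption: '<' and '>' only delimit IRIs, never appear inside literals.
IRI_WITH_SPACE = re.compile(r"<[^<> ]* [^<>]*>")


def fix_iri_spaces(line, replacement="%20"):
    """Replace every space inside an IRI; text outside IRIs is left untouched."""
    return IRI_WITH_SPACE.sub(
        lambda m: m.group(0).replace(" ", replacement), line
    )


# Made-up triple: spaces in the subject IRI are fixed, the literal is not changed.
line = '<http://example.org/a b c> <http://example.org/p> "lit with space" .'
encoded = fix_iri_spaces(line)
underscored = fix_iri_spaces(line, replacement="_")
print(encoded)
print(underscored)
```

Whether `%20` or `_` is the better replacement depends on which IRIs the cleaned file needs to match in other datasets, so that choice is left as a parameter here.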

Best, Kathrin

nheist commented 1 year ago

Hi Kathrin,

thanks again for letting me know about this issue. It is fixed in the new version 3.1.1 of CaLiGraph.

All the best, Nico

kathrinrin commented 1 year ago

Wow great, thank you so much, Nico!