jimregan / mlode

Automatically exported from code.google.com/p/mlode

Some Dataset-Glottolog URIs contain spaces #20

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?

1. First, download and unpack the data set:
wget http://www.glottolog.org/downloadarea/languoids.rdf.zip
unzip languoids.rdf.zip

2. Download and install the VRP validation tool from:
file:///home/sherif/GlottologLangdoc/vrp3.0/HowToUse.html
For more details, see the wiki:
http://code.google.com/p/mlode/wiki/DebuggingLinkedData

3. Run the tool against the data set using:
java -jar vrp3.0.jar
A GUI opens where you can enter the input and output files; select the "Triple" radio button.

What is the expected output? What do you see instead?

Error Example:

Semantic/Syntax error2012Not a valid URI!
Error called by: CUP$parser$actions.Invalid value for attribute 
http://www.w3.org/1999/02/22-rdf-syntax-ns#resource: 
http://en.wikipedia.org/wiki/Kuman (Russia) 
parser.URI_check
The URI http://en.wikipedia.org/wiki/Kuman (Russia)
Must be http://en.wikipedia.org/wiki/Kuman%20(Russia)
with the space replaced by %20
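
To make the error above concrete, here is a minimal Python sketch of the same kind of check. It is not how VRP works internally; it only assumes that the offending values appear as rdf:resource="..." attributes in the RDF/XML dump. The rdfs:seeAlso property and the example.org URI in the sample are made up for illustration.

```python
import re

# Assumption: offending URIs appear as rdf:resource="..." attribute values.
RESOURCE_ATTR = re.compile(r'rdf:resource="([^"]*)"')

def uris_with_spaces(rdf_xml):
    """Return every rdf:resource value that contains a literal space."""
    return [uri for uri in RESOURCE_ATTR.findall(rdf_xml) if " " in uri]

def percent_encode_spaces(uri):
    """Replace each space with %20, as the validator message suggests."""
    return uri.replace(" ", "%20")

# Hypothetical two-line sample; only the first URI comes from the error log.
sample = (
    '<rdfs:seeAlso rdf:resource="http://en.wikipedia.org/wiki/Kuman (Russia)"/>\n'
    '<rdfs:seeAlso rdf:resource="http://example.org/no-spaces"/>'
)

for uri in uris_with_spaces(sample):
    print(uri, "->", percent_encode_spaces(uri))
```

A regular expression is enough for a quick count; a real fix would more safely go through an RDF/XML parser.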

Error Description:

How many triples are affected? (if less than 3-5% of the whole data set,
please set priority to _low_)

There are 2161 such errors out of 9346647 triples in total (0.023% of all 
triples are affected by this error type).
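
The percentage can be rechecked with a few lines of Python. The 3% cutoff below is an assumption taken from the lower end of the "3-5%" range in the question above, and LOW_PRIORITY_THRESHOLD is just an illustrative name.

```python
affected = 2161       # errors of this type reported by the validator
total = 9346647       # triples in the whole data set

share = affected / total * 100   # share of affected triples, in percent
print(f"{share:.3f}%")

# Assumed cutoff: lower end of the "3-5%" range from the issue template.
LOW_PRIORITY_THRESHOLD = 3.0
priority = "low" if share < LOW_PRIORITY_THRESHOLD else "not low"
print(priority)
```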

The whole log file is attached. 

Original issue reported on code.google.com by mohamedd...@gmail.com on 23 Jul 2012 at 2:52


GoogleCodeExporter commented 9 years ago
Replaced all spaces with underscores in the XHTML and RDF. The dump is not updated yet.

Underscores seem to be more consistent with general Wikipedia practice than %20.
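
For anyone who wants to apply the same replacement to a local copy of the dump, a rough Python sketch follows. It is not the script used for Glottolog; it assumes the URIs sit in rdf:resource="..." attributes, and the rdfs:seeAlso property in the sample is only illustrative.

```python
import re

# Assumption: URIs to fix are rdf:resource="..." attribute values.
RESOURCE_ATTR = re.compile(r'(rdf:resource=")([^"]*)(")')

def underscore_resource_uris(rdf_xml):
    """Replace spaces with underscores inside rdf:resource values only."""
    def fix(match):
        prefix, uri, suffix = match.groups()
        return prefix + uri.replace(" ", "_") + suffix
    return RESOURCE_ATTR.sub(fix, rdf_xml)

# Hypothetical sample line built around the URI from the error report.
sample = '<rdfs:seeAlso rdf:resource="http://en.wikipedia.org/wiki/Kuman (Russia)"/>'
fixed = underscore_resource_uris(sample)
print(fixed)
```

Text outside the attribute values is left untouched, so spaces in literals and labels survive.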

Original comment by sebastia...@googlemail.com on 23 Jul 2012 at 3:42

GoogleCodeExporter commented 9 years ago
Please get back to us when there is a new dump, so we can verify whether the 
whitespace has disappeared.

Original comment by kur...@googlemail.com on 25 Jul 2012 at 1:15

GoogleCodeExporter commented 9 years ago
Please verify.

Original comment by sebastia...@googlemail.com on 23 Aug 2012 at 1:56

GoogleCodeExporter commented 9 years ago
Validating against http://www.glottolog.org/downloadarea/languoids.rdf.zip 
indicates the same problem.

Is there a new dump to validate against?

Original comment by mohamedd...@gmail.com on 23 Aug 2012 at 2:31

GoogleCodeExporter commented 9 years ago

Original comment by kur...@googlemail.com on 25 Aug 2012 at 7:44

GoogleCodeExporter commented 9 years ago
http://www.glottolog.org/downloadarea/languoids.n3.tgz

Original comment by sebastia...@googlemail.com on 27 Aug 2012 at 9:16

GoogleCodeExporter commented 9 years ago

Original comment by sebastia...@googlemail.com on 27 Aug 2012 at 11:14