jimregan / mlode

Automatically exported from code.google.com/p/mlode

Ontos data set not crawlable via Linked Data #22

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
See the script in the repository, which crawls linked data:
http://code.google.com/p/mlode/wiki/DebuggingLinkedData#linked_data_crawler_test

./testLinkedData.sh "http://www.ontosearch.com/2008/01/identification#EID-3b79064eeb9930abe4da398cafc870ef" "http://www.ontosearch.com/2008/01/identification"

retrieves 3 triples only:
_:genid1 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://88.198.67.99:8080/dereferencer/2008/01/rdf#identification> .
_:genid1 <http://purl.org/dc/elements/1.1/publisher> "Ontos AG" .
_:genid1 <http://purl.org/dc/terms/license> "http://creativecommons.org/licenses/by-nc/3.0/" .

./testLinkedData.sh "http://www.ontosearch.com/2008/01/identification%23EID-3b79064eeb9930abe4da398cafc870ef" "http://www.ontosearch.com/2008/01/identification"

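With the # escaped as %23, the first retrieval now works, but crawling further is impossible: the crawler follows URIs that contain an unescaped #, and each of those requests returns only the same 3 triples: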

**************************
retrieving http://www.ontosearch.com/2008/01/identification#EID-32b509444fa7c3dd76ca2da5c223bfd6
**************************
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100   797  100   797    0     0   5641      0 --:--:-- --:--:-- --:--:--  5641
rapper: Parsing URI file:///tmp/test/mlode/unix_debugging_scripts/test.rdf with parser guess
rapper: Serializing with serializer ntriples
rapper: Guessed parser name 'rdfxml'
rapper: Parsing returned 3 triples
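
The behaviour in the log is what one would expect from any HTTP client: the part after # is a fragment and is not sent to the server, so every hash URI in this data set resolves to the same base document. The sketch below illustrates this with Python's standard library, using the URI from the log. The helper name `escape_fragment` is hypothetical and not part of the mlode scripts; it only mirrors the manual %23 workaround used above.

```python
from urllib.parse import urldefrag

uri = ("http://www.ontosearch.com/2008/01/identification"
       "#EID-32b509444fa7c3dd76ca2da5c223bfd6")

# An HTTP client requests only the part before '#'; the fragment stays
# on the client side and never reaches the server.
requested, fragment = urldefrag(uri)


def escape_fragment(uri: str) -> str:
    """Hypothetical helper: percent-encode '#' so the entity ID is sent
    to the server as part of the request path."""
    return uri.replace("#", "%23")


print(requested)             # the document every hash URI resolves to
print(fragment)              # the entity ID the server never sees
print(escape_fragment(uri))  # the form that the first retrieval used
```

A crawler could apply such an escape to every URI it extracts before fetching it, but that only works around the problem on the client; the fix recommended below addresses it on the server.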

recommended fix:
curl -H "Accept: XXX" "http://www.ontosearch.com/2008/01/identification%23EID-3b79064eeb9930abe4da398cafc870ef"

If XXX=application/rdf+xml, redirect with 303 to "http://www.ontosearch.com/2008/01/rdf/EID-3b79064eeb9930abe4da398cafc870ef".
If XXX=text/html, redirect with 303 to "http://www.ontosearch.com/2008/01/identification%23EID-3b79064eeb9930abe4da398cafc870ef".
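
The redirect rule above could be sketched server-side roughly as follows. This is only an illustration of the proposal, not the code Ontos deployed: the function name `redirect_for` and the constant `BASE` are assumptions, q-values in the Accept header are ignored, and media types the issue does not mention return `None` because the issue does not say how they should be handled.

```python
from typing import Optional, Tuple

BASE = "http://www.ontosearch.com/2008/01"  # taken from the URIs in this issue


def redirect_for(entity_id: str, accept: str) -> Optional[Tuple[int, str]]:
    """Hypothetical sketch: map an Accept header to (status, Location)
    for a request to .../identification%23<entity_id>."""
    # Split "a/b;q=0.9, c/d" into bare media types; q-values are ignored.
    media_types = [part.split(";")[0].strip().lower()
                   for part in accept.split(",")]
    if "application/rdf+xml" in media_types:
        return 303, f"{BASE}/rdf/{entity_id}"
    if "text/html" in media_types:
        return 303, f"{BASE}/identification%23{entity_id}"
    return None  # not covered by the recommendation in this issue


print(redirect_for("EID-3b79064eeb9930abe4da398cafc870ef",
                   "application/rdf+xml"))
```

With this rule, a crawler that sends `Accept: application/rdf+xml` is sent to a separate RDF document per entity instead of the shared base document.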

Original issue reported on code.google.com by kur...@googlemail.com on 26 Jul 2012 at 10:00

GoogleCodeExporter commented 9 years ago
What is the current status of this issue? Please update the status.

Original comment by mohamedd...@gmail.com on 18 Sep 2012 at 1:55

GoogleCodeExporter commented 9 years ago
Fixed.

Original comment by christia...@ontos.com on 15 Mar 2013 at 10:16