When I ran the download script for geonames, the rapper tool reported two errors:
rapper: Error - URI file:1 - Using an element 'Feature' without a namespace is forbidden.
rapper: Error - - XML parser error: Opening and ending tag mismatch: gn:fs:long line 0 and wgs84_pos:long
The process was also quite slow: after about an hour, less than 5% of the final triple count had been produced, on a server with good capacity and little load.
I used a combination of sed and perl to merge the numerous individual RDF/XML snippets into a single big document, which was much faster and yielded no errors during the subsequent conversion to ntriples with rapper:
$1 - extracted geonames dump file all-geonames-rdf.txt
$2 - destination for the big RDF/XML file
perl is usually available by default on all Linux boxes, but in principle it could also be replaced with awk to reduce dependencies.
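
For illustration, here is a minimal sketch of such a merge that uses sed and awk instead of perl. It is not the script described above. It assumes that the dump alternates a URL line with a one-line RDF/XML document, that every document declares the same namespaces on its rdf:RDF root element, and that the data is UTF-8. The function name merge_geonames_rdf is made up for this example.

```shell
# Hypothetical helper, not the original script.
# $1 - extracted geonames dump file (assumed: URL line, then one-line RDF/XML document)
# $2 - destination for the merged RDF/XML file
merge_geonames_rdf() {
    src=$1
    dst=$2
    {
        # One XML declaration for the whole document (UTF-8 is an assumption).
        printf '%s\n' '<?xml version="1.0" encoding="UTF-8"?>'
        # Reuse the opening rdf:RDF tag of the first record, including its
        # namespace declarations, as the single root element.
        sed -n '2{s/.*\(<rdf:RDF[^>]*>\).*/\1/p;q;}' "$src"
        # Keep only the even-numbered (XML) lines and strip the per-record
        # XML declaration and rdf:RDF wrapper, leaving the inner elements.
        awk 'NR % 2 == 0 {
            sub(/<\?xml[^>]*\?>/, "")
            sub(/<rdf:RDF[^>]*>/, "")
            sub(/<\/rdf:RDF>/, "")
            print
        }' "$src"
        # Close the single root element.
        printf '%s\n' '</rdf:RDF>'
    } > "$dst"
}
```

It could be called as `merge_geonames_rdf all-geonames-rdf.txt geonames-all.rdf`, followed by `rapper -i rdfxml -o ntriples geonames-all.rdf > geonames-all.nt` for the conversion. The output file names here are placeholders.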