Open mgbeyer opened 9 years ago
I can't reproduce the problem. Please provide more information about the imported triples. The fragment identifier should be the last part of a URI (after the filename); the leading slash in your example looks a bit odd.
Thanks for the reply!
I don't know what you mean by "after filename"...what filename? Anyway, here's more detailed information about what we're trying to import (sorry this is a bit lengthy :))
<http://lod.gesis.org/thesoz/classification/0> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2004/02/skos/core#Concept> .
<http://lod.gesis.org/thesoz/classification/0> <http://www.w3.org/2004/02/skos/core#inScheme> <http://lod.gesis.org/thesoz/> .
<http://lod.gesis.org/thesoz/classification/0> <http://www.w3.org/2004/02/skos/core#prefLabel> "Grundlagen der Sozialwissenschaften\u00A00"@de .
<http://lod.gesis.org/thesoz/classification/0> <http://www.w3.org/2004/02/skos/core#prefLabel> "Fundamentals of the Social Sciences\u00A00"@en .
<http://lod.gesis.org/thesoz/classification/0> <http://www.w3.org/2004/02/skos/core#prefLabel> "'fondements des sciences sociales\u00A00"@fr .
<http://lod.gesis.org/thesoz/classification/0> <http://www.w3.org/2004/02/skos/core#notation> "0"^^<http://www.w3.org/2001/XMLSchema#string> .
<http://lod.gesis.org/thesoz/classification/1> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2004/02/skos/core#Concept> .
<http://lod.gesis.org/thesoz/classification/1> <http://www.w3.org/2004/02/skos/core#inScheme> <http://lod.gesis.org/thesoz/> .
<http://lod.gesis.org/thesoz/classification/1> <http://www.w3.org/2004/02/skos/core#prefLabel> "Grundlagen der Sozialwissenschaften\u00A00"@de .
<http://lod.gesis.org/thesoz/classification/1> <http://www.w3.org/2004/02/skos/core#prefLabel> "Fundamentals of the Social Sciences\u00A00"@en .
<http://lod.gesis.org/thesoz/classification/1> <http://www.w3.org/2004/02/skos/core#prefLabel> "'fondements des sciences sociales\u00A00"@fr .
<http://lod.gesis.org/thesoz/classification/1> <http://www.w3.org/2004/02/skos/core#notation> "0"^^<http://www.w3.org/2001/XMLSchema#string> .
We're using NAMESPACE='http://lod.gesis.org/thesoz/' as the default, so the remaining subjects will still contain a slash (like "classification/0"). I'm aware that if we expand the namespace to "http://lod.gesis.org/thesoz/classification/" we're facing subjects, starting with a number, which is also not approved by the importer for reasons unclear (see the validator method in the Origin class, /app/aides/origin.rb). So basically we're talking about this code fragment in the validator method of the Origin class:
# should not start with a number
valid = false if initial_value.match(/^\d.*/)
# should not contain special chars
valid = false if CGI.escape(initial_value) != initial_value
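The effect of these two checks on the subjects above can be reproduced in isolation. The following is only a sketch: `origin_passes_quoted_checks?` is a hypothetical helper that wraps the two quoted lines, and the real validator in iQvoc may apply further rules. Stripping the namespace with `sub` is likewise an assumption about how the importer derives the origin.

```ruby
require 'cgi'

# Hypothetical stand-alone helper mirroring only the two checks quoted above;
# it is not part of iQvoc's API.
def origin_passes_quoted_checks?(initial_value)
  valid = true
  # should not start with a number
  valid = false if initial_value.match(/^\d.*/)
  # should not contain special chars
  valid = false if CGI.escape(initial_value) != initial_value
  valid
end

DEFAULT_NAMESPACE = 'http://lod.gesis.org/thesoz/'
subject = 'http://lod.gesis.org/thesoz/classification/0'

# Assumed behaviour: the importer strips the default namespace from the subject.
local_part = subject.sub(DEFAULT_NAMESPACE, '')

puts local_part                                  # classification/0
puts CGI.escape(local_part)                      # classification%2F0
puts origin_passes_quoted_checks?(local_part)    # false (slash gets escaped)
puts origin_passes_quoted_checks?('0')           # false (leading digit)
puts origin_passes_quoted_checks?('classification_0') # true
```

With the wider namespace "http://lod.gesis.org/thesoz/classification/" the remaining part is just "0", which fails the first check instead of the second.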
I, [2015-07-16T11:44:58.282643 #14596] INFO -- : Known namespaces:
I, [2015-07-16T11:44:58.282643 #14596] INFO -- : 1: skos: => http://www.w3.org/2004/02/skos/core#
I, [2015-07-16T11:44:58.282643 #14596] INFO -- : 2: skos: => http://www.w3.org/2008/05/skos#
I, [2015-07-16T11:44:58.282643 #14596] INFO -- : 3: rdf: => http://www.w3.org/1999/02/22-rdf-syntax-ns#
I, [2015-07-16T11:44:58.282643 #14596] INFO -- : 4: : => http://lod.gesis.org/thesoz/
I, [2015-07-16T11:44:58.282643 #14596] INFO -- : 5: rdfs: => http://www.w3.org/2000/01/rdf-schema#
I, [2015-07-16T11:44:58.282643 #14596] INFO -- : 6: owl: => http://www.w3.org/2002/07/owl#
I, [2015-07-16T11:44:58.282643 #14596] INFO -- : 7: dct: => http://purl.org/dc/terms/
I, [2015-07-16T11:44:58.282643 #14596] INFO -- : 8: foaf: => http://xmlns.com/foaf/spec/
I, [2015-07-16T11:44:58.282643 #14596] INFO -- : 9: void: => http://rdfs.org/ns/void#
I, [2015-07-16T11:44:58.282643 #14596] INFO -- : 10: iqvoc: => http://try.iqvoc.net/schema#
I, [2015-07-16T11:44:58.282643 #14596] INFO -- : Known first level classes:
I, [2015-07-16T11:44:58.282643 #14596] INFO -- : 1: skos:Concept => Concept::SKOS::Base
I, [2015-07-16T11:44:58.282643 #14596] INFO -- : 2: skos:Collection => Collection::SKOS::Unordered
I, [2015-07-16T11:44:58.282643 #14596] INFO -- : Known second level classes:
I, [2015-07-16T11:44:58.282643 #14596] INFO -- : 1: skos:prefLabel => Labeling::SKOS::PrefLabel
I, [2015-07-16T11:44:58.282643 #14596] INFO -- : 2: skos:altLabel => Labeling::SKOS::AltLabel
I, [2015-07-16T11:44:58.282643 #14596] INFO -- : 3: skos:changeNote => Note::SKOS::ChangeNote
I, [2015-07-16T11:44:58.282643 #14596] INFO -- : 4: skos:definition => Note::SKOS::Definition
I, [2015-07-16T11:44:58.282643 #14596] INFO -- : 5: skos:editorialNote => Note::SKOS::EditorialNote
I, [2015-07-16T11:44:58.282643 #14596] INFO -- : 6: skos:example => Note::SKOS::Example
I, [2015-07-16T11:44:58.282643 #14596] INFO -- : 7: skos:historyNote => Note::SKOS::HistoryNote
I, [2015-07-16T11:44:58.282643 #14596] INFO -- : 8: skos:scopeNote => Note::SKOS::ScopeNote
I, [2015-07-16T11:44:58.282643 #14596] INFO -- : 9: skos:related => Concept::Relation::SKOS::Related
I, [2015-07-16T11:44:58.282643 #14596] INFO -- : 10: skos:broader => Concept::Relation::SKOS::Broader::Mono
I, [2015-07-16T11:44:58.282643 #14596] INFO -- : 11: skos:narrower => Concept::Relation::SKOS::Narrower::Base
I, [2015-07-16T11:44:58.282643 #14596] INFO -- : 12: skos:closeMatch => Match::SKOS::CloseMatch
I, [2015-07-16T11:44:58.282643 #14596] INFO -- : 13: skos:exactMatch => Match::SKOS::ExactMatch
I, [2015-07-16T11:44:58.282643 #14596] INFO -- : 14: skos:relatedMatch => Match::SKOS::RelatedMatch
I, [2015-07-16T11:44:58.282643 #14596] INFO -- : 15: skos:broadMatch => Match::SKOS::BroadMatch
I, [2015-07-16T11:44:58.282643 #14596] INFO -- : 16: skos:narrowMatch => Match::SKOS::NarrowMatch
I, [2015-07-16T11:44:58.282643 #14596] INFO -- : 17: skos:notation => Notation::Base
I, [2015-07-16T11:44:58.282643 #14596] INFO -- : 18: skos:topConceptOf => Concept::SKOS::Scheme
I, [2015-07-16T11:44:58.282643 #14596] INFO -- : 19: skos:member => Collection::Member::SKOS::Base
I, [2015-07-16T11:44:58.282643 #14596] INFO -- : default namespace: 'http://lod.gesis.org/thesoz/'
I, [2015-07-16T11:44:58.282643 #14596] INFO -- : publish: 'true'
I, [2015-07-16T11:44:58.282643 #14596] INFO -- : SkosImporter: Importing triples...
W, [2015-07-16T11:44:58.292643 #14596] WARN -- : SkosImporter: Invalid origin. Skipping :classification/0 rdf:type skos:Concept
W, [2015-07-16T11:44:58.292643 #14596] WARN -- : SkosImporter: Invalid origin. Skipping :classification/0 skos:inScheme :
W, [2015-07-16T11:44:58.292643 #14596] WARN -- : SkosImporter: Invalid origin. Skipping :classification/0 skos:prefLabel "Grundlagen der Sozialwissenschaften\u00A00"@de
W, [2015-07-16T11:44:58.292643 #14596] WARN -- : SkosImporter: Invalid origin. Skipping :classification/0 skos:prefLabel "Fundamentals of the Social Sciences\u00A00"@en
W, [2015-07-16T11:44:58.292643 #14596] WARN -- : SkosImporter: Invalid origin. Skipping :classification/0 skos:prefLabel "'fondements des sciences sociales\u00A00"@fr
W, [2015-07-16T11:44:58.292643 #14596] WARN -- : SkosImporter: Invalid origin. Skipping :classification/0 skos:notation "0"^^<http://www.w3.org/2001/XMLSchema#string>
W, [2015-07-16T11:44:58.292643 #14596] WARN -- : SkosImporter: Invalid origin. Skipping :classification/1 rdf:type skos:Concept
W, [2015-07-16T11:44:58.292643 #14596] WARN -- : SkosImporter: Invalid origin. Skipping :classification/1 skos:inScheme :
W, [2015-07-16T11:44:58.302643 #14596] WARN -- : SkosImporter: Invalid origin. Skipping :classification/1 skos:prefLabel "Grundlagen der Sozialwissenschaften\u00A00"@de
W, [2015-07-16T11:44:58.302643 #14596] WARN -- : SkosImporter: Invalid origin. Skipping :classification/1 skos:prefLabel "Fundamentals of the Social Sciences\u00A00"@en
W, [2015-07-16T11:44:58.302643 #14596] WARN -- : SkosImporter: Invalid origin. Skipping :classification/1 skos:prefLabel "'fondements des sciences sociales\u00A00"@fr
W, [2015-07-16T11:44:58.302643 #14596] WARN -- : SkosImporter: Invalid origin. Skipping :classification/1 skos:notation "0"^^<http://www.w3.org/2001/XMLSchema#string>
I, [2015-07-16T11:44:58.302643 #14596] INFO -- : Computing 'forward' defined triples...
I, [2015-07-16T11:44:58.302643 #14596] INFO -- : Basic import done (took 0 seconds).
I, [2015-07-16T11:44:58.302643 #14596] INFO -- : Publishing 0 new subjects...
I, [2015-07-16T11:44:58.302643 #14596] INFO -- : Publishing of 0 subjects done (took 0 seconds). 0 are in draft state.
I, [2015-07-16T11:44:58.302643 #14596] INFO -- : Imported 0 published and 0 draft subjects in 0 seconds.
I, [2015-07-16T11:44:58.302643 #14596] INFO -- : First step took 0 seconds, publishing took 0 seconds.
As I said: lengthy as hell, sorry :-) But I guess it'll help to clarify the problem...
Thanks. I updated your comment with some formatting options. I'll check that.
BTW, regarding
"...we're facing subjects, starting with a number, which is also not approved by the importer for reasons unclear..."
Origins should not start with a number so that iQvoc is able to generate a valid RDF/XML serialization. See the RDF syntax grammar for details.
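The restriction presumably comes from XML naming rules: wherever RDF/XML needs an XML name (for example an `rdf:ID` value or the local part of an element name), that name must be an NCName, and an NCName cannot begin with a digit. The sketch below uses an ASCII-only approximation of that rule. `NCNAME_ASCII` and `ncname_like?` are hypothetical names for illustration, and the full XML rule also admits many non-ASCII letters that this pattern ignores.

```ruby
# ASCII-only approximation of the XML NCName rule (assumption: the real rule
# also allows a wide range of non-ASCII name characters).
# First character: letter or underscore; then letters, digits, '.', '-', '_'.
NCNAME_ASCII = /\A[A-Za-z_][A-Za-z0-9._-]*\z/

# Hypothetical helper, not part of iQvoc.
def ncname_like?(candidate)
  NCNAME_ASCII.match?(candidate)
end

puts ncname_like?('0')                 # false: starts with a digit
puts ncname_like?('_0')                # true: underscore prefix is allowed
puts ncname_like?('classification_0')  # true
puts ncname_like?('classification/0')  # false: '/' is not a name character
```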
If the subject part of an N-Triples line contains characters like a slash (/) or a hash (#), the importer rejects the triple (example: "WARN -- : SkosImporter: Invalid origin. Skipping :concept/#Abbreviations rdf:type skos:concept"). But characters like / or # are normal parts of a URI. For example, one of the thesauri we'd like to import into iQvoc contains multiple levels beyond the context path set by the default namespace, to distinguish between actual concepts and personal classes and properties (among others). If you then strip the leading default namespace from a subject string (as the importer does), the remaining part of the URI still contains slashes and is rejected by the importer.
Generally, a URI should be allowed to contain UTF-8-conformant special characters to support regional character sets. So I wonder why the importer actively rejects characters beyond the minimal set of "a-zA-Z0-9_.-". Was this a deliberate design decision with a sound purpose, and am I missing a point here? If you could elaborate on that a little, I would greatly appreciate it.
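To illustrate which characters trip the importer, here is a small sketch that runs a few candidate origins through Ruby's `CGI.escape`, the comparison used in the validator. `unchanged_by_escape?` is a hypothetical helper, and "Sozial\u00F6kologie" is a made-up sample with an umlaut rather than a term taken from the thesaurus.

```ruby
require 'cgi'

# Hypothetical helper: true when CGI.escape leaves the string untouched,
# which is the condition the quoted validator line relies on.
def unchanged_by_escape?(candidate)
  CGI.escape(candidate) == candidate
end

samples = [
  'Abbreviations',            # plain ASCII letters
  'concept_1.2-a',            # only characters from the minimal set
  'concept/#Abbreviations',   # origin from the warning quoted above
  "Sozial\u00F6kologie"       # made-up example with a non-ASCII character
]

samples.each do |candidate|
  puts "#{candidate} -> #{CGI.escape(candidate)} (unchanged: #{unchanged_by_escape?(candidate)})"
end
```

Both the slash/hash case and the non-ASCII case are percent-encoded by `CGI.escape`, so either one makes the comparison fail and the origin is reported as invalid.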