Open zeginis opened 6 years ago
The system has no knowledge of language: it doesn't know what language is being used in the strings it receives.
This might be a bit tricky to resolve on a per-cell basis as there's not really any way to declare this within a csv file. I suppose we could add a language column for each string-valued column but that doesn't seem very satisfactory (as this is metadata not data).
The csvw:lang property can be applied to a column, table or table-group. We could extend table2qb to accept optional csvw metadata about the input csv files. This would be a reasonably large change.
A simpler change might be to make this an application-level setting, being applied to all string literals sent to csv2rdf.
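To make the application-level option concrete, here is a minimal sketch of tagging every string literal with one configured language. It is written in Python purely for illustration and is not table2qb or csv2rdf code; `APP_LANG` and `tag_literal` are hypothetical names.

```python
from typing import Optional

# Hypothetical application-level setting; None means "emit untagged literals".
APP_LANG: Optional[str] = "en"


def tag_literal(value: str, lang: Optional[str]) -> str:
    """Render a string as a Turtle literal, adding a language tag if one is set."""
    # Escape backslashes and double quotes so the literal stays valid Turtle.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    literal = f'"{escaped}"'
    return f"{literal}@{lang}" if lang else literal


if __name__ == "__main__":
    print(tag_literal("Employment", APP_LANG))
    print(tag_literal("Employment", None))
```

With a single setting like this, every label gets the same tag, which keeps the change small but cannot describe a csv that mixes languages.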
The problem is that the DataSet label (and some other labels) have a language tag, e.g. at https://github.com/Swirrl/table2qb/tree/master/examples/employment/ttl :
<http://statistics.gov.scot/data/employment> a <http://www.w3.org/ns/csvw#Table> ;
<http://www.w3.org/ns/csvw#url> <file:/var/folders/dr/0rl25prn4jqc59p92w22qj6w0000gp/T/component-specifications595063391289823660.csv> ;
dcterms:title "Employment"@en ;
rdfs:label "Employment"@en ;
<http://www.w3.org/ns/csvw#row> _:row183 .
<http://statistics.gov.scot/data/employment> a qb:DataSet .
This causes compatibility issues with CubiQL because we use an application-level setting to define the language. So CubiQL requires:
Ah I see. Indeed the system does know about language insofar as the json-ld statements in the csvw metadata are concerned. Thanks for the clarification.
Some of these strings are set by the incoming csv or pipeline parameters (which could be subject to application-level config) but others - e.g. "Components Ontology" - are hard coded. That would complicate internationalisation (as you'd need to provide translations for all of the internal strings to do this comprehensively). We could read this from a static translations resource.
Perhaps we should just set everything to English for now, then return to a proper internationalisation later. Would that resolve your immediate issue or do you need to use a different language?
@zeginis - Is this still an issue after https://github.com/Swirrl/graphql-qb/issues/112 ? CubiQL falls back to using strings without a language if none is available with the configured language. Or does table2qb generate strings with the @en tag, which could differ from the configured language? In that case we could change the fallback behaviour to be configured language -> en -> no language. Would that work for you?
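For reference, the proposed fallback order (configured language -> en -> no language) could be sketched roughly as below. This is an illustration in Python under assumed data shapes, not CubiQL's actual implementation; `pick_label` and the `(value, lang)` tuple representation are hypothetical.

```python
from typing import Iterable, Optional, Tuple

# A label is modelled as (value, language tag), with None for an untagged string.
Literal = Tuple[str, Optional[str]]


def _norm(lang: Optional[str]) -> Optional[str]:
    """Lower-case a language tag; treat an empty tag as no tag."""
    return lang.lower() if lang else None


def pick_label(literals: Iterable[Literal], configured: str) -> Optional[str]:
    """Pick a label using the order: configured language -> en -> no language."""
    candidates = list(literals)
    for wanted in (_norm(configured), "en", None):
        for value, lang in candidates:
            if _norm(lang) == wanted:
                return value
    # No label matched any step of the fallback chain.
    return None


if __name__ == "__main__":
    labels = [("Employment", "en"), ("Employment (untagged)", None)]
    print(pick_label(labels, "de"))
```

Here a dataset that only has "Employment"@en would still get a label when the configured language is something else, which is the case the extra `en` step is meant to cover.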
@lkitching the language fallback configured language -> no language is ok. I have reported a bug in the CubiQL language fallback at https://github.com/Swirrl/graphql-qb/issues/128
A language tag (e.g. '@en') does not exist at: