Open candlecao opened 2 months ago
I don't quite agree with this rendering because: (1) We cannot guarantee that all of these strings are actually in English. (2) It adds a burden on LLM2SPARQL and makes its output less accurate. (3) We can treat English as the default language so that there is no need to specify it; for other languages, we can add tags such as @zh for Chinese, @fr for French...
@fujinaga Hi, Ich, do you agree?
@fujinaga
There should always be a language tag in every string. We can always instruct ChatGPT to append the language tags in SPARQL queries.
Ok. Does this mean that I have to automatically detect the language of every string in my script?
No. For each database we import, we should know which language it's in. For now you can default always to @en. If we are storing chant text from CantusDB, that would be in Latin.
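A minimal sketch of what "one language per imported database" could look like in an import script. Only the CantusDB/Latin pairing and the @en default come from this thread; the function names, the mapping table, and the escaping rules chosen here are assumptions for illustration, not the project's actual script.

```python
# Hypothetical per-source language table. Only "CantusDB" -> Latin is
# taken from the thread; add other sources as their language is confirmed.
SOURCE_LANGUAGE = {
    "CantusDB": "la",  # chant text is in Latin
}
DEFAULT_LANGUAGE = "en"  # the thread's suggested default for now


def escape_literal(text: str) -> str:
    """Escape backslashes, double quotes and line breaks for a quoted literal."""
    return (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def tagged_literal(text: str, source: str) -> str:
    """Return the string as a language-tagged literal, e.g. "Ave Maria"@la."""
    lang = SOURCE_LANGUAGE.get(source, DEFAULT_LANGUAGE)
    return f'"{escape_literal(text)}"@{lang}'


# The tag is decided once per source, not detected per string.
latin = tagged_literal("Ave Maria", "CantusDB")
english = tagged_literal("Hurley’s Irish Pub", "SomeOtherDB")
```

The point of the design is that no per-string language detection is needed: the tag follows from where the data came from.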
There are several codes that you can use for non-coded languages:
Type: script
Subtag: Zyyy
Description: Code for undetermined script
Added: 2005-10-16
%%
Type: script
Subtag: Zzzz
Description: Code for uncoded script
Added: 2005-10-16
%%
Type: language
Subtag: und
Description: Undetermined
Added: 2005-10-16
Scope: special
https://www.iana.org/assignments/language-subtag-registry/language-subtag-registry
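In case it helps to look these subtags up from a script: the records above are "Field: value" lines separated by `%%`. Here is a rough sketch of a parser for that layout. It is simplified on purpose: it assumes one line per field and does not handle folded continuation lines or repeated fields, which the full registry file may contain. `parse_records` and `SAMPLE` are names made up for this example.

```python
# The three records quoted above, copied as plain text.
SAMPLE = """\
Type: script
Subtag: Zyyy
Description: Code for undetermined script
Added: 2005-10-16
%%
Type: script
Subtag: Zzzz
Description: Code for uncoded script
Added: 2005-10-16
%%
Type: language
Subtag: und
Description: Undetermined
Added: 2005-10-16
Scope: special
"""


def parse_records(text: str) -> list[dict[str, str]]:
    """Split "%%"-separated records into dicts of field name -> value."""
    records: list[dict[str, str]] = []
    current: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if line == "%%":
            # Record separator: store what we have and start a new record.
            if current:
                records.append(current)
            current = {}
        elif ":" in line:
            # Split on the first colon only, so values may contain colons.
            name, _, value = line.partition(":")
            current[name.strip()] = value.strip()
    if current:
        records.append(current)
    return records


records = parse_records(SAMPLE)
undetermined = [r["Subtag"] for r in records if r.get("Description") == "Undetermined"]
```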
Note: "und should not be used unless a language tag is required and language information is not available or cannot be determined. Omitting the language tag (where permitted) is preferred. This subtag may also be useful when matching language tags in certain situations. Where xml:lang="" is allowed by the markup, it is better to use that rather than und"
From a search for "und" here: https://r12a.github.io/app-subtags/
See: https://www.w3.org/International/questions/qa-no-language#undetermined
Thank you, @ahankinson. Could you please give me some concrete examples, with explanation, that use these tags in RDF?
> Could you please give me some concrete examples, with explanation, that use these tags in RDF?
No, because you can use Google as well as I can. :-)
For example, you cannot find the session named "Hurley’s Irish Pub" by matching the plain, untagged literal "Hurley’s Irish Pub".
But the match succeeds once you add "@en":
?session wdt:P2561 "Hurley’s Irish Pub"@en .
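As far as I understand it, a literal in a triple pattern matches only the identical RDF term, and in RDF 1.1 a plain string and a language-tagged string are different terms (datatype xsd:string versus rdf:langString, plus the tag itself). A toy model of that comparison, assuming the stored value carries @en; the `Literal` class below is made up for illustration and is not any library's API:

```python
from dataclasses import dataclass
from typing import Optional

XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"
RDF_LANGSTRING = "http://www.w3.org/1999/02/22-rdf-syntax-ns#langString"


@dataclass(frozen=True)
class Literal:
    """Toy RDF literal: a lexical form plus an optional language tag."""

    lexical: str
    lang: Optional[str] = None

    @property
    def datatype(self) -> str:
        # A language-tagged string has datatype rdf:langString;
        # a plain string has datatype xsd:string.
        return RDF_LANGSTRING if self.lang else XSD_STRING

    def same_term(self, other: "Literal") -> bool:
        # Lexical form, datatype and language tag must all agree.
        return (
            self.lexical == other.lexical
            and self.datatype == other.datatype
            and self.lang == other.lang
        )


stored = Literal("Hurley’s Irish Pub", "en")  # assumed form in the graph
plain_query = Literal("Hurley’s Irish Pub")   # query literal without a tag
tagged_query = Literal("Hurley’s Irish Pub", "en")

plain_matches = stored.same_term(plain_query)    # False: tag and datatype differ
tagged_matches = stored.same_term(tagged_query)  # True: identical term
```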
This is because of the modification: