Closed xristy closed 6 years ago
Hi Chris, You must use ?L_TITLE?L_TITLELANG with (for instance) L_TITLELANG=@bo-x -ewts the @ parameter cannot be part of the template because it is considered as a Threat by the InjectionTracker. Marc
Le jeudi 01 février 2018 à 07:52 -0800, Chris Tomlinson a écrit :
is the convention you've used like:
?L_TITLE?L_TITLELANG
necessary or if I want to, can I write:
?L_TITLE@?L_TITLELANG
?
— You are receiving this because you were assigned. Reply to this email directly, view it on GitHub, or mute the thread.
{"api_version":"1.0","publisher":{"api_key":"05dde50f1d1a384dd78767c5 5493e4bb","name":"GitHub"},"entity":{"external_key":"github/BuddhistD igitalResourceCenter/lds- pdi","title":"BuddhistDigitalResourceCenter/lds- pdi","subtitle":"GitHub repository","main_image_url":"https://cloud.g ithubusercontent.com/assets/143418/17495839/a5054eac-5d88-11e6-95fc- 7290892c7bb5.png","avatar_image_url":"https://cloud.githubusercontent .com/assets/143418/15842166/7c72db34-2c0b-11e6-9aed- b52498112777.png","action":{"name":"Open in GitHub","url":"https://github.com/BuddhistDigitalResourceCenter/lds- pdi"}},"updates":{"snippets":[{"icon":"DESCRIPTION","message":"is the \"@\" convention in lang tag necessary? (#16)"}],"action":{"name":"View Issue","url":"https://github.com/BuddhistDigitalResourceCenter/lds- pdi/issues/16"}}}
I guess it would be good to clarify the injection threat. Links to discussion?
Maybe we could have ?T_TITLE
with the meaning that a valid language tag must be provided and permit writing
?L_TITLE@?T_TITLE
Rather than another issue, is the use of all caps for template variables part of the code or is the set of template variables determined by:
#QueryParams=
so that one might write:
#QueryParams=l_name,l_lang,i_lim
Could you tell me more about the issue coming from using ?L_TITLE?L_TITLELANG ?
Regarding @ viewed as an Injection threat, it is the way the ParameterizedSparqlString class works (it just refuses @ in the query (it might even be sparql itself since it uses the lang function - I have to check). (http://jena.apache.org/documentation/javadoc/arq/org/apache/jena/query/ParameterizedSparqlString.html).
Well, I see that currently we're not using the lang tags in a proper way. It seems to me that having the following mechanism would be more logical:
?ls_foo
is a lang string (and not just any literal like l_foo
)setLiteral(int index, String value, String lang)
(with the final lang argument)?ls_foo
we would have s_foo=xxx<_foo=yyy
where xxx
is the string and yyy
is the associated lang tagwdyt?
Also a less technical concern is that upper case letters are perceived as ugly, but that's more a subjective matter.
The user documentation for Parameterized SPARQL String is also useful.
It appears that the answer to my question about whether we could use lower-case versus upper case param names should be YES since those names are under our control and not required by ParameterizedSparqlString
. You're using the L_
, I_
and R_
as conventions to determine when to use setLiteral
and so on - which seems fine and using a case-insensitive test would be easy.
As Élie points out the handling of language tags may not be quite correct, although as I understand the PSS, it doesn't check validity of the string resulting from substitution until asQuery
is invoked. If this is the case then it seems that @
should not be an issue in the string!? The "@" is apparently not being interpreted until asQuery
(otherwise I don't see how current queries are working).
Élie's proposal is sound in terms of permitting some validation of language tags which seems beneficial.
setLiteral(String var, String value, String lang)
would presumably be the form to use, rather than the positional version.
I assume that the
#QueryParams=....
is used to validate that the params supplied in a request are the ones the template is designed to accept.
Right, #QueryParams is used for validation against templated request. Obviously there is no restriction regarding params case: I suggested upper case for a better readability of the templates : that matters quite a bit when you are designing query templates. I agree we are not correctly setting up languages in templates. I am going to implement your solution and I hope jena;text will be fine with it. I am not sure that setLiteral(?S_NAME, "whatever", LT_NAME) applied to text:query ( ?S_NAME), will produce the expected jena:text/lucene syntax. I must test that first
Regarding the BCP-47 thing I will use Locale (https://docs.oracle.com/javase/8/docs/api/java/util/Locale.html)
Just tested the produced query syntax and everything's fine.
setLiteral(?S_NAME, "whatever", LT_NAME)
will in fact work very well with jena:text. There is code in the interface to pay attention to rdf langString literals.
Regarding the BCP-47 there is a document that Élie has written that lays out the BUDA conventions for using BCP-47.
Obviously there is no restriction regarding params case.
So you will be changing the tests to allow arbitrary case?
I would prefer to keep everything uppercase (even though you can have lower case params names with uppercase prefix). The very practical reason to that is readability (in particular when I look for params into a big non pretty json string, either in the browser or in the console - add &jsonOut=true to the url of any request of the index page and you'll understand my point). Moreover, once all the demo stuff will be gone, replaced by BLMP, we will be the only ones to see these strings, most likely for debugging or new development.
About the test for bcp-47 I think it makes sense to keep things generic and use the Locale method, if users are looking for strings in Klingon they won't find any but that shouldn't bother the sparql query engine...
You may keep what you write in upper case of course; however, the template author should have the freedom to choose. The editing/authoring facility will still present a query with template variables to the author and it is a matter-of-taste as to whether the all uppercase. mixed case, or lower case work for the person authoring the template. The underlying tool should be flexible rather than rigid.
How will the Locale
class be used as far as user input of language tags? If tags are being validated then won't it be helpful to have at least a list of BUDA private-use-case extensions?
The Locale
class can validate if a lang tag is a real lang tag (and not some SPARQL injection or whatever), that's all lds should do IMO. If we have a list of lang tags inside the lds, that means redeploying each time we change or add a language, or having a complex mechanism for that, I don't think it's necessary... in the UI, of course, users will choose in a predefined list, but that would be on another level... wdyt?
I used the Locale.Builder class for BCP 47 validation in InjectionTracker. Regarding lang settings in query templates, the convention is as follows: A langage specific request will use L prefixed params for the string to be searched and LG prefix to indicate the associated lang.
Ex: L_NAME="chos rgyal" LG_NAME=bo-x-ewts
or simply L_NAME="'od zer" with no corresponding LG_NAME param if the language is unspecified.
is the convention you've used like:
necessary or if I want to, can I write:
?