Open jbuerklin opened 4 years ago
Thank you very much for this pull request. I have played around with it quite a bit now. Overall, it works! I have found a few glitches, which were already in the previous code. See the respective (minor) issues.
Here are a few more requests, which are important for using this in practice. They should be relatively simple to implement:
Please fix the %CONNECTED_LINES% and rename it to %CONNECTED_TRIPLES%. When you consider the query body as an undirected graph (where each triple is a node and two triples are connected if they share a variable), then %CONNECTED_TRIPLES% is the connected component containing the current triple. This is not a feature request but a bug fix, see also #16
You currently support only one # IF # condition, namely "CURRENT_WORD". First, please invert its meaning and rename it to CURRENT_WORD_EMPTY. Second, please add the following conditions. They are necessary to capture the complexity of our latest templates for Wikidata:
1.1 CURRENT_SUBJECT_VARIABLE ... true if CURRENT_SUBJECT is a variable
1.2 CURRENT_PREDICATE_VARIABLE ... true if CURRENT_PREDICATE is a variable
1.3 CONNECTED_LINES_EMPTY ... true if %CONNECTED_LINES% is empty, that is, if the connected component of the query graph containing the current triple consists only of that triple
1.4 CONNECTED_LINES_EMPTY_AND_CURRENT_SUBJECT_VARIABLE ... just the logical and of the two respective conditions; of course this would not be needed if arbitrary logical expressions with these conditions were possible, but I think that would be overkill for now. But if you feel that it's relatively easy, go ahead.
Third, for all of these, please also support the negated version. As a syntax, I would suggest !CURRENT_WORD_EMPTY, !CURRENT_SUBJECT_VARIABLE, !CURRENT_PREDICATE_VARIABLE, !CONNECTED_LINES_EMPTY. Given the next item, this is not strictly necessary, but would be very convenient. Again, this would not be needed if arbitrary logical expressions were possible, see the comment in the previous paragraph.
Please also add an # ELSE # construct, so that one can write something like
# IF CURRENT_WORD_EMPTY # ... # ELSE # ... # ENDIF #
Does nesting of these work?
One more small request:
@jbuerklin Not sure whether you got these comments, so trying again with an @ tag
I got all of them, thank you. I'm a bit busy this week but I will get to it tomorrow.
python manage.py migrate
! works for all of them. Adding support for AND and OR should be relatively simple now. # ELSE # shouldn't be too hard as well.
I should have written #18 here instead of opening a separate issue. It's in the same spirit as the new field for suggestions that are always shown, and the checkbox should probably be above that field.
Fixed #18 both in master and in this branch. Needs python manage.py migrate
Implemented AND and OR, where AND binds stronger than OR
Added # ELSE # construct and renamed CONNECTED_LINES_EMPTY to CONNECTED_TRIPLES_EMPTY
Also updated my first post with information about IF / ELSE statements, conditions and logical expressions.
Thanks a lot, it's working great so far! Here is another minor bug:
In line 349 of backend/static/js/codemirror/modes/sparql/sparql-hint.js, prefixes in word (which is the %CURRENT_WORD% for the templates) are expanded to URL prefixes. However, these URL prefixes typically contain dots. The typical use of %CURRENT_WORD% is in a prefix regex filter such as FILTER regex(?variable, "^%CURRENT_WORD%"). When %CURRENT_WORD% contains dots, these dots have a special meaning in the regex (they match any character, not only a dot). The matches are usually the same, but the unescaped dots prevent QLever from using binary search for the prefix search.
Long story short: the . should be escaped. One simple fix that worked for me was to replace line 349 in backend/static/js/codemirror/modes/sparql/sparql-hint.js as follows. But maybe you have a more principled fix. Note the double escaping of the \.
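To illustrate the idea independently of the actual line in sparql-hint.js (which is JavaScript), here is a minimal Python sketch. The function names are hypothetical, and the doubling of the backslash assumes that the escaped word ends up inside a SPARQL string literal, where a backslash must itself be escaped.

```python
def escape_dots_for_sparql_regex(word):
    """Escape every dot so the regex engine matches it literally.

    The regex needs "\\." (one backslash), and the backslash is doubled
    because it sits inside a SPARQL string literal (assumption).
    """
    return word.replace(".", "\\\\.")


def prefix_filter(variable, current_word):
    """Build a prefix regex filter like the one used in the templates."""
    return f'FILTER regex({variable}, "^{escape_dots_for_sparql_regex(current_word)}")'


print(prefix_filter("?variable", "<http://www.wikidata.org/"))
```

Only the dot is handled here, since that is the character discussed above. Other regex metacharacters would need the same treatment in a more principled fix.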
Here is another minor issue:
The previous version of the UI did not make suggestions when there were no connected triples and no character has been typed yet. In particular, this is the situation at the very beginning of every SPARQL query body.
The current version shows suggestions in this situation, but they don't make too much sense.
I would suggest to add a checkbox to configure whether one wants suggestions in this situation (if yes, one can control which ones via the # IF ... # directives and with the new variables) or not.
Added escaping for %CURRENT_WORD%
> The previous version of the UI did not make suggestions when there were no connected triples and no character has been typed yet. In particular, this is the situation at the very beginning of every SPARQL query body.
I'll look into that later
> I would suggest to add a checkbox to configure whether one wants suggestions in this situation (if yes, one can control which ones via the # IF ... # directives and with the new variables) or not.
I added that in cc80048. Needs to be python manage.py migrate-ed. The new option is called Suggest subjects in empty lines.
Should we merge this branch into master now, or are there any major concerns left?
We have played around with the templates a lot in the last months. The templates and the template substitution work very well, thank you! Before merging this into the master I have a question about the other fields in the backend configuration and the other modes:
When I select "2. Context-Insensitive suggestions", where do the suggestions come from? In our current configuration, the SPARQL query for context-insensitive predicate or object suggestions is simply empty (or, rather, incomplete because there is only the LIMIT 40 OFFSET 0 that is always appended to the end). Maybe this is related to the answers to the next points.
What is the purpose of all the fields in the section "Showing names"? Are these still used for something? If no, the fields should be removed. If yes, there should not be just snippets of SPARQL queries here, but whole SPARQL queries, like in the section "Backend suggestions". And the "Need help?" text should be improved because I currently do not understand it.
The section "Preprocessing" with its only field "Source Path" is no longer needed for anything, right? If that is the case, the section should be removed.
What are the entries in the field "Replace predicates in autocompletion context" supposed to do? I tried the replacements suggested under "Need help?" but that didn't seem to have any effect when I used one of those predicates (e.g. rdfs:label) in the query.
Another issue that is maybe related to this PR and maybe deserves a separate change is the following:
We realized over the past months that we simply cannot make all context-sensitive suggestions fast enough. Most of them are very fast, but every once in a while one has to wait many seconds and sometimes even a minute or two for the suggestions to come. Sometimes, a suggestion query also fails altogether because it is simply too hard or requires too much memory. Each of these happens frequently enough (at least for Wikidata) that it is quite annoying when using the autocompletion for writing queries. It also gives the misleading feeling that the autocompletion concept is broken.
Since we have both context-sensitive and context-insensitive suggestions, this is actually relatively easy to fix. Namely, there should be a fourth mode, which launches both a context-sensitive and a context-insensitive query. When a certain amount of time has passed (this threshold should be configurable, a good default value is maybe 2 seconds) and the context-sensitive query hasn't finished, the results from the context-insensitive query should be taken instead (as soon as they are there, but these are usually very fast).
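The proposed fourth mode could look roughly like the following Python sketch. It only illustrates the control flow. The UI itself would do this in JavaScript, and suggest_with_fallback, slow_query and fast_query are hypothetical names standing in for the two suggestion queries.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def suggest_with_fallback(run_sensitive, run_insensitive, timeout_seconds=2.0):
    """Launch both queries; prefer the context-sensitive result if it arrives in time."""
    executor = ThreadPoolExecutor(max_workers=2)
    sensitive = executor.submit(run_sensitive)
    insensitive = executor.submit(run_insensitive)
    try:
        return "context-sensitive", sensitive.result(timeout=timeout_seconds)
    except Exception:
        # The context-sensitive query timed out or failed altogether:
        # take the context-insensitive results as soon as they are there.
        return "context-insensitive", insensitive.result()
    finally:
        # Do not block on the slow query; in this sketch it keeps running
        # in the background and is not cancelled.
        executor.shutdown(wait=False)


def slow_query():
    """Hypothetical stand-in for a hard context-sensitive query."""
    time.sleep(0.5)
    return ["context-sensitive suggestion"]


def fast_query():
    """Hypothetical stand-in for a context-insensitive query."""
    return ["context-insensitive suggestion"]


print(suggest_with_fallback(slow_query, fast_query, timeout_seconds=0.05))
```

A real implementation would probably also want to cancel the slow query on the server side, which this sketch does not attempt.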
Changes Backend settings such that whole queries for autocompletions can be entered from start to end, instead of only defining some blocks that are then filled into predefined SELECT { ... } GROUP BY statements.
This branch needs a python manage.py migrate if you're switching here from master.
[...] qleverui.sqlite3 unusable in the master branch.
[...] qleverui.sqlite3 before migrating
Importing the example settings
In order to get a quick first impression, I advise you to import our example settings by logging in to /admin/, clicking on Backends / Examples / Prefixes and importing the respective *-sample.csv file. The example files already implement the new Backend settings.
What has changed?
When editing a backend, the three settings
Suggest subjects clause
Suggest predicates clause
Suggest objects clause
now accept whole SPARQL queries as input. These queries will be executed when retrieving completions.
In order to make this work, we needed to introduce some kind of template syntax that would make it possible to factor in the user's current query context for each query.
To explain this syntax, we'll have a look at the Suggest objects clause as it is used in the sample settings:
1. %CURRENT_SUBJECT%, %CURRENT_PREDICATE% and %CURRENT_WORD%
The current line of the query the user is typing will be split into these placeholders.
Examples:
current line: ?c wdt:P31 coun[cursor]
%CURRENT_SUBJECT% = ?c
%CURRENT_PREDICATE% = wdt:P31
%CURRENT_WORD% = coun
current line: ?c inst[cursor]
%CURRENT_SUBJECT% = ?c
%CURRENT_PREDICATE% = inst
%CURRENT_WORD% = inst
current line: ?c[cursor]
%CURRENT_SUBJECT% = ?c
%CURRENT_PREDICATE% = [not defined]
%CURRENT_WORD% = ?c
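The splitting shown in these examples can be sketched in Python as follows. This is only an approximation that reproduces the three examples. The function name is hypothetical, splitting on whitespace is a simplification (it would break literals containing spaces), and treating a trailing space as an empty %CURRENT_WORD% is an assumption.

```python
def split_current_line(line):
    """Split the text before the cursor into the three placeholders."""
    tokens = line.split()
    subject = tokens[0] if len(tokens) >= 1 else None
    predicate = tokens[1] if len(tokens) >= 2 else None  # None means [not defined]
    if not tokens or line[-1].isspace():
        word = ""  # assumption: no new word has been started yet
    else:
        word = tokens[-1]
    return {
        "CURRENT_SUBJECT": subject,
        "CURRENT_PREDICATE": predicate,
        "CURRENT_WORD": word,
    }


for example in ["?c wdt:P31 coun", "?c inst", "?c"]:
    print(example, "->", split_current_line(example))
```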
2. %<CURRENT_WORD%
Same as %CURRENT_WORD%, but prepends a < if %CURRENT_WORD% doesn't start with < or ".
Can be helpful in combination with HAVING and KBs such as FreebaseEasy where you don't want to always type the < in order for autocompletion to work.
3. # IF #, # ELSE # and # ENDIF #
Can be used to alter the completion query depending on the user's current input.
Text inside an # IF # block will be ignored if the given condition is not satisfied; text inside an # ELSE # block will be ignored if it is.
Defining an # ELSE # block is optional.
IF / ELSE / ENDIF statements can be nested.
4. Conditions
Available conditions for # IF # statements are as follows:
CURRENT_WORD_EMPTY: true if the user hasn't started typing a new word
CURRENT_SUBJECT_VARIABLE: true if %CURRENT_SUBJECT% is a variable
CURRENT_PREDICATE_VARIABLE: true if %CURRENT_PREDICATE% is a variable
CONNECTED_TRIPLES_EMPTY: true if %CONNECTED_TRIPLES% is empty
These conditions can be combined into logical expressions of arbitrary length using
OR - logical or (binds weakest)
AND - logical and (binds stronger than OR)
! - negation (binds stronger than AND)
Example:
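As a rough illustration of the precedence rules and of nested # IF # / # ELSE # / # ENDIF # blocks, here is a minimal Python sketch. It is not the code used in QLeverUI. The function names are hypothetical, and the directive syntax is assumed to be exactly # IF <expression> #, # ELSE # and # ENDIF #.

```python
import re


def evaluate(expression, conditions):
    """Evaluate e.g. '!A AND B OR C'; `conditions` maps condition names to bools."""
    tokens = re.findall(r"!|\w+", expression)
    pos = 0

    def parse_or():  # OR binds weakest
        nonlocal pos
        value = parse_and()
        while pos < len(tokens) and tokens[pos] == "OR":
            pos += 1
            right = parse_and()
            value = value or right
        return value

    def parse_and():  # AND binds stronger than OR
        nonlocal pos
        value = parse_not()
        while pos < len(tokens) and tokens[pos] == "AND":
            pos += 1
            right = parse_not()
            value = value and right
        return value

    def parse_not():  # ! binds stronger than AND
        nonlocal pos
        if tokens[pos] == "!":
            pos += 1
            return not parse_not()
        name = tokens[pos]
        pos += 1
        return conditions[name]

    result = parse_or()
    if pos != len(tokens):
        raise ValueError(f"unexpected token {tokens[pos]!r}")
    return result


def render_conditionals(template, conditions):
    """Drop the text of every # IF # / # ELSE # branch that is not active."""
    parts = re.split(r"(# IF [^#]+ #|# ELSE #|# ENDIF #)", template)
    output = []
    stack = []  # one bool per open IF: is its current branch active?
    for part in parts:
        if part.startswith("# IF "):
            stack.append(evaluate(part[5:-2], conditions))
        elif part == "# ELSE #":
            stack[-1] = not stack[-1]
        elif part == "# ENDIF #":
            stack.pop()
        elif all(stack):  # keep text only if all enclosing branches are active
            output.append(part)
    return "".join(output)


state = {"CURRENT_WORD_EMPTY": True, "CURRENT_SUBJECT_VARIABLE": False,
         "CONNECTED_TRIPLES_EMPTY": False}
print(evaluate("CURRENT_WORD_EMPTY OR CURRENT_SUBJECT_VARIABLE AND CONNECTED_TRIPLES_EMPTY", state))
```

With these values the expression is read as CURRENT_WORD_EMPTY OR (CURRENT_SUBJECT_VARIABLE AND CONNECTED_TRIPLES_EMPTY), so it is true even though the AND part is false.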
5. %PREFIXES%
Inserts the prefix declarations the user has made.
6. %CONNECTED_TRIPLES%
Inserts the lines of the user's query that are connected to %CURRENT_WORD%.
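Following the definition from the review above (each triple is a node, and two triples are connected if they share a variable), the connected triples can be computed with a simple graph search. The Python sketch below is an illustration under assumptions: the helper names are hypothetical, variables are detected with a naive ?name regex, and the current triple itself is left out of the result so that an empty result matches CONNECTED_TRIPLES_EMPTY.

```python
import re


def variables_of(triple):
    """Naive variable detection: everything that looks like ?name."""
    return set(re.findall(r"\?\w+", triple))


def connected_triples(triples, current_index):
    """Return the other triples in the connected component of the current triple."""
    seen = {current_index}
    frontier = [current_index]
    while frontier:
        i = frontier.pop()
        for j, other in enumerate(triples):
            # Two triples are neighbours if they share at least one variable.
            if j not in seen and variables_of(triples[i]) & variables_of(other):
                seen.add(j)
                frontier.append(j)
    return [triples[j] for j in sorted(seen) if j != current_index]


query_body = [
    "?c wdt:P31 wd:Q6256 .",
    "?c wdt:P36 ?capital .",
    "?x wdt:P17 ?y .",
    "?capital wdt:P1082 ?p",
]
print(connected_triples(query_body, 3))
```

Here the last triple is connected to the second one via ?capital and, through it, to the first one via ?c, while the third triple shares no variable with any of them.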
Further hints
Do not add LIMIT and OFFSET to limit the result. QLeverUI will do this by itself.
[...] SELECT clause (like above: SELECT ?qleverui_entity [...]). It does not need to be named ?qleverui_entity though.
[...] ?qleverui_name and ?qleverui_altname. Their position in the SELECT clause does not matter.
These last two restrictions will be changed in the future.
What has not changed
All the settings in the Showing names category have not been changed. These are now only needed for the tooltips when hovering the mouse over an entity.
It remains an open question whether they can stay the way they are or need to be changed to be more customizable, too.
The Alternative [...] name clause settings are not needed anymore and will be removed later.