Closed oliverglanz closed 2 years ago
c1 <10: c2 means that c1 is immediately before c2 with a leeway of 10 in both directions.
So if c2 has slot number 100, c1 could have 99 + or - 10, so anything between 89 and 109, including 100, which is c1.
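The arithmetic in this explanation can be checked with a minimal sketch. It assumes that the comparison is between the last slot of c1 and the first slot of c2; the function name `near_before` is hypothetical and is not part of TF's API.

```python
def near_before(c1_last_slot, c2_first_slot, k):
    """True if c1 ends immediately before c2 starts, give or take k slots.

    Assumption: for strict adjacency c1 would end at c2_first_slot - 1,
    and the leeway k is allowed on either side of that position.
    """
    return abs(c1_last_slot - (c2_first_slot - 1)) <= k


# c2 starts at slot 100, leeway 10: which end slots of c1 are accepted?
accepted = [s for s in range(80, 121) if near_before(s, 100, 10)]
print(accepted[0], accepted[-1])  # 89 109
```

Slots 100 to 109 are accepted even though they lie at or after the start of c2, which is why the leeway alone does not force c1 to come first.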
So this is intentional. I remember earlier discussions about this point, I think with Cody, and yes, we could have defined it in another way, but that would cause other inconveniences.
Now the rest of your remarks:
First I run
verse book=Genesis chapter=20
  c1:clause domain=N
  <3: clause domain=Q
  <50: c2:clause domain=N
c1 < c2
(a shorter version of your simplified query) and it also gives me 15 results (working on BHSA version c).
Now let's see what happens if I run the full query against version c:
verse book=Genesis chapter=20
  c1:clause domain=N
    phrase function=Pred
      word lex=DBR[|QR>[|>MR[
    phrase function=Subj
      speakerA:word sp=subs|nmpr
    phrase function=Cmpl
      addresseeA:word sp=subs|nmpr
  <3: c2:clause domain=Q
  <50: c3:clause domain=N
    phrase function=Pred
      word lex=DBR[|QR>[|>MR[
    phrase function=Subj
      speakerB:word
    phrase function=Cmpl
      addresseeB:word
c1 < c3
speakerA .lex. speakerB
addresseeA .lex. addresseeB
I also get no results. It took me a while to understand the query, and now I understand why there are no results:
The query states that clauses c1, c2, c3 are all in the same verse! But clearly, when you allow c3 to be 50 words further, you do not expect it still to be in the same verse!
If you postulate only c1 and c2 to be in the same verse, you have to write it like this:
verse book=Genesis chapter=20
  c1:clause domain=N
    phrase function=Pred
      word lex=DBR[|QR>[|>MR[
    phrase function=Subj
      speakerA:word sp=subs|nmpr
    phrase function=Cmpl
      addresseeA:word sp=subs|nmpr
  <3: c2:clause domain=Q
c3:clause domain=N
  phrase function=Pred
    word lex=DBR[|QR>[|>MR[
  phrase function=Subj
    speakerB:word
  phrase function=Cmpl
    addresseeB:word
c2 <50: c3
c1 < c3
speakerA .lex. speakerB
addresseeA .lex. addresseeB
And that query gives me 1 result:
So, Oliver, I think the things you spotted are not bugs in TF after all.
But they are excellent examples of how writing queries requires quite a bit of teaching in order to avoid these pitfalls.
Dirk, that's what it was! A too narrow top-container (verse). My bad! Sorry to have wasted your time on this one.
But to clarify the matter more:
c1 <10: c2
could mean that c2 stands 10 monads before c1 (c2 could precede c1), so I always HAVE to ADD c1 < c2
if I only want the option to have c2 FOLLOW c1 within a range of 10 monads. Right?
In MQL, I can relate
c2: clause domain=Q
and
c3:clause domain=N
by expressing:
[clause domain="Q"]*{1-5} [clause domain="N"]
This finds all cases in which the first clause (domain="Q") is repeated up to 5 times before the second clause (domain="N") appears. In TF it seems that this option is not available. Relations between elements can only be defined by a range of monads. Is that correct?
Yes, TF does not have the Kleene star operation and its friends.
Yes, you are right, you have to add c1 < c2 to c1 <10: c2 if you want to make sure that c2 comes after c1.
It is tempting for me to change the definition so that the leeway always counts in the direction of the <, but it has disadvantages:
1) What should I do with the operators :k= and =k: ? Probably there the leeway should count in both directions.
2) What if a user wants the leeway in the other direction? I need a new operator for that, or something with a minus: c1 <-10: c2.
3) What if a user wants the leeway in both directions? I need something like c1 <-10,10 c2
With hindsight, these might have been better options. I could try to implement them, but it should be done in a backward compatible way.
Like
c1 <k: c2 means leeway of k in both directions (as before)
And the new ones:
c1 <+k: c2 means leeway of k in forward direction
c1 <-k: c2 means leeway of k in backward direction
c1 :+k> c2 means leeway of k in backward direction (because <: works in the other direction)
c1 :-k> c2 means leeway of k in forward direction (because <: works in the other direction)
c1 <-k+m: c2 means leeway of k in backward direction and leeway of m in forward direction
Likewise for
c1 :-k+m= c2
c1 =-k+m: c2
Can he get it from the unidirectional leeway? No, because there is no OR between relational conditions:
It will be no rocket science to implement this, but I have to be very careful. It affects parsing and semantics of queries. When I find the time, I'll definitely do this, if you think it is useful in this form.
Just FYI: it involves modifying a bunch of functions like this: https://github.com/annotation/text-fabric/blob/47a9e4bcb9ab307d975d52d6e7955f26231f0605/tf/search/relations.py#L545, where the k is the leeway. So instead of passing it a k, it gets a k and an h: k for the forward leeway and h for the backward leeway.
-h+k => function(h, k)
+k => function(0, k)
-h => function(h, 0)
k => function(k, k) {this is the old behaviour}
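The mapping above could be sketched roughly as follows. This is only an illustration of the proposal, not code from TF: the names `parse_leeway` and `near_before`, the regular expression, and the reading of "backward" as earlier slots and "forward" as later slots are all assumptions.

```python
import re

# Hypothetical spec grammar, following the proposal in this comment:
#   "k"     -> (k, k)  old behaviour, leeway in both directions
#   "+k"    -> (0, k)  forward only
#   "-h"    -> (h, 0)  backward only
#   "-h+k"  -> (h, k)  h backward, k forward
SPEC_RE = re.compile(r"^(?:(\d+)|(?:-(\d+))?(?:\+(\d+))?)$")


def parse_leeway(spec):
    """Return (backward, forward) leeway for a spec string."""
    match = SPEC_RE.match(spec)
    if not spec or match is None:
        raise ValueError(f"bad leeway spec: {spec!r}")
    plain, back, fwd = match.groups()
    if plain is not None:
        return (int(plain), int(plain))
    return (int(back or 0), int(fwd or 0))


def near_before(c1_last_slot, c2_first_slot, backward, forward):
    """Adjacency check with separate leeway on each side (assumed semantics)."""
    # offset 0 means c1 ends exactly one slot before c2 starts
    offset = c1_last_slot - (c2_first_slot - 1)
    return -backward <= offset <= forward


print(parse_leeway("10"), parse_leeway("+10"), parse_leeway("-10"), parse_leeway("-3+5"))
```

Keeping the bare `k` form mapped to `(k, k)` is what would make such a change backward compatible with existing queries.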
You see, I'm already anticipating coding it.
Dirk, you keep fascinating us with your listening to the community, seeking to understand their operations, and trying to respond to their needs. For my own processes, I am fine with adding a further relational definition (c1 <20: c2 AND c1 < c2) to get what I want. Rather than refining the coding of relational definitions, I would love to have Kleene Star & Friends implemented. But, like you said elsewhere, the researcher might have to learn some hand-coding instead of demanding too much from TF's search function.
To be honest, I have not got round to implementing this. It seems that TF has reached some optimum here between expressive power and coding effort. I'd rather leave it as it is for now.
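For the hand-coding route mentioned in this thread, a repetition such as "1 to 5 Q clauses followed by an N clause" might be sketched in plain Python as below. This is a toy illustration under assumptions: it works on a plain list of domain codes in text order, it only looks at maximal runs of Q clauses (which may differ from how MQL matches), and the function name `q_runs_before_n` is hypothetical.

```python
def q_runs_before_n(domains, min_q=1, max_q=5):
    """Find maximal runs of min_q..max_q "Q" clauses directly followed by "N".

    domains: list of domain codes, one per clause, in text order.
    Returns (start, end) index pairs: start is the first Q clause of the
    run, end is the N clause that follows it.
    """
    hits = []
    i = 0
    while i < len(domains):
        if domains[i] != "Q":
            i += 1
            continue
        # extend j to the first clause after the run of Q clauses
        j = i
        while j < len(domains) and domains[j] == "Q":
            j += 1
        run_length = j - i
        if min_q <= run_length <= max_q and j < len(domains) and domains[j] == "N":
            hits.append((i, j))
        i = j
    return hits


# Toy data, not taken from the BHSA
domains = ["N", "Q", "Q", "N", "D", "Q", "N", "Q", "Q", "Q", "Q", "Q", "Q", "N"]
print(q_runs_before_n(domains))  # [(1, 3), (5, 6)]
```

The final run of six Q clauses is skipped because it exceeds the maximum of five. In a TF session the list of domain codes would presumably be built from the `domain` feature of the clause nodes.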
Problem
There is a bug in TF when it comes to the definition of relations between elements. The results for the following query show Gen 20:2 as a result because it takes the contents of c3 as being identical to the contents of c1. This is, however, illogical, since the relation between c3 and c2 is defined as c3 following c2 within 0-50 words. Thus, if c3 is to follow c2 and c2 is to follow c1, it cannot be that c3 is positioned in the clause sequence like c1:
In order to clarify the relation between c1 and c3 and overcome the identification of both clauses (although mistakenly) one could simply alter the query by explicitly defining the relation between c1 and c3 more precisely (c1 should be followed by c2, and c2 should be followed by c3, and c1 should be followed by c3):
This, however, yields no results.
Obviously, the search-engine is confused about the relations between the clauses. This confusion is not caused by the lines
Even without these lines the confusion remains.
Assumption
I assume that the search engine has a bug somewhere that does not allow the correct processing of complex substructures (c1 and c3 have elements within them) with the explicit relation operators. Once the complex substructure is taken out, the clause relations are recognized correctly:
With MQL these complex relations can be queried without problem: https://shebanq.ancient-data.org/hebrew/query?version=2017&id=491