Closed oliverglanz closed 4 years ago
Thanks Oliver!
Version 2017 cannot be changed; it is a fixed version.
The conclusion is right: 2017 does not have the feature nametype on word nodes, and that makes things more difficult. But version c has it as you wish.
See also https://nbviewer.jupyter.org/github/annotation/tutorials/blob/master/bhsa/cookbook/nametype.ipynb, written in response to Viktor Isaak earlier.
Thank you, Dirk, for the clarification. That is helpful. So the fact that querying (in "bhsa C") "lex nametype" yields fewer results than "word nametype" is because the object type "lex" overlaps with the object type "word".
But then there is still a bug in the "C" version of bhsa in SHEBANQ. If version "C" has the "nametype" feature connected to the object type "word", SHEBANQ must be running a different version than "C" when I run the MQL query in "C":
I am told that "nametype" is not a feature of "word"...
When I run the same query in TF on bhsa "C", I get what I should get when searching "C":
You spotted a discrepancy between the version c data as published on GitHub and as used by SHEBANQ. I think I added lex features to words in version c and published it on GitHub without feeding it into the pipeline to SHEBANQ.
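To illustrate what "adding lex features to words" amounts to, here is a minimal sketch in plain Python. It does not use Text-Fabric or the actual BHSA pipeline; the data structures, the lexeme identifiers, and the function `propagate_to_words` are hypothetical and only model the idea of copying a lexeme-level feature onto each occurrence of that lexeme. The toy verse is set up to mirror the three values reported for Jer 1:1 (2x "pers", 1x "topo").

```python
from collections import Counter

# Hypothetical toy data: lexeme id -> nametype value.
# Only lexemes that are proper names carry a value.
LEX_NAMETYPE = {
    "lex_person_a": "pers",
    "lex_person_b": "pers",
    "lex_anathot": "topo",
}

# Hypothetical word occurrences in one verse: (word id, lexeme id).
WORDS = [
    (1, "lex_common_noun"),
    (2, "lex_person_a"),
    (3, "lex_person_b"),
    (4, "lex_anathot"),
]


def propagate_to_words(words, lex_feature):
    """Copy a lexeme-level feature onto every word whose lexeme has a value."""
    return {
        word_id: lex_feature[lex_id]
        for word_id, lex_id in words
        if lex_id in lex_feature
    }


# After propagation the feature can be looked up per word.
word_nametype = propagate_to_words(WORDS, LEX_NAMETYPE)
counts = Counter(word_nametype.values())
print(dict(counts))  # {'pers': 2, 'topo': 1}
```

With the feature stored on words, a query can filter word occurrences directly instead of going through the lexeme object.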
I could try to update the c version in SHEBANQ. It would also be good to prepare a new version, 2020, and add that to GitHub and SHEBANQ.
But I think the ETCBC and DANS should agree on that.
Sounds like a plan. Thank you.
I made the pipeline from the BHSA Github repo to SHEBANQ with a view that it be operated by the ETCBC. I still have that view.
Bug/Problem: The bhsa feature description says that "nametype" is a feature of the object type "lex". This does appear to be the case when checking SHEBANQ. Jer 1:1 shows three nametype values (2x "pers", 1x "topo"):
When running an MQL query in SHEBANQ that looks for the value "topo" of the feature "nametype" of the object type "lex" in Jer 1:1, it should find "Anathot". But it doesn't (https://shebanq.ancient-data.org/hebrew/query?version=2017&id=3479). Instead it finds only 8 words in all of Jeremiah (there should be more than 500 topos in Jer). "Anathot" in Jer 1:1 is not found, even though it has received the value "topo". The same happens when looking for "pers" in Jer: only 20 are found, while there should be more than 1000.
A quick comparison with the bhsa TF app shows the same results:
However, in contrast to the feature description (https://etcbc.github.io/bhsa/features/nametype/), it seems that "nametype" is attached to the object type "word" in the bhsa TF app, where the accurate results can be retrieved:
The linking of "nametype" with the object-type "word" was not done in SHEBANQ, however:
Conclusion: Only a very limited number of "nametype" values are linked with the object type "lex". This is true for both the bhsa TF app and SHEBANQ 2017. However, all "nametype" values are linked with the object type "word" in the bhsa TF app.
Suggestion: Change the official bhsa feature description and make "nametype" a feature of "word". This is already implemented in the bhsa TF app. The same should be done in SHEBANQ 2017.