Closed tahmid02016 closed 1 year ago
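The underlying error is a Sphinx error:
"query error: query is non-computable (single NOT operator)"
The error is due to how Sphinx search works. Sphinx first needs a list of results matching at least one positive term, from which it can then remove the items matching the NOT term. A query that contains only a NOT term gives it nothing to start from.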
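To illustrate the point, here is a toy sketch of the "collect candidates, then subtract" model described above. It is not Sphinx's actual implementation; the `build_index` and `search` helpers and the sample sentences are hypothetical and exist only for this example.

```python
from typing import Dict, List, Set


def build_index(sentences: Dict[int, str]) -> Dict[str, Set[int]]:
    """Map each lowercased word to the ids of the sentences containing it."""
    index: Dict[str, Set[int]] = {}
    for sentence_id, text in sentences.items():
        for raw_word in text.lower().split():
            word = raw_word.strip(".,!?")
            if word:
                index.setdefault(word, set()).add(sentence_id)
    return index


def search(index: Dict[str, Set[int]], include: List[str], exclude: List[str]) -> Set[int]:
    """Return ids matching any `include` word, minus ids matching any `exclude` word."""
    if not include:
        # Nothing to subtract from: the toy engine refuses, as Sphinx does.
        raise ValueError("query is non-computable (single NOT operator)")
    candidates: Set[int] = set()
    for word in include:
        candidates |= index.get(word.lower(), set())
    for word in exclude:
        candidates -= index.get(word.lower(), set())
    return candidates


sentences = {1: "Tom is here.", 2: "The cat sleeps.", 3: "Tom likes the cat."}
index = build_index(sentences)
print(search(index, include=["cat"], exclude=["tom"]))
```

In this model, answering a bare `-Tom` would require starting from every sentence in the index, which is the expensive operation the engine declines to do.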
@tahmid02016
See the Wiki.
How to find English sentences without "the", "a" or "an" https://en.wiki.tatoeba.org/articles/show/text-search#how-to-find-english-sentences-without-%22the%22,-%22a%22-o
This explains how you can do what you want to do
@ckjpn, I know the wiki page states a hack to bypass the error.
If you are determined to get as many results as possible, you can search for words that start with any letter of the alphabet, after putting a minus before each word that you do not want (though this query will take a long time): -the -a -an a|b|c|d|e|f|g|h|i|j|k|l|m|n|o|p|q|r|s|t|u|v|w|x|y|z
However, this search query takes a long time to load. Besides that, it is not a great solution for other languages that have a large number of letters in their alphabets. For example, Bangla has 50 letters, Thai has 72 letters and Khmer has 74 letters in their alphabets.
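For reference, the workaround query quoted above can be generated for any alphabet, which also shows how the positive part grows with the number of letters. This is only a sketch: `build_exclusion_query` is a hypothetical helper, and it does nothing about the query being slow.

```python
import string
from typing import Iterable, Sequence


def build_exclusion_query(excluded: Iterable[str],
                          alphabet: Sequence[str] = string.ascii_lowercase) -> str:
    """Build '-w1 -w2 ... l1|l2|...' in the form of the query quoted from the wiki."""
    negatives = " ".join(f"-{word}" for word in excluded)
    positives = "|".join(alphabet)
    return f"{negatives} {positives}"


query = build_exclusion_query(["the", "a", "an"])
print(query)
```

For a script with 50 or more letters, the caller would pass that alphabet as a list of strings, and the query would carry one alternative per letter.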
This is a limitation of the search engine Manticore. The rationale for disabling such queries is that they are very resource-intensive. In newer versions of Manticore, there is an option to enable them nonetheless, but may I ask if it is worth it? Searching for -Tom is certainly going to return an enormous number of sentences, far more than the 1000 maximum browsable. Do you really need that many? What are you trying to achieve?
Note that Google Search doesn't allow the following either.
Closing this issue, as it will probably not be solved.
@tahmid02016 Sorry if my answer was a bit abrupt. I was only asking why because I would like to help you solve this issue, but I cannot do so unless I know your intention and your original problem. Simply enabling "-Tom" searches is not much of an option, because such a query would take even more time than the mentioned hack and would overload the server. But there may be other ways to solve the original problem that prompted you to open this issue. For example, if you are a developer wanting to work on a subset of the corpus that excludes all the Tom sentences, I can suggest another solution: download the corpus as a CSV file and filter it.
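A minimal sketch of that offline filtering, assuming the downloaded export is a tab-separated file whose first three columns are sentence id, language code and text (the column layout, the `eng` language code and the file name `sentences.csv` are assumptions to check against the actual download). The helper names `read_export` and `sentences_without` are hypothetical.

```python
import csv
import re
from typing import Iterable, Iterator, Sequence, Tuple


def read_export(path: str) -> Iterator[Sequence[str]]:
    """Yield rows from a tab-separated export (assumed layout: id, lang, text)."""
    with open(path, encoding="utf-8", newline="") as handle:
        yield from csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)


def sentences_without(rows: Iterable[Sequence[str]], word: str,
                      lang: str = "eng") -> Iterator[Tuple[str, str]]:
    """Yield (id, text) for sentences in `lang` that do not contain `word`.

    Matching is whole-word and case-insensitive, so "Tomorrow" does not
    count as containing "Tom".
    """
    pattern = re.compile(rf"\b{re.escape(word)}\b", re.IGNORECASE)
    for row in rows:
        if len(row) < 3:
            continue  # skip malformed lines
        sentence_id, row_lang, text = row[0], row[1], row[2]
        if row_lang == lang and not pattern.search(text):
            yield sentence_id, text


# In-memory sample rows; with a real file, use read_export("sentences.csv").
sample_rows = [
    ["1", "eng", "Tom is here."],
    ["2", "eng", "The cat sleeps."],
    ["3", "fra", "Le chat dort."],
    ["4", "eng", "Tomorrow is Monday."],
]
for sentence_id, text in sentences_without(sample_rows, "Tom"):
    print(sentence_id, text)
```

Because the filter streams over rows, it never loads the whole corpus into memory and puts no load on the search server.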
To Reproduce
Steps to reproduce the behavior: search for a word with a "-" prefix to get sentences excluding that word.
Example:
1. Go to https://tatoeba.org/en
2. Search for -Tom.
3. Get an error message.
Expected behavior
When a word is searched with a "-" prefix, all sentences excluding that word should be returned as search results. Example: when a user searches for -Tom, all sentences excluding the word Tom should be returned.