ArchivesPortalEuropeFoundation / Topic-Detection

Using machine learning approaches for automatic topic detection in a multilingual environment

Internal Server Error when using ANDNOT boolean operator #62

Open kstamatis opened 2 years ago

kstamatis commented 2 years ago

@fedenanni While checking why the query Keynes ANDNOT "austérité" (fr, concept, Boolean Search) is not working, I found this comment in the code (file: nlp.py, method: build_query_vector): # handling only AND for now.

In that query, one of the two vectors (I do not actually know what it means :) ) is None, so the method returns None, which leads to the message "There is an issue with your query: maybe this is not a boolean search?".

This makes me think that the boolean operators work only if both parts of the boolean query are found in the corpus (not sure about that).
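
To make the suspected behaviour concrete, here is a minimal sketch of how a missing term could turn the whole boolean query into None. It is not the code in nlp.py: the function build_query_vector_sketch, the helper lookup, the toy embeddings, and the idea of combining vectors by adding (AND) or subtracting (ANDNOT) are all assumptions made for illustration.

```python
from typing import Dict, List, Optional

Vector = List[float]


def lookup(term: str, embeddings: Dict[str, Vector]) -> Optional[Vector]:
    """Return the vector for a term, or None if the term is unknown.

    Hypothetical helper: strips surrounding whitespace and double quotes,
    then lowercases the term before the dictionary lookup.
    """
    return embeddings.get(term.strip().strip('"').lower())


def build_query_vector_sketch(
    query: str, embeddings: Dict[str, Vector]
) -> Optional[Vector]:
    """Hypothetical stand-in for build_query_vector, for illustration only."""
    # Check ANDNOT before AND so the longer operator is matched first.
    for operator, sign in ((" ANDNOT ", -1.0), (" AND ", 1.0)):
        if operator in query:
            left, right = query.split(operator, 1)
            v1 = lookup(left, embeddings)
            v2 = lookup(right, embeddings)
            # If either side is missing, the whole query fails with None.
            if v1 is None or v2 is None:
                return None
            # Assumed combination rule: add for AND, subtract for ANDNOT.
            return [a + sign * b for a, b in zip(v1, v2)]
    # No boolean operator: treat the query as a single term.
    return lookup(query, embeddings)


# Toy vocabulary; "austérité" is deliberately absent.
toy_embeddings = {"keynes": [1.0, 0.0], "austerity": [0.0, 1.0]}

both_found = build_query_vector_sketch("Keynes ANDNOT austerity", toy_embeddings)
one_missing = build_query_vector_sketch('Keynes ANDNOT "austérité"', toy_embeddings)
```

In this sketch both_found is a vector, while one_missing is None because one side of the query has no vector. A caller that only checks for None cannot tell that case apart from a query that is not boolean at all, which would match the message quoted above.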

fedenanni commented 2 years ago

Regarding this, I am getting different errors depending on small variations in the query.

First: [Screenshot 2022-02-27 at 11 25 44]
Second: [Screenshot 2022-02-27 at 11 26 04]

fedenanni commented 2 years ago

Interestingly, Keynes and austérité alone go through without a problem, so something goes wrong in the boolean part of the code. [Screenshot 2022-02-27 at 11 27 41]

fedenanni commented 2 years ago

Ok, it was a small issue with the table I was generating for AND and ANDNOT: one column was missing from the final output. It is fixed in 4622b52. The error message when using quotation marks remains, and I'll explore that next.