NLeSC / xtas

Distributed text analysis suite based on Celery
http://nlesc.github.io/xtas/

Hendrike es integration. #72

mariahendrike closed 9 years ago

mariahendrike commented 9 years ago

-- Query xtas results
-- Return xtas results
-- Fix bug where indexes could not be deleted

larsmans commented 9 years ago

All tests pass, very nice!

mariahendrike commented 9 years ago

Some things that should be kept in mind:
-- I did a speed check, and the new way of storing is only marginally faster at retrieval. One could redo this with a much bigger index; I assume a significant difference would show up there.
-- Merging this into master means that the ES search indexes are not compatible with the retrieval functions.
-- Deleting an index and creating a new index with the same name and the same ids will cause an error unless the workers are restarted. The reason is that CHECKED_MAPPINGS remembers indexes and ids in memory, while es.py is circumvented and so doesn't know about the deletion. Solution: write a wrapper for deleting indexes.
-- When querying xtas results, one should query the data field, for example: query = {"match": {"data": {"query": "PERSON"}}}
-- When an allowed version number of ES is greater than or equal to 2.0.0, the tests are skipped.
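
A rough sketch of the deletion wrapper and the data-field query mentioned above. This is only an illustration, not what is in the branch: `delete_index` and `data_field_query` are hypothetical names, and `CHECKED_MAPPINGS` is modelled here as a set of `(index, doc_type)` pairs, which may not match its real structure in es.py. The only client call used is `es.indices.delete(index=...)` from elasticsearch-py.

```python
# Stand-in for the in-memory cache described above. ASSUMPTION: a set of
# (index, doc_type) pairs; the real CHECKED_MAPPINGS in es.py may differ.
CHECKED_MAPPINGS = set()


def delete_index(es, index):
    """Delete `index` through the client and forget its cached mapping checks.

    `es` is expected to be an elasticsearch.Elasticsearch client (or anything
    exposing `indices.delete(index=...)`).
    """
    es.indices.delete(index=index)
    # Drop every cache entry for the deleted index, so that a new index with
    # the same name has its mappings checked again instead of being skipped.
    stale = {entry for entry in CHECKED_MAPPINGS if entry[0] == index}
    CHECKED_MAPPINGS.difference_update(stale)


def data_field_query(text):
    """Build a match query against the `data` field of stored xtas results."""
    return {"match": {"data": {"query": text}}}
```

Note that a module-level cache is cleared only in the process that calls the wrapper, so it would presumably have to run inside the workers (for instance as a task) to avoid the restart.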