mozilla / elasticutils

[deprecated] A friendly chainable ElasticSearch interface for python
http://elasticutils.rtfd.org
BSD 3-Clause "New" or "Revised" License

Using Indexable.bulk_index on 3 items fails with timeout. #225

Closed jmizgajski closed 10 years ago

jmizgajski commented 10 years ago

Consider this simple bit of code, used inside a @classmethod of a class that subclasses Indexable and MappingType and inspired by your delayed indexing task:

documents = [cls.extract_document(o.id, o) for o in batch]
cls.bulk_index(documents, id_field='id', es=cls.get_es(),
               index=cls.get_index())

where batch is a QuerySet that contains 3 objects (for now).

Unfortunately, with both elasticsearch 0.9 and 1.1, running this code gives:

ConnectionError(HTTPConnectionPool(host='localhost', port=9200): Read timed out. (read timeout=5)) caused by: ReadTimeoutError(HTTPConnectionPool(host='localhost', port=9200): Read timed out. (read timeout=5))

Any ideas what might be causing it?

willkg commented 10 years ago

What version of ElasticUtils are you using?

jmizgajski commented 10 years ago

My bad, I was using an outdated elasticsearch-py. For elasticutils, I was using:

-e git+https://github.com/mozilla/elasticutils.git@98c34bee0e9508b1134d6c105cda48651443749a#egg=elasticutils-dev

jmizgajski commented 10 years ago

I misdiagnosed the error. The problematic bit is really:

index = mapping_cls.get_index()
doc_type = mapping_cls.get_mapping_type_name()
# Request body for index creation: settings plus the mapping for this doc type.
index_body = {
    'settings': {
        'number_of_shards': 3,
        'number_of_replicas': 2
    },
    'mappings': {
        doc_type: mapping_cls.get_mapping()
    }
}
# Single-argument print() behaves the same on Python 2 and Python 3.
print('Creating index: %s with request body: %s ...' % (index, index_body))
es.indices.create(index=index, body=index_body)

where mapping_cls is a subclass of both MappingType and Indexable.
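As a rough illustration only, the request body above could be built by a small helper so that the shard and replica counts are explicit arguments rather than literals buried in a dict. The helper name build_index_body, its defaults, and the example mapping are all hypothetical and not part of ElasticUtils; in the real code the mapping would come from mapping_cls.get_mapping().

```python
def build_index_body(doc_type, mapping, shards=3, replicas=0):
    """Return a create-index request body as a plain dict.

    Hypothetical helper: the default of 0 replicas is an assumption
    that suits a single-node development setup.
    """
    if shards < 1:
        raise ValueError('shards must be at least 1')
    if replicas < 0:
        raise ValueError('replicas must not be negative')
    return {
        'settings': {
            'number_of_shards': shards,
            'number_of_replicas': replicas,
        },
        'mappings': {doc_type: mapping},
    }


# Hypothetical mapping, standing in for mapping_cls.get_mapping().
example_mapping = {'properties': {'title': {'type': 'string'}}}
body = build_index_body('article', example_mapping)
print(body['settings'])
```

The resulting dict has the same shape as index_body above and could be passed as body= to es.indices.create.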

willkg commented 10 years ago

Do you have Elasticsearch running on localhost:9200? What happens when you run this?

curl http://localhost:9200/

jmizgajski commented 10 years ago

The problem is the 'number_of_replicas': 2 option that I added carelessly; removing it fixes the problem. Sorry for bothering you. I will soon send a small PR with some utils for automatic indexing and isolated test cases, to make it up to you.
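The thread does not say why the replica setting led to a timeout. One factor that may be relevant, assuming a single local node (which localhost:9200 suggests but does not confirm), is that Elasticsearch never places two copies of the same shard on one node, so replicas beyond "data nodes minus one" stay unassigned. The function below is a hypothetical back-of-the-envelope check, not an ElasticUtils or Elasticsearch API.

```python
def unassigned_replica_copies(data_nodes, shards, replicas):
    """Count replica shard copies that cannot be placed on any node.

    Each shard can have at most data_nodes - 1 replicas assigned,
    because a replica is never allocated on the node that already
    holds another copy of the same shard.
    """
    if data_nodes < 1:
        raise ValueError('data_nodes must be at least 1')
    assignable_per_shard = min(replicas, data_nodes - 1)
    return shards * (replicas - assignable_per_shard)


# Settings from the index body in this issue, on an assumed single node.
print(unassigned_replica_copies(data_nodes=1, shards=3, replicas=2))
# The same settings on an assumed three-node cluster.
print(unassigned_replica_copies(data_nodes=3, shards=3, replicas=2))
```

Under these assumptions, a single node leaves every replica copy unassigned, while three data nodes are enough to place them all.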

willkg commented 10 years ago

No worries!

One thing to know is that the tip of master will probably be released as 0.9 within the next week. If you see other problems, definitely let me know.

willkg commented 10 years ago

Closing this out since it seems like it's ok now.