ChristopherRabotin / bungiesearch

UNMAINTAINED CODE -- Elasticsearch-dsl-py django wrapper with mapping generator
BSD 3-Clause "New" or "Revised" License

error [_id] is defined twice #158

Open traboukos opened 7 years ago

traboukos commented 7 years ago

Hello, I have been struggling all day with this error while trying to run ./manage.py search_index --create

I am using the following packages:

elasticsearch server Version: 5.0.2, Build: f6b4951/2016-11-24T10:07:18.101Z, JVM: 1.8.0_65
elasticsearch 5.0.1
elasticsearch-dsl 5.0.0
bungiesearch 1.3.1 # tried this with master also, same error

This is my model

class ChatIndex(ModelIndex):
    """Search index for Chat model"""   

    class Meta(object):
        """Meta data"""
        model = Chat
        default = True

and those are the mappings generated for the model

{'properties': {
  '_id': {'type': 'integer'},
  'created_at': {'type': 'date'},
  u'id': {'type': 'integer'},
  'text': {'analyzer': 'snowball', 'type': 'string'},
  'timestamp': {'type': 'date'},
  'updated_at': {'type': 'date'}}
}

following is the error displayed in the console while running the command

Traceback (most recent call last):
  File "./manage.py", line 22, in <module>
    execute_from_command_line(sys.argv)
  File "/Users/dstefanidis/.virtualenvs/bingo/lib/python2.7/site-packages/django/core/management/__init__.py", line 353, in execute_from_command_line
    utility.execute()
  File "/Users/dstefanidis/.virtualenvs/bingo/lib/python2.7/site-packages/django/core/management/__init__.py", line 345, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "/Users/dstefanidis/.virtualenvs/bingo/lib/python2.7/site-packages/django/core/management/base.py", line 348, in run_from_argv
    self.execute(*args, **cmd_options)
  File "/Users/dstefanidis/.virtualenvs/bingo/lib/python2.7/site-packages/django/core/management/base.py", line 399, in execute
    output = self.handle(*args, **options)
  File "/Users/dstefanidis/.virtualenvs/bingo/lib/python2.7/site-packages/bungiesearch/management/commands/search_index.py", line 138, in handle
    es.indices.create(index=index, body={'mappings': mapping, 'settings': {'analysis': analysis}})
  File "/Users/dstefanidis/.virtualenvs/bingo/lib/python2.7/site-packages/elasticsearch/client/utils.py", line 71, in _wrapped
    return func(*args, params=params, **kwargs)
  File "/Users/dstefanidis/.virtualenvs/bingo/lib/python2.7/site-packages/elasticsearch/client/indices.py", line 107, in create
    params=params, body=body)
  File "/Users/dstefanidis/.virtualenvs/bingo/lib/python2.7/site-packages/elasticsearch/transport.py", line 327, in perform_request
    status, headers, data = connection.perform_request(method, url, params, body, ignore=ignore, timeout=timeout)
  File "/Users/dstefanidis/.virtualenvs/bingo/lib/python2.7/site-packages/elasticsearch/connection/http_urllib3.py", line 124, in perform_request
    self._raise_error(response.status, raw_data)
  File "/Users/dstefanidis/.virtualenvs/bingo/lib/python2.7/site-packages/elasticsearch/connection/base.py", line 122, in _raise_error
    raise HTTP_EXCEPTIONS.get(status_code, TransportError)(status_code, error_message, additional_info)
elasticsearch.exceptions.RequestError: TransportError(400, u'mapper_parsing_exception', u'Failed to parse mapping [Chat]: Field [_id] is defined twice in [Chat]')

ChristopherRabotin commented 7 years ago

Is this a new index, or does it already have data?

traboukos commented 7 years ago

@ChristopherRabotin this is a new index

ChristopherRabotin commented 7 years ago

Have you solved the issue? Sorry for not getting back to you earlier.

traboukos commented 7 years ago

Hey, thanks for the answer. Unfortunately, after being plagued by this error plus various other errors, I had to abandon this library and chose to use dsl-py directly. This has sped up my progress and given me more control over the queries, which is something I needed.

I am pretty sure I could do everything I need with bungiesearch too, but the documentation needs some work. For example, the word "aggregation" does not appear once in either the docs or the submitted issues. Faceting and aggregating data is one of the main reasons I always use a search engine, and I could not find any info or example showing how to do this. The same goes for multi-valued fields like arrays: I could not easily find info about how to index them.

traboukos commented 7 years ago

@ChristopherRabotin I would like to revisit bungiesearch now that I have made some progress using dsl-py directly. Could you answer my questions above? Is it possible to use Keyword fields (https://github.com/elastic/elasticsearch-dsl-py/blob/master/elasticsearch_dsl/field.py#L243) and perform aggregations? Thanks in advance.

ChristopherRabotin commented 7 years ago

The Keyword field is not supported yet; someone would have to work on a PR for this. Aggregations at the ES level don't work either: I think there is an old issue about trying to make them work, but the core functionality of bungiesearch relies on the Search component of dsl-py. If I recall correctly, the result of a Search can't just be fed into the dsl-py Aggregator.
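
For readers who end up talking to Elasticsearch directly, as described above, here is a minimal sketch of what a keyword field and a terms aggregation look like as plain request bodies. It does not use bungiesearch at all. The field name `tags` and the helper names are hypothetical, and the `keyword` type assumes Elasticsearch 5.0 or later. No special mapping is needed for arrays: a document such as `{"tags": ["a", "b"]}` can be indexed into the same keyword field.

```python
def keyword_mapping(field_name):
    """Build a mapping fragment declaring `field_name` as a keyword field.

    The `keyword` type is available from Elasticsearch 5.0 onward.
    """
    return {"properties": {field_name: {"type": "keyword"}}}


def terms_aggregation(field_name, size=10):
    """Build a search body that buckets documents by the values of `field_name`.

    The top-level "size": 0 skips the hits and returns only the buckets.
    """
    agg_name = "by_" + field_name  # hypothetical naming convention
    return {
        "size": 0,
        "aggs": {agg_name: {"terms": {"field": field_name, "size": size}}},
    }


mapping = keyword_mapping("tags")   # "tags" is an illustrative field name
search_body = terms_aggregation("tags")

# With a live cluster and the elasticsearch client, the body would be sent as:
# es.search(index="my_index", body=search_body)  # hypothetical index name
```

The buckets come back under `aggregations` in the response, keyed by the aggregation name (`by_tags` here).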

umbrae commented 7 years ago

I ran into this same issue FWIW. Some pdbing brought me to my mapping that looked like this:

{
    "mappings": {
        "Profile": {
            "properties": {
                "id": {
                    "type": "integer"
                },
                // ...other fields here...
                "_id": {
                    "type": "integer"
                }
            }
        }
    },
    "settings": {
        "analysis": {}
    }
}

I only just started using ES within the last day, so I have no idea if that's accurate, but removing _id from the create looked to work well, and the reindex seemed to genuinely reindex as expected.

Possibly ES is trying to be smart somewhere and, when it sees both an id key and an _id key, considers them a conflict?
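
The workaround described above (dropping `_id` from the mapping before the index is created) can be sketched as follows. This is an illustration only, not a patch to bungiesearch: the helper name `strip_metadata_fields` and the index name in the final comment are hypothetical. It relies on `_id` being a metadata field that Elasticsearch manages itself, which is likely why declaring it again under `properties` is rejected.

```python
import copy

# Metadata fields to drop from "properties"; only _id is involved in this issue.
METADATA_FIELDS = ("_id",)


def strip_metadata_fields(body, metadata_fields=METADATA_FIELDS):
    """Return a copy of an index-creation body without metadata fields
    declared under each document type's "properties"."""
    cleaned = copy.deepcopy(body)  # leave the caller's dict untouched
    for mapping in cleaned.get("mappings", {}).values():
        properties = mapping.get("properties", {})
        for name in metadata_fields:
            properties.pop(name, None)
    return cleaned


body = {
    "mappings": {
        "Profile": {
            "properties": {
                "id": {"type": "integer"},
                "_id": {"type": "integer"},
            }
        }
    },
    "settings": {"analysis": {}},
}

cleaned = strip_metadata_fields(body)

# With a live cluster, the cleaned body would then be passed to the client:
# es.indices.create(index="my_index", body=cleaned)  # hypothetical index name
```

The regular `id` property is kept, so the Django primary key is still indexed as an ordinary integer field.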

ChristopherRabotin commented 7 years ago

Yeah, that's possible I guess. What version of ES are you running?

umbrae commented 7 years ago

2.4.4

ChristopherRabotin commented 7 years ago

Interesting. Okay. Thanks for the info. I don't really have a solution yet, and don't know when I will. I don't use bungiesearch anymore because I don't have any project that uses elasticsearch... But if I get around to it, I'll attempt to fix this. Sorry.

umbrae commented 7 years ago

No problem. Honestly, I switched away from it and started using elasticsearch-dsl, which seems to be going well. Fewer creature comforts than bungiesearch, but I can deal. Thanks for putting this out there!

umbrae commented 7 years ago

(I will say it might be worthwhile to update your readme and state it's not actively being maintained, I bet you'll feel less stress from people posting issues if you do that as well.)

ChristopherRabotin commented 7 years ago

Yes, that's a good idea. I will do that.

(I won't hide that I would have loved seeing bungiesearch used at reddit... Haha, next time! If you guys need some go development, especially astrodynamic simulations, let me know!!)

umbrae commented 7 years ago

Haha well I will say that this was for a side project, but we are definitely hiring: https://boards.greenhouse.io/reddit

We have some go but it's mostly our statsd clone and some ad serving stuff.