rjurney / Agile_Data_Code_2

Code for Agile Data Science 2.0, O'Reilly 2017, Second Edition
http://bit.ly/agile_data_science
MIT License

ch_02 elasticsearch EsHadoopInvalidRequest "final mapping would have more than 1 type" #68

Closed: ppkn closed this issue 5 years ago

ppkn commented 6 years ago

When I try to write data to Elasticsearch from PySpark, I get this output: gist

This is on an EC2 instance created with ./ec2.sh

What could be going on here?

bravefoot commented 6 years ago

As of version 6.0, Elasticsearch allows an index to have only one type. When you posted to "localhost:9200/agile_data_science/test", you created an index named "agile_data_science" with a type of "test". Then, in the last line of pyspark_elasticsearch.py, the code attempts to create a new type called "executives" in the agile_data_science index, which is no longer allowed. There are several possible solutions, and I'll leave it to @rjurney to decide which one is correct.
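To make the failure mode concrete, here is a toy model of the one-type-per-index rule in plain Python. It does not talk to Elasticsearch: the class `ToyCluster` and its `index_document` method are hypothetical names invented for this illustration, and the error text only paraphrases the message in the issue title.

```python
class ToyCluster:
    """Toy stand-in for a cluster that enforces one mapping type per index."""

    def __init__(self):
        # Maps each index name to the single type it was first created with.
        self.index_types = {}

    def index_document(self, index, doc_type, doc):
        # The first write to an index fixes its type; later writes must match it.
        existing = self.index_types.setdefault(index, doc_type)
        if existing != doc_type:
            raise ValueError(
                f"Rejecting mapping update to [{index}]: final mapping "
                f"would have more than 1 type: [{existing}, {doc_type}]"
            )
        return {"index": index, "type": doc_type, "doc": doc}


cluster = ToyCluster()

# The initial POST to agile_data_science/test creates the index with type "test".
cluster.index_document("agile_data_science", "test", {"name": "example"})

# Writing a second type, "executives", to the same index is then rejected.
try:
    cluster.index_document("agile_data_science", "executives", {"name": "example"})
except ValueError as err:
    print(err)
```

In this model, as in the thread, the second write fails only because the index already holds a different type. Writing "executives" to a fresh index, or reusing "test", would both succeed.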

The easiest one for now is to replace "executives" on the last line of pyspark_elasticsearch.py with "test".
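A minimal sketch of that workaround, assuming the script writes through the elasticsearch-hadoop connector and selects its target with the `es.resource` setting in the form `index/type`. The helper `es_write_conf` is a hypothetical name, and the commented-out write call is an assumption about how such a script typically looks, not a copy of the repository's code.

```python
def es_write_conf(index, doc_type):
    """Build the conf dict for an elasticsearch-hadoop write to index/doc_type."""
    return {"es.resource": f"{index}/{doc_type}"}


# Reuse the type the index already has ("test") instead of adding "executives".
conf = es_write_conf("agile_data_science", "test")
print(conf)

# With a live SparkContext and the elasticsearch-hadoop jar on the classpath,
# an RDD of dicts (here called `rdd`, an assumed name) could be written roughly like this:
# rdd.saveAsNewAPIHadoopFile(
#     path="-",
#     outputFormatClass="org.elasticsearch.hadoop.mr.EsOutputFormat",
#     keyClass="org.apache.hadoop.io.NullWritable",
#     valueClass="org.elasticsearch.hadoop.mr.LinkedMapWritable",
#     conf=conf,
# )
```

Building the resource string in one place makes it harder for the type used in the initial POST and the type used in the Spark write to drift apart.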

rjurney commented 6 years ago

Shit. That is a damned stupid change. I'll have to downgrade ElasticSearch.

rjurney commented 6 years ago

@dpipkin @bravefoot I downgraded Elasticsearch in both the Vagrant and EC2 scripts. Try again, sorry for the problems!

pjhinton commented 5 years ago

I noticed that bootstrap.sh on master still has Elasticsearch 6.1.2

https://github.com/rjurney/Agile_Data_Code_2/blob/374dadb95e25c5a60b6c3379c9bf727bf6b028f6/bootstrap.sh#L170

https://github.com/rjurney/Agile_Data_Code_2/blob/374dadb95e25c5a60b6c3379c9bf727bf6b028f6/bootstrap.sh#L174

I've added a public note to the Safari version of the book that describes this issue and the suggested workaround.

rjurney commented 5 years ago

Now installing Elasticsearch 5.6.1 again to resolve this bug.