gjost closed 6 years ago
Some error output:
$ ddrindex publish --hosts XXXXXX:9200 --recurse --force /var/www/media/ddr/ddr-one-7
...
2018-03-19 14:24:37.826610-07:00 | 2812/4216 POST ddr-one-7-26-21
Traceback (most recent call last):
File "/opt/ddr-local/venv/ddrlocal/bin/ddrindex", line 14, in <module>
load_entry_point('ddr-cmdln==0.9.4b0', 'console_scripts', 'ddrindex')()
File "/opt/ddr-local/venv/ddrlocal/local/lib/python2.7/site-packages/click-6.7-py2.7.egg/click/core.py", line 722, in __call__
return self.main(*args, **kwargs)
File "/opt/ddr-local/venv/ddrlocal/local/lib/python2.7/site-packages/click-6.7-py2.7.egg/click/core.py", line 697, in main
rv = self.invoke(ctx)
File "/opt/ddr-local/venv/ddrlocal/local/lib/python2.7/site-packages/click-6.7-py2.7.egg/click/core.py", line 1066, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/opt/ddr-local/venv/ddrlocal/local/lib/python2.7/site-packages/click-6.7-py2.7.egg/click/core.py", line 895, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/opt/ddr-local/venv/ddrlocal/local/lib/python2.7/site-packages/click-6.7-py2.7.egg/click/core.py", line 535, in invoke
return callback(*args, **kwargs)
File "/opt/ddr-local/venv/ddrlocal/local/lib/python2.7/site-packages/ddr_cmdln-0.9.4b0-py2.7.egg/DDR/cli/ddrindex.py", line 287, in publish
status = docstore.Docstore(hosts, index).post_multi(path, recursive=recurse, force=force)
File "/opt/ddr-local/venv/ddrlocal/local/lib/python2.7/site-packages/ddr_cmdln-0.9.4b0-py2.7.egg/DDR/docstore.py", line 666, in post_multi
created = self.post(document, parents=parents, force=force)
File "/opt/ddr-local/venv/ddrlocal/local/lib/python2.7/site-packages/ddr_cmdln-0.9.4b0-py2.7.egg/DDR/docstore.py", line 560, in post
ES_Class = ELASTICSEARCH_CLASSES_BY_MODEL[document.identifier.model]
KeyError: 'segment'
Added 'segment' to ELASTICSEARCH_CLASSES_BY_MODEL, pointing to Entity. That fixes the particular error above, but leads to another error.
docstore.Docstore.post_multi uses docstore.Docstore.post to post an object to Elasticsearch.
docstore.Docstore.post uses repo_models.elastic.Entity.Meta.doc_type as the document type, which is "entity".
However, when docstore.Docstore.post_multi does a GET to see if the object was saved successfully, it uses object.identifier.model, which in this case is "segment".
This makes it look like the object was not written to Elasticsearch.
Using ELASTICSEARCH_CLASSES_BY_MODEL[oi.model]._doc_type.name for the GET instead solved that problem. On to the next one!
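A minimal sketch of the mismatch described above, assuming simplified stand-in classes rather than the real repo_models.elastic definitions. DocTypeInfo, the trimmed-down Entity, and the doc_type_for helper are hypothetical names used only for illustration; only ELASTICSEARCH_CLASSES_BY_MODEL and the _doc_type.name attribute path come from the comments above.

```python
# Hypothetical stand-in for the object that exposes `_doc_type.name`
# on the real Elasticsearch document classes.
class DocTypeInfo(object):
    def __init__(self, name):
        self.name = name


# Hypothetical, trimmed-down stand-in for repo_models.elastic.Entity.
class Entity(object):
    _doc_type = DocTypeInfo("entity")


# 'segment' points to Entity, so segments are stored with doc type "entity".
ELASTICSEARCH_CLASSES_BY_MODEL = {
    "entity": Entity,
    "segment": Entity,
}


def doc_type_for(model):
    """Return the doc type a document of this model is stored under."""
    return ELASTICSEARCH_CLASSES_BY_MODEL[model]._doc_type.name


# Looking up by the raw model name would query "segment", which was never
# written; going through the class mapping yields the type actually used.
print(doc_type_for("segment"))
```

The point of the sketch is that both the POST and the follow-up GET should derive the document type from the same place, so the two can never disagree.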
...
2018-03-21 17:22:56.965632-07:00 | 3099/4216 POST ddr-one-7-30
2018-03-21 17:22:56.986403-07:00 | 3100/4216 POST ddr-one-7-30-8
Traceback (most recent call last):
File "/opt/ddr-local/venv/ddrlocal/bin/ddrindex", line 11, in <module>
load_entry_point('ddr-cmdln==0.9.4b0', 'console_scripts', 'ddrindex')()
File "/opt/ddr-local/venv/ddrlocal/local/lib/python2.7/site-packages/click-6.7-py2.7.egg/click/core.py",
...
File "/opt/ddr-local/venv/ddrlocal/local/lib/python2.7/site-packages/ddr_cmdln-0.9.4b0-py2.7.egg/DDR/cli/ddrindex.py", line 287, in publish
status = docstore.Docstore(hosts, index).post_multi(path, recursive=recurse, force=force)
File "/opt/ddr-local/venv/ddrlocal/local/lib/python2.7/site-packages/ddr_cmdln-0.9.4b0-py2.7.egg/DDR/docstore.py", line 655, in post_multi
document = oi.object()
File "/opt/ddr-local/venv/ddrlocal/local/lib/python2.7/site-packages/ddr_cmdln-0.9.4b0-py2.7.egg/DDR/identifier.py", line 1072, in object
return self.object_class(mappings).from_identifier(self)
File "/opt/ddr-local/venv/ddrlocal/local/lib/python2.7/site-packages/ddr_cmdln-0.9.4b0-py2.7.egg/DDR/models/__init__.py", line 1540, in from_identifier
return from_json(Entity, identifier.path_abs('json'), identifier)
File "/opt/ddr-local/venv/ddrlocal/local/lib/python2.7/site-packages/ddr_cmdln-0.9.4b0-py2.7.egg/DDR/models/__init__.py", line 382, in from_json
document.load_json(fileio.read_text(json_path))
File "/opt/ddr-local/venv/ddrlocal/local/lib/python2.7/site-packages/ddr_cmdln-0.9.4b0-py2.7.egg/DDR/models/__init__.py", line 1639, in load_json
json_data = load_json(self, module, json_text)
File "/opt/ddr-local/venv/ddrlocal/local/lib/python2.7/site-packages/ddr_cmdln-0.9.4b0-py2.7.egg/DDR/models/__init__.py", line 283, in load_json
f.values()[0]
File "/opt/ddr-local/venv/ddrlocal/local/lib/python2.7/site-packages/ddr_cmdln-0.9.4b0-py2.7.egg/DDR/modules.py", line 112, in function
value = function(value)
File "/opt/ddr-local/ddr-defs/repo_models/segment.py", line 1080, in jsonload_topics
converters.text_to_bracketids(text, ['term','id'])
File "/opt/ddr-local/ddr-defs/repo_models/segment.py", line 1076, in TEMP_scrub_topicdata
item['term'] = TEMP_this.TOPICS[item['id']]
KeyError: u'205'
This turned out to be some code not tolerant enough of bad data.
Partially fixed in commit #3847388. Also requires fixes from ddr-defs
commits #abb6d89 and #e538992.
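For illustration, here is one way a topic scrubber could tolerate unknown IDs such as u'205'. This is a sketch under stated assumptions and not the code from the commits above: the TOPICS contents and the scrub_topicdata function are hypothetical, loosely modeled on the TEMP_scrub_topicdata frame in the traceback.

```python
# Hypothetical vocabulary mapping topic IDs to current term labels.
TOPICS = {
    "120": "Hypothetical topic A",
    "121": "Hypothetical topic B",
}


def scrub_topicdata(items, topics=TOPICS):
    """Refresh each item's 'term' from the vocabulary when its ID is known.

    Items whose ID is missing from the vocabulary are passed through
    unchanged instead of raising KeyError.
    """
    scrubbed = []
    for item in items:
        item = dict(item)  # copy so the caller's data is not mutated
        term = topics.get(item.get("id"))
        if term is not None:
            item["term"] = term
        scrubbed.append(item)
    return scrubbed


data = [
    {"id": "120", "term": "Stale label"},
    {"id": "205", "term": "Unknown topic"},  # ID not in the vocabulary
]
print(scrub_topicdata(data))
```

Using dict.get rather than indexing means a single bad record no longer aborts a multi-thousand-document publish run; whether to log or skip such records is a separate design choice.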
Gonna keep this open until it's tested and merged into master
all good on my end now!
UPDATE: Note that the fix requires updating ddr-defs.