vz-risk / Verum

Implementation of Context-Graph algorithms for graph enrichment and querying.
Apache License 2.0

Neo4j storage plugin did not handle utf8 bytecode #34

Closed gdbassett closed 9 years ago

gdbassett commented 9 years ago

Exception in thread Thread-3:
Traceback (most recent call last):
  File "/usr/local/Cellar/python/2.7.9/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.py", line 810, in __bootstrap_inner
    self.run()
  File "/usr/local/Cellar/python/2.7.9/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.py", line 763, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/Users/v685573/Documents/Development/verum/minions/osint_bambenekconsulting_com.py", line 338, in minion
    self.app.store_graph(g)
  File "/Users/v685573/Documents/Development/verum/verum/app.py", line 508, in store_graph
    plugin.plugin_object.enrich(g)
  File "/Users/v685573/Documents/Development/verum/plugins/neo4j.py", line 197, in enrich
    for record_list in tx.commit():
  File "/usr/local/lib/python2.7/site-packages/py2neo/cypher/core.py", line 294, in commit
    return self.post(self.commit or self.__begin_commit)
  File "/usr/local/lib/python2.7/site-packages/py2neo/cypher/core.py", line 236, in post
    rs = resource.post({"statements": self.statements})
  File "/usr/local/lib/python2.7/site-packages/py2neo/core.py", line 281, in post
    response = self.base.post(body, headers, **kwargs)
  File "/usr/local/lib/python2.7/site-packages/py2neo/packages/httpstream/http.py", line 976, in post
    rq = Request("POST", self.uri, body, headers)
  File "/usr/local/lib/python2.7/site-packages/py2neo/packages/httpstream/http.py", line 382, in __init__
    self.body = json.dumps(body, cls=JSONEncoder, separators=",:")
  File "/usr/local/Cellar/python/2.7.9/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/__init__.py", line 250, in dumps
    sort_keys=sort_keys, **kw).encode(obj)
  File "/usr/local/Cellar/python/2.7.9/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/encoder.py", line 207, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "/usr/local/Cellar/python/2.7.9/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/encoder.py", line 270, in iterencode
    return _iterencode(o, 0)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xe7 in position 28: invalid continuation byte
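
The last line of the traceback can be reproduced in isolation. The sketch below is only an illustration: it assumes the offending value was a Latin-1 encoded byte string, which the issue does not confirm, and the sample bytes are hypothetical. It is written for Python 3, where the decode is explicit; under Python 2.7, json.dumps attempted the same UTF-8 decode implicitly on byte strings.

```python
# Hypothetical sample: "français" encoded as Latin-1, so "ç" is the single byte 0xe7.
raw = "français".encode("latin-1")


def try_utf8(data):
    """Return (text, None) on success, or (None, reason) if data is not valid UTF-8."""
    try:
        return data.decode("utf-8"), None
    except UnicodeDecodeError as exc:
        return None, exc.reason


# 0xe7 announces a three-byte UTF-8 sequence, but the next byte ("a") is
# not a continuation byte, so the decode fails.
text, reason = try_utf8(raw)
print(text, reason)
```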

gdbassett commented 9 years ago

Added -remove_non_ascii_from_graph(g) and removeNonAscii(s) to helper and called them in the osint minion, so that strings containing non-ASCII bytes are stripped before the graph reaches the Neo4j plugin. Added in commit [master 91edb04].
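
The helpers named above are not shown in this thread. As a rough sketch of the same idea, and not Verum's actual code, the functions below strip non-ASCII characters from attribute dictionaries. The names remove_non_ascii and clean_attributes and the dict layout are hypothetical, and the sketch targets Python 3.

```python
def remove_non_ascii(value):
    """Drop every non-ASCII character (or byte) from a str or bytes value."""
    if isinstance(value, bytes):
        # Keep only bytes below 128; the result is then safe to decode as ASCII.
        return bytes(b for b in value if b < 128).decode("ascii")
    if isinstance(value, str):
        return "".join(ch for ch in value if ord(ch) < 128)
    return value  # numbers, None, etc. pass through unchanged


def clean_attributes(attrs):
    """Return a copy of an attribute dict with keys and string values made ASCII-only."""
    return {remove_non_ascii(k): remove_non_ascii(v) for k, v in attrs.items()}


# Hypothetical node attributes containing a Latin-1 encoded byte string.
node = {"uri": b"fran\xe7ais.example", "class": "attribute", "confidence": 1}
cleaned = clean_attributes(node)
print(cleaned)
```

If the graph is a networkx graph (an assumption here), the same function could be applied to each node and edge attribute dictionary before storage. Stripping is lossy: the sample value loses its "ç" entirely. Decoding with the real source encoding, when it is known, would preserve the text.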