I've encountered a feed that triggers the following error:
/opt/minemeld/engine/0.9.44/local/lib/python2.7/site-packages/minemeld/ft/basepoller.py:510: UnicodeWarning: Unicode unequal comparison failed to convert both arguments to Unicode - interpreting them as being unequal
if oa.get(k, None) != na[k]:
2018-01-17T14:37:00 (305)basepoller._poll ERROR: Exception in polling loop for malwaredomainlist_com: Overlong 2 byte UTF-8 sequence detected when encoding string
Traceback (most recent call last):
File "/opt/minemeld/engine/0.9.44/local/lib/python2.7/site-packages/minemeld/ft/basepoller.py", line 721, in _poll
performed = self._polling_loop()
File "/opt/minemeld/engine/0.9.44/local/lib/python2.7/site-packages/minemeld/ft/basepoller.py", line 648, in _polling_loop
self.table.put(indicator, v)
File "/opt/minemeld/engine/0.9.44/local/lib/python2.7/site-packages/minemeld/ft/basepoller.py", line 113, in put
return self.table.put(indicator, value)
File "/opt/minemeld/engine/0.9.44/local/lib/python2.7/site-packages/minemeld/ft/table.py", line 318, in put
batch.put(ikey, struct.pack(">Q", cversion)+ujson.dumps(value))
OverflowError: Overlong 2 byte UTF-8 sequence detected when encoding string
A simple, approximate reproduction is as follows:
#!/usr/bin/python
import ujson as json
l = '\xc1'
print(json.dumps(l))
This is solved by recoding the string to UTF-8, which appears to be the encoding hardcoded in MineMeld:
#!/usr/bin/python
import ujson as json
l = '\xc1'
print(json.dumps(l.decode('latin_1').encode('utf_8')))
I suggest modifying the basepoller to support other feed encodings, so that feed content can be recoded to UTF-8, which appears to be what MineMeld uses internally.
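To illustrate the kind of change I mean, here is a minimal sketch of a recoding step with a configurable source encoding. The helper name `recode_to_utf8` and its `source_encoding` parameter are my own invention for illustration; nothing like this exists in MineMeld as far as I know, and where such a step would hook into the basepoller is left open.

```python
# Hypothetical helper, not part of MineMeld: normalise raw feed bytes to UTF-8.
def recode_to_utf8(raw, source_encoding='latin_1'):
    """Decode raw feed bytes using source_encoding, then re-encode as UTF-8."""
    return raw.decode(source_encoding).encode('utf_8')


# The same single Latin-1 byte used in the reproduction above.
raw_indicator = b'\xc1'
utf8_indicator = recode_to_utf8(raw_indicator)
print(repr(utf8_indicator))
```

The source encoding could then be a per-feed setting that defaults to UTF-8, so existing feeds would behave as they do today.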
Does this make sense?