KimiNewt / pyshark

Python wrapper for tshark, allowing python packet parsing using wireshark dissectors
MIT License
2.18k stars, 414 forks

crashed while running "for pkt in capture" - unrecognized characters #470

Open dwei98 opened 3 years ago

dwei98 commented 3 years ago

How can I solve this issue? My Python code crashed while reading a packet, with the following traceback:

Traceback (most recent call last):
  File "/home/osboxes/eclipse-workspace/Python_Projects/Industrial_5G_CPE_Network/TCP_Packets_Processing_rev2.py", line 104, in <module>
    for packet in capture:
  File "/home/osboxes/eclipse-workspace/Python_Projects/Industrial_5G_CPE_Network/venv/lib/python3.6/site-packages/pyshark/capture/capture.py", line 242, in _packets_from_tshark_sync
    got_first_packet=packets_captured > 0))
  File "/usr/lib/python3.6/asyncio/base_events.py", line 484, in run_until_complete
    return future.result()
  File "/home/osboxes/eclipse-workspace/Python_Projects/Industrial_5G_CPE_Network/venv/lib/python3.6/site-packages/pyshark/capture/capture.py", line 360, in _get_packet_from_stream
    packet = packet_from_xml_packet(packet, psml_structure=psml_structure)
  File "/home/osboxes/eclipse-workspace/Python_Projects/Industrial_5G_CPE_Network/venv/lib/python3.6/site-packages/pyshark/tshark/tshark_xml.py", line 26, in packet_from_xml_packet
    xml_pkt = lxml.objectify.fromstring(xml_pkt, parser)
  File "src/lxml/objectify.pyx", line 1808, in lxml.objectify.fromstring
  File "src/lxml/etree.pyx", line 3237, in lxml.etree.fromstring
  File "src/lxml/parser.pxi", line 1896, in lxml.etree._parseMemoryDocument
  File "src/lxml/parser.pxi", line 1784, in lxml.etree._parseDoc
  File "src/lxml/parser.pxi", line 1141, in lxml.etree._BaseParser._parseDoc
  File "src/lxml/parser.pxi", line 615, in lxml.etree._ParserContext._handleParseResultDoc
  File "src/lxml/parser.pxi", line 725, in lxml.etree._handleParseResult
  File "src/lxml/parser.pxi", line 654, in lxml.etree._raiseParseError
  File "<string>", line 188
lxml.etree.XMLSyntaxError: Input is not proper UTF-8, indicate encoding ! Bytes: 0xDE 0xEF 0xBE 0xAD, line 188, column 146

blueskylan commented 3 years ago

I have the same issue. I searched for it and found a link (https://osqa-ask.wireshark.org/questions/51447/pyshark-crashes-due-to-no-proper-utf8-encoding/). Following that solution, I added the line xml_pkt = xml_pkt.decode('latin-1') between lines 26 and 27 of src\pyshark\tshark\tshark_xml.py and ran python setup.py install, but then I hit a second issue: AttributeError: no such child: decode. Do you know how to solve the second issue? Thanks!
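A likely cause of the second error: in the traceback above, line 26 of tshark_xml.py is the lxml.objectify.fromstring call, so a line added after it runs on the already-parsed element rather than on the raw bytes, and lxml.objectify treats the unknown attribute decode as a missing child element. The decode would have to happen before parsing. The sketch below shows that ordering using only the standard library's xml.etree.ElementTree as a stand-in for lxml. The helper name parse_packet_xml and the sample field are hypothetical, and this is not the change made in pull request #479.

```python
import xml.etree.ElementTree as ET


def parse_packet_xml(raw: bytes) -> ET.Element:
    """Parse XML bytes, retrying with a latin-1 decode if strict parsing fails.

    Hypothetical helper for illustration only; it is not part of pyshark.
    """
    try:
        # Strict path: without an encoding declaration the parser expects UTF-8.
        return ET.fromstring(raw)
    except ET.ParseError:
        # Fallback: latin-1 maps every byte to a code point, so the decode
        # itself cannot fail. The decode happens BEFORE parsing, on the bytes.
        # Trade-off: any valid multi-byte UTF-8 text in this packet is mangled.
        text = raw.decode("latin-1")
        return ET.fromstring(text)


if __name__ == "__main__":
    # The four bytes reported in the traceback, embedded in a made-up field.
    bad = b'<field name="data" show="' + bytes([0xDE, 0xEF, 0xBE, 0xAD]) + b'"/>'
    field = parse_packet_xml(bad)
    print(repr(field.get("show")))
```

The fallback only runs when strict parsing fails, so well-formed UTF-8 packets are unaffected. Bytes that are illegal in XML under any encoding, such as most control characters below 0x20, would still raise an error in the second parse.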

misterzed88 commented 3 years ago

Looks to be a duplicate of issue #116. Created pull request #479 for a fix.

S4lt5 commented 3 years ago

This did the trick for me. I was looking specifically for malformed packets, and it would die on me.

Thanks!