volatilityfoundation / volatility3

Volatility 3.0 development
http://volatilityfoundation.org/

windows.filescan: UnicodeEncodeError: 'gbk' codec can't encode character '\u0cc9' in position 12: illegal multibyte sequence #637

Closed HasegawaAzusa closed 2 years ago

HasegawaAzusa commented 2 years ago

Errors:

  PS C:\Users\53124\Desktop> vol.exe -f .\ACTUE.raw windows.filescan > 111.txt
  Traceback (most recent call last):B scanning finished
    File "C:\Users\53124\AppData\Local\Programs\Python\Python39\lib\runpy.py", line 197, in _run_module_as_main
      return _run_code(code, main_globals, None,
    File "C:\Users\53124\AppData\Local\Programs\Python\Python39\lib\runpy.py", line 87, in _run_code
      exec(code, run_globals)
    File "C:\Users\53124\AppData\Local\Programs\Python\Python39\Scripts\vol.exe\__main__.py", line 7, in <module>
    File "C:\Users\53124\AppData\Local\Programs\Python\Python39\lib\site-packages\volatility3\cli\__init__.py", line 587, in main
      CommandLine().run()
    File "C:\Users\53124\AppData\Local\Programs\Python\Python39\lib\site-packages\volatility3\cli\__init__.py", line 317, in run
      renderers[args.renderer]().render(constructed.run())
    File "C:\Users\53124\AppData\Local\Programs\Python\Python39\lib\site-packages\volatility3\cli\text_renderer.py", line 178, in render
      grid.populate(visitor, outfd)
    File "C:\Users\53124\AppData\Local\Programs\Python\Python39\lib\site-packages\volatility3\framework\renderers\__init__.py", line 217, in populate
      accumulator = function(treenode, accumulator)
    File "C:\Users\53124\AppData\Local\Programs\Python\Python39\lib\site-packages\volatility3\cli\text_renderer.py", line 173, in visitor
      accumulator.write("{}".format("\t".join(line)))
  UnicodeEncodeError: 'gbk' codec can't encode character '\u0cc9' in position 12: illegal multibyte sequence

I suspect this happens because encoding conversion is not being handled, but I can't fix it myself and hope you can. I really like vol3; it is much more efficient than vol2. But I lost a lot of time to this bug while doing memory forensics at night, never got it working, and eventually went back to vol2.

ikelos commented 2 years ago

Hi there, thanks for your issue. We output the data as unicode characters; the conversion to gbk is probably due to your local terminal configuration. Sadly, gbk doesn't support all possible unicode characters. Perhaps try a local terminal encoding of gb18030 instead, which apparently does support all unicode characters?

As you can see, it's possible for the character to be printed normally, but it cannot be encoded to gbk specifically:

>>> print("\u0cc9")
೉
>>> print("\u0cc9".encode('gbk'))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'gbk' codec can't encode character '\u0cc9' in position 0: illegal multibyte sequence
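
For comparison, here is a small sketch that tries the same character against the codecs mentioned in this thread, plus UTF-8. It only relies on Python's built-in codecs; the helper name `can_encode` is made up for illustration and is not part of Volatility.

```python
def can_encode(text: str, encoding: str) -> bool:
    """Return True if `text` encodes with `encoding` without raising."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


ch = "\u0cc9"  # the character from the traceback in the original report
for codec in ("gbk", "gb18030", "utf-8"):
    # Report, per codec, whether the character is representable at all.
    print(codec, can_encode(ch, codec))
```

Only `gbk` should report `False` here, which matches the suggestion to switch to an encoding that covers all of unicode.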

I hope you can adapt your local terminal to support all unicode characters. We don't choose any specific encoding to convert characters to, and we're not keen to arbitrarily replace unsupported characters with ?, since the user won't easily be able to tell whether it should have been an actual question mark or is simply an unsupported character. Volatility 2 had a tendency to try to cover over issues like this, which made it easier for users at the cost of them understanding the implications of the changes volatility was making to the output. This way you can easily identify (and hopefully fix) the problem.
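
The ambiguity described here is easy to reproduce outside Volatility. The sketch below is not Volatility code, and the file name in it is a made-up example. It encodes a name that contains both a real question mark and an unsupported character, first with `errors="replace"` and then with `errors="backslashreplace"`, a built-in handler that keeps the code point visible.

```python
name = "report?\u0cc9.txt"  # hypothetical file name, not from a real memory image

# 'replace' turns the unsupported character into '?', so it can no longer
# be told apart from the genuine question mark in the name.
replaced = name.encode("gbk", errors="replace").decode("gbk")

# 'backslashreplace' substitutes an escape sequence instead, so the
# original code point stays recoverable from the output.
escaped = name.encode("gbk", errors="backslashreplace").decode("gbk")

print(replaced)  # report??.txt
print(escaped)   # report?\u0cc9.txt
```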

I'm going to mark this as closed, because I don't believe it to be a bug in volatility, but if you still have an issue after changing the encoding of your local console, please feel free to reopen it and we can take another look. :)

shaoyi1998 commented 3 months ago

Maybe this isn't a terminal issue. Even after switching the console to UTF-8 (chcp 65001), I get the same error; the only difference is that the traceback now refers to vol.py. It's strange.

  a = subprocess.run(r'chcp 65001&&E:\ctf_tools\memory_analysis\assets\vol.exe -f "C:\Users\11711\Desktop\browser\browser.raw" filescan',shell=True)
  print(a.stdout)
  Active code page: 65001
  Traceback (most recent call last):
    File "vol.py", line 10, in <module>
    File "volatility3\cli\__init__.py", line 877, in main
    File "volatility3\cli\__init__.py", line 469, in run
    File "volatility3\cli\text_renderer.py", line 203, in render
    File "volatility3\framework\renderers\__init__.py", line 251, in populate
    File "volatility3\cli\text_renderer.py", line 198, in visitor
  UnicodeEncodeError: 'gbk' codec can't encode character '\uf8a0' in position 13: illegal multibyte sequence
  [12900] Failed to execute script 'vol' due to unhandled exception!
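
One possible explanation, offered as an assumption rather than a diagnosis of this setup: when standard output is a pipe or a file instead of a console, Python picks the output encoding from the locale (the ANSI code page on Windows), not from the console code page, so `chcp 65001` may have no effect on it. The sketch below starts a child Python interpreter with piped stdout and reports the encoding it chose, once with inherited settings and once with the standard `PYTHONIOENCODING` environment variable set. The helper name `child_stdout_encoding` is made up for illustration. Whether a PyInstaller-built `vol.exe` honours that variable is not confirmed anywhere in this thread.

```python
import os
import subprocess
import sys


def child_stdout_encoding(extra_env=None):
    """Run a child interpreter with piped stdout; return its sys.stdout.encoding."""
    env = dict(os.environ)
    env.update(extra_env or {})
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.stdout.encoding)"],
        capture_output=True,  # stdout is a pipe here, not a console
        env=env,
        check=True,
    )
    # Encoding names are plain ASCII, so decoding the reply is safe.
    return result.stdout.decode("ascii").strip()


print("inherited settings:", child_stdout_encoding())
print("PYTHONIOENCODING=utf-8:", child_stdout_encoding({"PYTHONIOENCODING": "utf-8"}))
```

On a Chinese-locale Windows machine the first line would be expected to show a GBK-family code page even after `chcp 65001`, while the second line shows the override taking effect.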

shaoyi1998 commented 3 months ago

Change text_renderer.py from

            # old
            accumulator.write("{}".format("\t".join(line)))

to

            # new
            encoded_line = "\t".join(line).encode('utf-8', errors='replace')
            accumulator.buffer.write(encoded_line)
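
A slightly more defensive variant of this idea is sketched below. It is not the upstream Volatility code: `write_line` is a hypothetical helper, and the gbk stream is simulated with `io.TextIOWrapper` so the snippet runs on any platform. It flushes the text layer before writing bytes to `.buffer`, so that earlier text writes cannot end up after the raw bytes, and it falls back to a plain text write for streams that have no `.buffer` (such as `io.StringIO`).

```python
import io


def write_line(stream, fields):
    """Write tab-joined fields as UTF-8 bytes, bypassing the stream's own codec."""
    text = "\t".join(fields)
    raw = getattr(stream, "buffer", None)
    if raw is None:
        # Text-only stream (for example io.StringIO): nothing to bypass.
        stream.write(text)
        return
    # Flush pending text first so byte and text writes keep their order.
    stream.flush()
    raw.write(text.encode("utf-8", errors="replace"))


# Simulate a stdout whose text layer uses gbk, as in the tracebacks above.
backing = io.BytesIO()
gbk_stdout = io.TextIOWrapper(backing, encoding="gbk", newline="\n")
gbk_stdout.write("header\n")  # ordinary text write through the gbk layer
write_line(gbk_stdout, ["0x1234", "\\name\u0cc9"])
gbk_stdout.flush()
print(backing.getvalue())
```

One trade-off to keep in mind: any non-ASCII text still written through the gbk text layer would sit next to UTF-8 bytes in the same output, so the result is only consistently UTF-8 if every write goes through the same path.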

and rebuild the executable with pyinstaller vol.spec.