In class androguard.core.bytecodes.apk.StringBlock, the methods decode() and
decode2() do not correctly decode the byte array into a unicode string.
The code currently converts each byte into a unicode character so long as the
byte is within the valid ascii range. Any byte value outside of the valid
ascii range is ignored. Consequently, the translated unicode string contains
only ascii values.
This is incorrect because the byte array is encoded in either utf-8 or utf-16,
which may contain legitimate code points outside the ascii range. Each byte
within the byte array should be appended to the 'data' string without
modification. Then the 'data' string can be decoded into a unicode string
using the appropriate encoding.
A simple solution would be to replace the following lines in both methods:
    t_data = pack("=b", self.m_strings[offset + i])
    data += unicode(t_data, errors='ignore')
with:
    data += pack("=b", self.m_strings[offset + i])
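To illustrate the difference outside of androguard, here is a minimal standalone sketch in Python 3. The function names decode_per_byte() and decode_whole() are hypothetical and are not part of androguard. The sketch assumes the input is a plain bytes object, so it packs with "=B" (unsigned) rather than the "=b" used in the source, and it passes "ascii" explicitly to mirror what unicode(t_data, errors='ignore') does by default in Python 2.

```python
from struct import pack


def decode_per_byte(raw):
    """Mimic the reported behaviour: decode every byte on its own as ASCII
    and silently drop any byte that is not valid ASCII."""
    data = ""
    for value in raw:
        t_data = pack("=B", value)
        data += t_data.decode("ascii", errors="ignore")
    return data


def decode_whole(raw, encoding="utf-8"):
    """Mimic the proposed fix: collect the raw bytes unmodified, then decode
    the complete byte string once with the real encoding."""
    data = b""
    for value in raw:
        data += pack("=B", value)
    return data.decode(encoding)


sample = "Привет, мир".encode("utf-8")
print(repr(decode_per_byte(sample)))  # only the comma and the space survive
print(decode_whole(sample))           # the full string is recovered
```

The same decode_whole() call should work for utf-16 data if the matching codec name (for example "utf-16-le") is passed as the encoding argument.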
What steps will reproduce the problem?
1. Get an ARSCParser object from an android application containing a
resources.arsc file with resource string values containing unicode code points
outside of the ascii range (e.g. Russian, Chinese, or Arabic text).
2. Write the results of a call to the method get_strings_resources() to a local
xml file
3. Open the xml file in a web browser and examine the string values
What is the expected output? What do you see instead?
The output should display the unicode characters outside of the ascii range
correctly. Instead, unicode characters outside of the ascii range are omitted
altogether.
What version of the product are you using? On what operating system?
androguard-1.9 on xubuntu 14.04
Please provide any additional information below.
I applied the proposed solution to the source, re-built and re-installed
androguard. I re-ran my test android apk and verified that the output xml file
contains resource string values with the appropriate unicode characters.
Original issue reported on code.google.com by kristo...@gmail.com on 11 Sep 2014 at 2:16