Polyconseil / zbarlight

A simple wrapper for zbar
BSD 3-Clause "New" or "Revised" License

data decode is Garbled #25

Closed wydonglove closed 6 years ago

wydonglove commented 6 years ago

After calling "scan_codes" on a picture, I decode the data as "utf-8" or "gbk", but both results are garbled. The target language is Chinese. Why does this happen, and how can I resolve it? Thanks.

rbarrois commented 6 years ago

Hi,

Can you provide us with an example image and the resulting output?

wydonglove commented 6 years ago

[Attached image: QR code]

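That is a QR code picture. This is my code:

#########
import zbarlight

# `image` is the picture above, loaded beforehand (loading code not shown in the post)
codes = zbarlight.scan_codes(['qrcode'], image)
item = []
for code in codes:
    print("type(code):{},code:{}".format(type(code), code))
    basetr_0 = code.decode('utf-8')  # attempt 1: decode as UTF-8
    basetr1 = code.decode('GBK')     # attempt 2: decode as GBK
    print("basetr0:%s" % basetr_0)
#########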

The printed output is shown below:

type(code):<class 'bytes'>,code:b'\xc3\x90\xc3\xad\xc2\xbf\xc3\x89\xc3\x96\xc2\xa4\xc2\xba\xc3\x85: JY25101850025950\n\r\xc2\xbe\xc2\xad\xc3\x93\xc2\xaa\xc3\x95\xc3\x9f\xc3\x83\xc3\xbb\xc2\xb3\xc3\x86: \xc3\x8c\xc3\xac\xc2\xb8\xc2\xae\xc3\x90\xc3\x82\xc3\x87\xc3\xb8\xc2\xb3\xc3\x89\xc2\xb6\xc2\xbc\xc3\x86\xc2\xac\xc3\x87\xc3\xb8\xc2\xbb\xc2\xaa\xc3\x91\xc3\xb4\xc3\x9f\xc2\xa3\xc2\xb0\xc3\x8d\xc3\x8a\xc3\x8a\xc2\xb4\xc2\xa8\xc3\x8f\xc3\xa3\xc3\x82\xc3\xa8\xc3\x82\xc3\xa8\xc2\xb2\xc3\x8b\xc2\xb2\xc3\x8d\xc2\xb9\xc3\x9d\n\r\xc3\x89\xc3\xa7\xc2\xbb\xc3\xa1\xc3\x90\xc3\x85\xc3\x93\xc3\x83\xc2\xb4\xc3\xba\xc3\x82\xc3\xab: 92510100MA6CQUTG8C\n\r\xc2\xb7\xc2\xa8\xc2\xb6\xc2\xa8\xc2\xb4\xc3\xba\xc2\xb1\xc3\xad\xc3\x88\xc3\x8b(\xc2\xb8\xc2\xba\xc3\x94\xc3\xb0\xc3\x88\xc3\x8b): \xc2\xb3\xc3\x82\xc3\xa8\xc3\xab\n\r\xc3\x93\xc3\x90\xc3\x90\xc2\xa7\xc3\x86\xc3\x9a\xc3\x8f\xc3\x9e: 2022\xc3\x84\xc3\xaa09\xc3\x94\xc3\x8224\xc3\x88\xc3\x95\n\r\xc3\x88\xc3\x95\xc2\xb3\xc2\xa3\xc2\xbc\xc3\xa0\xc2\xb6\xc2\xbd\xc2\xb9\xc3\x9c\xc3\x80\xc3\xad\xc3\x88\xc3\x8b\xc3\x94\xc2\xb1:\xc2\xb8\xc3\x9f\xc2\xbc\xc3\x91 \xc2\xbd\xc2\xad\xc3\x8e\xc3\x84\n\r\xc3\x87\xc2\xa9\xc2\xb7\xc2\xa2\xc3\x88\xc3\x8b:\xc3\x80\xc3\xae\xc2\xb2\xc2\xa8'
basetr0:Ðí¿ÉÖ¤ºÅ: JY25101850025950 ¾­ÓªÕßÃû³Æ: Ì츮ÐÂÇø³É¶¼Æ¬Çø»ªÑôߣ°ÍÊÊ´¨ÏãÂèÂè²Ë²Í¹Ý Éç»áÐÅÓôúÂë: 92510100MA6CQUTG8C ·¨¶¨´ú±íÈË(¸ºÔðÈË): ³Âèë ÓÐЧÆÚÏÞ: 2022Äê09ÔÂ24ÈÕ ÈÕ³£¼à¶½¹ÜÀíÈËÔ±:¸ß¼Ñ ½­ÎÄ Ç©·¢ÈË:À

I convert the bytes to str with "utf-8". The result should be Chinese, but it is garbled.

rbarrois commented 6 years ago

It looks like your data is not encoded in UTF-8, but in the GBK encoding:

>>> print(raw.decode('gbk'))
"""脨铆驴脡脰陇潞脜: JY25101850025950
戮颅脫陋脮脽脙没鲁脝: 脤矛赂庐脨脗脟酶鲁脡露录脝卢脟酶禄陋脩么脽拢掳脥脢脢麓篓脧茫脗猫脗猫虏脣虏脥鹿脻
脡莽禄谩脨脜脫脙麓煤脗毛: 92510100MA6CQUTG8C
路篓露篓麓煤卤铆脠脣(赂潞脭冒脠脣): 鲁脗猫毛
脫脨脨搂脝脷脧脼: 2022脛锚09脭脗24脠脮
脠脮鲁拢录脿露陆鹿脺脌铆脠脣脭卤:赂脽录脩 陆颅脦脛
脟漏路垄脠脣:脌卯虏篓"""

wydonglove commented 6 years ago

Decoding as "gbk" is also incorrect: "脨铆驴脡脰陇潞脜: JY25101850025950 ..." is garbled. The correct content is: 许可证号: JY25101850025950 经营者名称: 天府新区成都片区华阳撸巴适川香妈妈菜餐馆 .......

rbarrois commented 6 years ago

Oh, that's a problem.

Based on your input, I think you should find a clear declaration of the expected encoding for the QRCode; it seems that you're looking for an encoding that would encode CJK characters on 32 bits.

The fact that the "ascii" part went through properly would indicate that the raw QRCode parsing went well; beyond that, I am unable to help with advanced encoding systems :/
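
One reading of the posted bytes that the thread does not explore: they are consistent with GBK text that was interpreted as Latin-1 (ISO-8859-1) and then re-encoded as UTF-8 before reaching Python. This is an inference from the byte string in the report, not something confirmed in the thread. Under that assumption, reversing the two steps recovers the expected text. A minimal sketch follows; `undo_latin1_roundtrip` is a hypothetical helper name, and the sample is the first line of the posted output.

```python
def undo_latin1_roundtrip(raw: bytes, real_encoding: str = "gbk") -> str:
    """Reverse a 'real_encoding bytes read as Latin-1, then written as UTF-8' mix-up.

    Assumption: every decoded code point is <= U+00FF, which holds when the
    intermediate interpretation really was Latin-1.
    """
    # Step 1: undo the UTF-8 layer; this yields one code point per original byte.
    as_latin1_text = raw.decode("utf-8")
    # Step 2: map each code point back to the single byte it came from.
    original_bytes = as_latin1_text.encode("latin-1")
    # Step 3: decode the recovered bytes with the encoding the QR code really used.
    return original_bytes.decode(real_encoding)


# First line of the bytes posted in the report.
sample = b'\xc3\x90\xc3\xad\xc2\xbf\xc3\x89\xc3\x96\xc2\xa4\xc2\xba\xc3\x85: JY25101850025950'
print(undo_latin1_roundtrip(sample))
```

On this sample the helper returns the first line the reporter expected (许可证号: JY25101850025950). If the bytes were not produced by such a round trip, step 2 or step 3 raises a `UnicodeError` rather than silently returning wrong text.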