base62 decoding not working

mufeedvh / basecrack

Decode All Bases - Base Scheme Decoder

MIT License


base62 decoding not working #4

Status: Closed (beuguissime closed this issue 4 years ago)

beuguissime commented 4 years ago

Hi,

I think there is a bug in the base62 decoding attempt.

Shouldn't the line

base62_decode = base62.encode(int(encoded_base)).decode('utf-8', 'replace')

instead read something like this?

base62_decode = str(base62.decode(encoded_base))

For instance, with base-62 from PyPI I get:

In [3]: base62.encode(999)
Out[3]: 'g7'
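
To make the encode/decode mix-up concrete, here is a minimal, self-contained sketch of base62 integer conversion. It does not use the base-62 package: the alphabet order (digits, then lowercase, then uppercase) is an assumption chosen because it reproduces the 'g7' result above, and the names `ALPHABET`, `b62_encode` and `b62_decode` are hypothetical.

```python
import string

# Assumed alphabet: digits, then lowercase, then uppercase. This ordering
# reproduces the 'g7' result quoted above; it is not copied from any library.
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase


def b62_encode(number: int) -> str:
    """Turn a non-negative integer into a base62 string."""
    if number < 0:
        raise ValueError("number must be non-negative")
    if number == 0:
        return ALPHABET[0]
    digits = []
    while number:
        number, remainder = divmod(number, 62)
        digits.append(ALPHABET[remainder])
    # Remainders come out least-significant first, so reverse them.
    return "".join(reversed(digits))


def b62_decode(text: str) -> int:
    """Turn a base62 string back into an integer."""
    value = 0
    for char in text:
        # str.index raises ValueError for characters outside the alphabet.
        value = value * 62 + ALPHABET.index(char)
    return value


print(b62_encode(999))   # g7
print(b62_decode("g7"))  # 999
```

A decoder has to go from the string to the integer, which is why calling an encode function on `int(encoded_base)` cannot work for an input such as 'g7'.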

With the original basecrack.py:

$ python3 basecrack.py -b 'g7'
[-] Encoded Base: g7

[>] Decoding as Base91: 

[>] Decoding as Base92: Â

With my suggested modification:

$ python3 basecrack.py -b 'g7'
[-] Encoded Base: g7

[>] Decoding as Base62: 999

[>] Decoding as Base91: 

[>] Decoding as Base92: Â

[-] The Encoding Scheme Is Base62

Moreover, I get different results when I encode an integer with base-62 and with pybase62 (both taken from PyPI).

mufeedvh commented 4 years ago

Hey @beuguissime, first of all, I am really sorry for the late response; this issue was buried in my notifications. :(

Nice catch, my implementation was indeed wrong. Thank you so much for making a PR and writing such a detailed issue, I really appreciate it! :heart::clap:

> Moreover, I get different results when I encode an integer with base-62 and with pybase62 (both taken from PyPI).

Seems like an issue with the same root cause. I have started working on the next release, and I may write custom base62 code for it, just like I did for base92.

Again, thank you for bringing this to my attention and also fixing it with a great PR! :heart::raised_hands:

beuguissime commented 3 years ago

Hi @mufeedvh

No problem at all, we all have a busy life, especially during the pandemic. Thanks for addressing my issue and for your very nice reply.

Regarding my comment about base-62 vs pybase62, I later understood that the disagreement is a matter of convention: whether the charset is '0123456789ABCD...abcd...' or '0123456789abcd...ABCD...'. Both choices are valid, so I suggest that basecrack test both and report whether the decoding succeeded with the direct or the inverted charset.
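
A rough sketch of what testing both conventions could look like is given here. It is self-contained and does not reflect basecrack's actual code or either PyPI package; the charset labels and the `decode_with` / `decode_both` helpers are hypothetical names introduced only for illustration.

```python
import string

# Hypothetical labels for the two conventions discussed above.
CHARSETS = {
    "digits-upper-lower": string.digits + string.ascii_uppercase + string.ascii_lowercase,
    "digits-lower-upper": string.digits + string.ascii_lowercase + string.ascii_uppercase,
}


def decode_with(text: str, alphabet: str) -> int:
    """Decode text as a base-N integer, where N is the alphabet length."""
    value = 0
    for char in text:
        # str.index raises ValueError for characters outside the alphabet.
        value = value * len(alphabet) + alphabet.index(char)
    return value


def decode_both(text: str) -> dict:
    """Return {charset label: decoded integer} for each convention."""
    return {label: decode_with(text, alphabet) for label, alphabet in CHARSETS.items()}


for label, value in decode_both("g7").items():
    print(f"{label}: {value}")
# digits-upper-lower: 2611
# digits-lower-upper: 999
```

Because both charsets contain the same 62 characters, any alphanumeric input decodes under both conventions, only to different integers. A tool would therefore probably need to report both candidates rather than treat one of them as the single successful decoding.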

mufeedvh commented 3 years ago

Hey @beuguissime,

Yeah that seems like a really good plan, I will implement it when I get some time off work! :+1:

Again, thank you so much for your contribution to this project. :heart::raised_hands:

r4gn4r0x commented 3 years ago

Consider using codext; it supports many base encodings, including 45, 58, 62 and many others...