SMBv1 stack doesn't talk unicode on the wire

GoogleCodeExporter commented 9 years ago

Hi mate,

sorry, another encoding / decoding issue... again... :)

This one can be triggered on Windows systems prior to Vista only (2000, XP, 
2003, etc.) because it only affects the old SMBv1 stack implemented in smb.py 
(the issue doesn't affect smb3.py). To trigger the bug on recent systems, you 
have to force the SMB Dialect to SMB_DIALECT.

I found this bug while browsing a remote filesystem with filenames created 
using a non-"ASCII-derived" code page like cp866 for the Cyrillic charset (don't 
ask any questions about this remote filesystem :]). 

By the way, the default code page on the remote server wasn't cp866 (this file 
probably came from a previous Windows migration or something like that).

How to reproduce the bug:
+++++++++++++++++++++++++

1. create a directory on the remote server using code page 866:

C:> chcp 866
C:> mkdir Ь

Ь = \x9c in cp866 and \u042C in unicode

2. connect to the remote server using smbclient.py

$> python smbclient.py 
Impacket v0.9.13-dev - Copyright 2002-2014 Core Security Technologies

Type help for list of commands
# open 1.2.3.4
[*] SMBv1 dialect used
# login DOMAIN/Administrator
Password:
[*] USER Session Granted
# use C$
# ls
-rw-rw-rw-          0  Thu Jun 21 15:01:57 2012 AUTOEXEC.BAT
-rw-rw-rw-        211  Thu Jun 21 14:57:29 2012 boot.ini
-rw-rw-rw-          0  Thu Jun 21 15:01:57 2012 CONFIG.SYS
drw-rw-rw-          0  Mon Nov 18 11:28:12 2013 Documents and Settings
drw-rw-rw-          0  Tue Sep  2 18:06:04 2014 Ê? <----------------------- here is the weird directory
-rw-rw-rw-          0  Thu Jun 21 15:01:57 2012 IO.SYS
-rw-rw-rw-          0  Thu Jun 21 15:01:57 2012 MSDOS.SYS
drw-rw-rw-          0  Tue Sep 24 13:38:20 2013 MSOCache
-rw-rw-rw-      47564  Mon Apr 14 13:59:59 2008 NTDETECT.COM
-rw-rw-rw-     250048  Mon Apr 14 13:59:59 2008 ntldr
-rw-rw-rw- 1610612736  Mon Nov 18 11:34:04 2013 pagefile.sys
drw-rw-rw-          0  Wed Sep 25 00:13:50 2013 Program Files
drw-rw-rw-          0  Tue Sep 24 22:17:15 2013 RECYCLER
drw-rw-rw-          0  Thu Jun 21 15:20:23 2012 System Volume Information
drw-rw-rw-          0  Tue Oct 29 15:10:19 2013 temp
drw-rw-rw-          0  Wed Sep  3 18:50:13 2014 WINDOWS

3. try to browse the directory

# cd Ь
[!] SMB SessionError: STATUS_OBJECT_NAME_NOT_FOUND(The object name is not found.)
# cd Ê?
[!] SMB SessionError: STATUS_OBJECT_NAME_INVALID(The object name is invalid.)

As you can see, you can't browse the directory because your SMB stack doesn't 
properly encode the filename before sending it on the wire.

Where is the issue located?
+++++++++++++++++++++++++++

After some investigation, it seems that your SMBv1 stack talks to the remote 
server using non-Unicode strings, unlike smb3.py (technically, the flag 
SMB.FLAGS2_UNICODE is not set in the "Flags2" parameter). Consequently, when 
your stack receives or sends a packet, all strings (including filenames) are 
encoded using some default 8-bit encoding (cp1252 or similar). 

So, if the original filename's encoding isn't an "ASCII-derived" one, your 
stack won't properly convert the directory's name before sending it on the wire 
and, in the end, the server won't be able to find the required directory.
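To illustrate why an 8-bit wire encoding is fragile, here is a minimal sketch using only Python's standard codecs. It decodes the single byte \x9c under three code pages; cp437 is added purely for comparison and is not mentioned in the report.

```python
# One raw byte, as produced for the directory name under cp866.
raw = b"\x9c"

# The same byte read back under three different 8-bit code pages.
as_cp866 = raw.decode("cp866")    # U+042C, the Cyrillic soft sign
as_cp1252 = raw.decode("cp1252")  # U+0153
as_cp437 = raw.decode("cp437")    # U+00A3 (cp437 is only here for comparison)

# Print code points rather than glyphs so the output is console-independent.
for codec, text in (("cp866", as_cp866), ("cp1252", as_cp1252), ("cp437", as_cp437)):
    print(codec, "U+%04X" % ord(text))
```

The byte alone does not say which character it stands for; both ends have to agree on the code page, and nothing in a non-Unicode request carries that agreement.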

In the previous case, creating a directory called "Ь" using cp866 produces 
this binary string on the remote server: \x9c. 

But when you list the directory containing this file, the server sends 
the following binary string on the wire: \xca\x3f\x00 (trouble begins...). 

Then, when you try to browse the directory using the SMBv1 stack, the 
following binary string is sent on the wire: \x5c\xd0\xac\x00 (it becomes 
anything and everything except the right name...). From the remote server's 
point of view, this filename doesn't mean anything, so it returns an error.
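The byte strings quoted above can be checked with Python's standard codecs. The sketch below shows that \x5c\xd0\xac\x00 is exactly a backslash, the UTF-8 encoding of U+042C and a NUL terminator. That it comes from a UTF-8 terminal passing its input through untouched is an inference, not something stated in the report. The UTF-16LE form is included only as the shape a Unicode request would be expected to carry.

```python
name = "\u042c"  # CYRILLIC CAPITAL LETTER SOFT SIGN

# What the cp866 console produced for the name.
oem_cp866 = name.encode("cp866")                           # b'\x9c'

# Backslash + name as UTF-8 + single NUL: matches the bytes seen on the wire.
utf8_path = ("\\" + name).encode("utf-8") + b"\x00"        # b'\\\xd0\xac\x00'

# Same path as a NUL-terminated UTF-16LE string (alignment padding ignored).
utf16_path = ("\\" + name).encode("utf-16-le") + b"\x00\x00"

print(oem_cp866, utf8_path, utf16_path)
```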

This issue doesn't affect smb3.py because that stack talks Unicode with the 
remote server, so even if the filename is mangled, client and server still 
understand each other.

How to fix it:
++++++++++++++

There is no easy way to fix the issue, but I think standardizing the SMBv1 and 
SMBv2 stacks would be the best solution. By "standardizing", I mean making sure 
that both of your APIs talk Unicode on the wire like smb3.py already does. When 
talking Unicode, opening a file whose name is encoded with a pretty weird code 
page won't break your SMB stack.
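A minimal sketch of what "talking Unicode" means at the string level, assuming the usual SMB convention that the Flags2 Unicode bit (0x8000) selects UTF-16LE. The helper name `encode_wire_string` and its `oem_codec` parameter are hypothetical and are not impacket's API; alignment padding for Unicode strings is left out.

```python
FLAGS2_UNICODE = 0x8000  # value of the SMB_FLAGS2_UNICODE bit


def encode_wire_string(text, flags2, oem_codec="cp437"):
    """Hypothetical helper: encode a NUL-terminated string for an SMBv1 request.

    With the Unicode bit set, the string is UTF-16LE with a two-byte NUL.
    Otherwise it is encoded with the given 8-bit OEM codec (an assumption:
    the caller has to know which code page the server expects).
    """
    if flags2 & FLAGS2_UNICODE:
        return text.encode("utf-16-le") + b"\x00\x00"
    return text.encode(oem_codec) + b"\x00"


unicode_bytes = encode_wire_string("\\\u042c", FLAGS2_UNICODE)
oem_bytes = encode_wire_string("\\\u042c", 0, oem_codec="cp866")
print(unicode_bytes, oem_bytes)
```

With the bit set, any name the server can list can also be sent back. Without it, the call raises UnicodeEncodeError as soon as the chosen code page lacks the character.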

The patch:
+++++++++

Here is my patch to make it work. It prevents encoding / decoding problems in 
your SMBv1 stack even if filenames use weird and historical code pages.

After several tests, it doesn't break your examples. I just changed smbclient.py 
a little to convert input paths to Unicode before sending them to your SMB stack:

$> python smbclient.py 
Impacket v0.9.13-dev - Copyright 2002-2014 Core Security Technologies

Type help for list of commands
# open 1.2.3.4
[*] SMBv1 dialect used
# login DOMAIN/Administrator
Password:
[*] USER Session Granted
# use C$
# ls
-rw-rw-rw-          0  Thu Jun 21 15:01:57 2012 AUTOEXEC.BAT
-rw-rw-rw-        211  Thu Jun 21 14:57:29 2012 boot.ini
-rw-rw-rw-          0  Thu Jun 21 15:01:57 2012 CONFIG.SYS
drw-rw-rw-          0  Mon Nov 18 11:28:12 2013 Documents and Settings
drw-rw-rw-          0  Tue Sep  2 18:06:04 2014 ╨м <----------------------- here is the weird directory
-rw-rw-rw-          0  Thu Jun 21 15:01:57 2012 IO.SYS
-rw-rw-rw-          0  Thu Jun 21 15:01:57 2012 MSDOS.SYS
drw-rw-rw-          0  Tue Sep 24 13:38:20 2013 MSOCache
-rw-rw-rw-      47564  Mon Apr 14 13:59:59 2008 NTDETECT.COM
-rw-rw-rw-     250048  Mon Apr 14 13:59:59 2008 ntldr
-rw-rw-rw- 1610612736  Mon Nov 18 11:34:04 2013 pagefile.sys
drw-rw-rw-          0  Wed Sep 25 00:13:50 2013 Program Files
drw-rw-rw-          0  Tue Sep 24 22:17:15 2013 RECYCLER
drw-rw-rw-          0  Thu Jun 21 15:20:23 2012 System Volume Information
drw-rw-rw-          0  Tue Oct 29 15:10:19 2013 temp
drw-rw-rw-          0  Thu Sep  4 11:42:42 2014 WINDOWS
# cd ╨м
# ls
drw-rw-rw-          0  Thu Sep  4 11:43:28 2014 .
drw-rw-rw-          0  Thu Sep  4 11:43:28 2014 ..

As you can see, the display isn't pretty because my remote server doesn't use 
the cp866 code page as its default. So, when it tries to convert the filename to 
Unicode in order to send it on the wire, it fails and sends an invalid Unicode 
string. Technically, it tries to convert a filename created using cp866 to 
Unicode using its default code page (I think it's cp1252) and, obviously, that 
doesn't work.

But with this patch, it doesn't matter because client and server understand 
each other. When I try to browse the Cyrillic directory, I send back the invalid 
Unicode filename that the remote server sent me earlier when I listed the 
parent directory. The remote server lands on its feet because it tries 
to decode the invalid Unicode filename using its default code page and 
finally comes back to the cp866 filename.
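The round trip described above only works if the listed name comes back to the server unchanged. Here is a small sketch of that property, assuming the two characters shown in the patched listing are U+2568 and U+043C, and using cp1252 merely as a stand-in for an 8-bit default (the replacement bytes it produces are not the ones the real server sent).

```python
# The name as the patched client displayed it (assumed code points).
listed = "\u2568\u043c"

# Unicode on the wire: encoding and decoding give back the same string.
utf16_wire = listed.encode("utf-16-le")
back_utf16 = utf16_wire.decode("utf-16-le")

# An 8-bit code page that lacks both characters: information is lost.
lossy_wire = listed.encode("cp1252", errors="replace")
back_lossy = lossy_wire.decode("cp1252")

print(back_utf16 == listed, lossy_wire, back_lossy == listed)
```

Whatever the server makes of the name internally, a lossless echo lets it match the entry it just listed, whereas the lossy form no longer identifies anything.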

However, this patch will probably break SMB connections with pre-2000 systems 
(I think that they don't talk Unicode on the wire...). Letting the developer 
choose the encoding to use on the wire may be a solution.

Anyway, tell me what you think about this patch. If you have any questions or 
improvements, or if anything in my explanation is unclear, please 
let me know ;)

Original issue reported on code.google.com by renaud.d...@synacktiv.com on 4 Sep 2014 at 11:09

Attachments:

impacket.patch

GoogleCodeExporter commented 9 years ago

Hola mate!

Sorry for the delayed response.. you rock!.. Let me dig into your mail and I'll 
get back to you... Big thanks for taking a look at it (SMBv1 code is ugly).. and 
yes.. it does not support Unicode connections. Time to attack this issue :)

cheers!
beto

Original comment by bet...@gmail.com on 8 Sep 2014 at 1:36

Changed state: Accepted

GoogleCodeExporter commented 9 years ago

Hmm.. still trying to repro this issue based on your repro steps.

I've used a Windows 7 as the target.

1. chcp 866
2. mkdir b

then I forced smbclient.py to connect using SMBv1 (changed smbconnection.py to 
force the SMBConnection.__init__() preferredDialect to SMB_DIALECT).

I connected to the target system.. smbclient.py tells me I'm on SMBv1 
and I see the directory as 'b'.

Mate.. did you take the same steps as myself?.. I might be missing something.

Original comment by bet...@gmail.com on 16 Sep 2014 at 7:27

GoogleCodeExporter commented 9 years ago

Be careful, in the "mkdir" command, it's not a "b" but a "Ь" (the Cyrillic 
character). In your case, it still works because "b" is correctly converted to 
the ASCII char code \x62 (cp866 and the ASCII table are the same for the ASCII 
charset).
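This can be checked quickly with Python's standard codecs: cp866 agrees with ASCII on every byte below 0x80, so a Latin "b" never exercises the problem, while the Cyrillic soft sign does.

```python
# Every 7-bit byte decodes to the same character under cp866 and ASCII.
ascii_bytes = bytes(range(128))
same_in_ascii_range = ascii_bytes.decode("cp866") == ascii_bytes.decode("ascii")

latin_b = "b".encode("cp866")          # b'b', i.e. \x62: identical to ASCII
soft_sign = "\u042c".encode("cp866")   # b'\x9c': outside the ASCII range

print(same_in_ascii_range, latin_b, soft_sign)
```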

To reproduce the bug you have to create a directory or a file whose name 
contains a non-ASCII char. The simplest way is to create the char with Python 
and copy / paste it into your Windows shell:

>>> print u"\u042c"
Ь

Tell me if it fixes your problem ;)

Original comment by renaud.d...@synacktiv.com on 17 Sep 2014 at 6:10

GoogleCodeExporter commented 9 years ago

dumb me.. 

thanks for the clarification.. just managed to create that directory..

First problem I found is that smbclient.py dumps an exception, even when running 
SMB v2.1 against the target.. And that has to do with smbclient.py itself.  
Doesn't that happen to you? .. I'm running smbclient.py from OS X

Original comment by bet...@gmail.com on 17 Sep 2014 at 5:51

GoogleCodeExporter commented 9 years ago

Ok.. just saw your precmd addition into smbclient.py to prevent this problem 
happening with SMB >= v2. I'm first committing that change.

For the rest, I want to carefully look at your changes since it might break 
many things .. It's gonna take some time.

thanks again!
beto

Original comment by bet...@gmail.com on 17 Sep 2014 at 6:28

GoogleCodeExporter commented 9 years ago

It's ok, I probably missed something during my patching madness :) so, I think 
it's a good thing to carefully look at my changes. I don't have an overall view 
of the project and, yeah, it probably breaks something somewhere...

Anyway, if you need some help or if you have any question about my patch, 
please let me know ;)

Thanks to you for taking the time!

Renaud.

Original comment by renaud.d...@synacktiv.com on 18 Sep 2014 at 6:38

GoogleCodeExporter commented 9 years ago

Nooo.. thank you for taking the time to provide a patch (and especially for 
messing with smb.py..)

FYI, as far as I read, SMB_COM_TREE_CONNECT shouldn't use Unicode strings, from 
[MS-CIFS], 2.2.4.50.1:

Flags2 (2 bytes): The SMB_FLAGS2_UNICODE flag bit SHOULD be zero. Servers MUST 
ignore the SMB_FLAGS2_UNICODE flag and interpret strings in this request as 
OEM_STRING strings.<74>

Path (variable): A null-terminated string that represents the server and share 
name of the resource to which the client is attempting to connect. This field 
MUST be encoded using Universal Naming Convention (UNC) syntax. The string MUST 
be a null-terminated array of OEM characters, even if the client and server 
have negotiated to use Unicode strings.

This doesn't apply to SMB_COM_TREE_CONNECT_ANDX.
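A rough sketch of how that exception could be handled when encoding the Path field. The command codes and the flag value are the ones defined by the protocol, but `encode_tree_path` and its `oem_codec` parameter are hypothetical names for illustration, not part of impacket or of the patch.

```python
SMB_COM_TREE_CONNECT = 0x70
SMB_COM_TREE_CONNECT_ANDX = 0x75
FLAGS2_UNICODE = 0x8000


def encode_tree_path(command, unc_path, flags2, oem_codec="cp437"):
    """Hypothetical helper: encode the UNC path of a tree connect request.

    SMB_COM_TREE_CONNECT always gets an OEM string, even when Unicode was
    negotiated; other commands follow the Flags2 Unicode bit.
    """
    use_unicode = bool(flags2 & FLAGS2_UNICODE) and command != SMB_COM_TREE_CONNECT
    if use_unicode:
        return unc_path.encode("utf-16-le") + b"\x00\x00"
    return unc_path.encode(oem_codec) + b"\x00"


share = "\\\\1.2.3.4\\C$"
legacy = encode_tree_path(SMB_COM_TREE_CONNECT, share, FLAGS2_UNICODE)
andx = encode_tree_path(SMB_COM_TREE_CONNECT_ANDX, share, FLAGS2_UNICODE)
print(legacy, andx)
```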

Let me know if you read something different.

thanks again!
beto

Original comment by bet...@gmail.com on 18 Sep 2014 at 12:21

GoogleCodeExporter commented 9 years ago

Oh yes you're right. All my fault :) when I patched smb.py, I read the 
documentation about SMB_COM_TREE_CONNECT_ANDX only and not 
SMB_COM_TREE_CONNECT... In my head, both could use Unicode... sorry about that 
and thanks for the feedback.

Renaud.

Original comment by renaud.d...@synacktiv.com on 19 Sep 2014 at 9:10

pombreda / impacket

SMBv1 stack doesn't talk unicode on the wire #51