Cloudxtreme / volatility

Automatically exported from code.google.com/p/volatility

infinite loops in sockets/connections #244

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago
This is copied from an email and should describe the problem:

* we have a few Win2003 x86 images that put the connections and sockets plugins
into an infinite loop. That's because in network.py we get an array of objects
and then walk a singly-linked list for each object. In these rare cases, instead
of a normal list like A->B->0 (where 0 is the end) it looks like A<->B (the
Object.Next member points the wrong way), and thus we go back and forth forever.
We could probably benefit from a seen=set() and breaking if an object has
already been seen before yielding it.

# This loop can be infinite because we don't check for a self-referencing or
# backward-pointing sock.Next:
while sock.is_valid():
    yield sock
    sock = sock.Next
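
For illustration, here is a tiny self-contained sketch of the suggested seen-set guard on a toy linked list (plain Python, not the real network.py objects):

# Toy illustration of the proposed guard: stop walking a singly-linked list as
# soon as a node is revisited, so a pathological A<->B cycle terminates after
# two yields instead of looping forever.
class Node(object):
    def __init__(self, name):
        self.name = name
        self.Next = None

def walk(node):
    seen = set()
    while node is not None and id(node) not in seen:
        yield node
        seen.add(id(node))
        node = node.Next

a, b = Node("A"), Node("B")
a.Next, b.Next = b, a              # the pathological A<->B case
print([n.name for n in walk(a)])   # ['A', 'B'] rather than an infinite loop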

Original issue reported on code.google.com by michael.hale@gmail.com on 11 Apr 2012 at 3:35

GoogleCodeExporter commented 8 years ago
This issue was closed by revision r1619.

Original comment by mike.auty@gmail.com on 11 Apr 2012 at 4:06

GoogleCodeExporter commented 8 years ago
Hmm after this patch, there are 2593 entries printed:

$ python vol.py -f ~/Desktop/PhysMem.bin --profile=Win2003SP2x86 sockets | wc -l
Volatile Systems Volatility Framework 2.1_alpha
   2593

Strangely, some of the physical offsets are the same in the output, although 
they have different pids, ports, and creation times. Not sure how that's 
possible?

0x0892a008    484      0    255 Reserved       0.0.0.0            2012-03-19 08:18:43
0x0892a008   1436  36256     17 UDP            0.0.0.0            2012-03-19 08:18:41

0x08a55d00   1436  18524     17 UDP            0.0.0.0            2012-03-19 08:18:40
0x08a55d00   1436  62214     17 UDP            0.0.0.0            2012-03-19 08:18:40

0x089b0540   1436  32169     17 UDP            0.0.0.0            2012-03-19 08:18:41
0x089b0540   1436   8782     17 UDP            0.0.0.0            2012-03-19 08:18:40

I worked up a patch that uses sock.obj_offset as the "seen" criterion instead of just sock:

Index: volatility/win32/network.py
===================================================================
--- volatility/win32/network.py        (revision 1619)
+++ volatility/win32/network.py        (working copy)
@@ -181,7 +181,7 @@
                        for entry in table:
                            sock = entry.dereference()
                            seen = set()
-                            while sock.is_valid() and sock not in seen:
+                            while sock.is_valid() and sock.obj_offset not in seen:
                                yield sock
-                                seen.add(sock)
+                                seen.add(sock.obj_offset)
                                sock = sock.Next

With that patch, it prints a much more sane number of entries:

$ python vol.py -f ~/Desktop/PhysMem.bin --profile=Win2003SP2x86 sockets | wc -l
Volatile Systems Volatility Framework 2.1_alpha
    259

We also should do the same thing for connections. 
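
A rough sketch of the equivalent guard on the connections side (assuming that loop mirrors the sockets one):

for entry in table:
    conn = entry.dereference()
    seen = set()
    # Same idea as above: key the guard on the object's offset rather than
    # on the object itself.
    while conn.is_valid() and conn.obj_offset not in seen:
        yield conn
        seen.add(conn.obj_offset)
        conn = conn.Next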

Original comment by michael.hale@gmail.com on 11 Apr 2012 at 4:58

GoogleCodeExporter commented 8 years ago
Ok, r1620 is a second take at trying to fix this. Note: we may have two 
different conn objects with the same offset if they inhabit different VMs 
(address spaces), but since the new objects are always formed from a single 
initial starting object, the VM should never change.
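
(If one ever did want to guard against that, a hypothetical, more defensive variant could key the seen set on both the address space and the offset; sketch only, not what r1620 actually does:)

# Hypothetical variant: identical offsets in different address spaces would
# not collide, at the cost of slightly more bookkeeping.
seen = set()
while conn.is_valid() and (conn.obj_vm, conn.obj_offset) not in seen:
    yield conn
    seen.add((conn.obj_vm, conn.obj_offset))
    conn = conn.Next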

I still need to investigate the differing-data/same-offset lines. I'll have to 
check whether any of my images exhibit that problem, though; otherwise I won't 
have any data to test against...

Original comment by mike.auty@gmail.com on 11 Apr 2012 at 8:12

GoogleCodeExporter commented 8 years ago
Sounds good, thanks Mike. The r1620 patch does prevent the loops in both 
connections and sockets, so that much is OK. 

One thing I wanted to document is that this is quite rare. I looked at the pool 
tag for the _ADDRESS_OBJECT (socket) that was causing an infinite loop and 
found that it is *not* TCPA, which is what you'd expect. That means the object 
is in the tcpip.sys hash bucket of socket objects, *or* it is linked to from 
another object in the hash bucket but is no longer an actual _ADDRESS_OBJECT. 
I guess this can happen if the target machine is thrashing and quickly 
creating/deleting sockets at the time the memory dump is taken. 

So one thing to keep in mind is that we *could* add a constraint to the sockets 
plugin: calculate the location of the _POOL_HEADER given sock.obj_offset, and 
then check whether its tag is TCPA. That would cause sockets to skip the invalid 
entry, but at the same time we'd be opening ourselves up to another non-essential 
constraint, where attackers could change the pool tag of their _ADDRESS_OBJECT 
and it wouldn't be reported (that would probably never happen, but you know 
what I mean). 
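
Purely for reference, a rough sketch of what that pool-tag check could look like against the Volatility 2.x object model (the assumption that the _POOL_HEADER sits immediately before the allocation, and the PoolTag member name, are untested guesses here):

import struct
import volatility.obj as obj

def looks_like_tcpa(sock, addr_space):
    """Sketch only: does the pool allocation containing sock carry the TCPA tag?"""
    hdr_size = addr_space.profile.get_obj_size("_POOL_HEADER")
    pool = obj.Object("_POOL_HEADER", offset = sock.obj_offset - hdr_size,
                      vm = addr_space)
    # The tag is stored as a little-endian integer; 'TCPA' packs to 0x41504354.
    return struct.pack("<I", int(pool.PoolTag)) == "TCPA"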

Mike, if you're OK with closing this, I am too. 

Original comment by michael.hale@gmail.com on 11 Apr 2012 at 11:37

GoogleCodeExporter commented 8 years ago
Yep, happy to close it.  As you said, it's a rare case, and adding in a 
constraint for a rare case that we now handle fairly well doesn't seem all that 
pointful...

Original comment by mike.auty@gmail.com on 11 Apr 2012 at 11:43

GoogleCodeExporter commented 8 years ago
I have to reopen this, because our patch is preventing a lot of valid sockets 
(maybe connections too, I haven't checked yet) from being shown. Here's a 
comparison of volatility 2.0 vs the current branch for sockets:

$ python vol.py -f ~/Desktop/memory/silentbanker.vmem sockets
Volatile Systems Volatility Framework 2.0
 Offset(V)  PID    Port   Proto               Address        Create Time
---------- ------ ------ ------------------- -------------- --------------------------
0x816b0ba0   1876   1274      6 TCP            0.0.0.0            2009-02-18 06:55:23
0x81493e98   1876   1282      6 TCP            0.0.0.0            2009-02-18 06:55:24
0x81493e98   1156   1900     17 UDP            192.168.128.128    2008-12-11 20:51:52
0x814523a8   1876   1273     17 UDP            127.0.0.1          2009-02-18 06:55:20
0x814523a8    740    500     17 UDP            0.0.0.0            2008-09-18 05:33:19
0x814fb158      4    139      6 TCP            192.168.128.128    2008-12-11 20:51:51
0x814fb158   1876   1290      6 TCP            0.0.0.0            2009-02-18 06:55:26
0x8141fbc0   1876   1294      6 TCP            0.0.0.0            2009-02-18 06:55:29
0x82004610      4    445      6 TCP            0.0.0.0            2008-09-18 05:32:51
0x8215e008    972    135      6 TCP            0.0.0.0            2008-09-18 05:32:59
0x814a0af0   1876   1275      6 TCP            0.0.0.0            2009-02-18 06:55:23
0x81468c68   1876   1279      6 TCP            0.0.0.0            2009-02-18 06:55:24
0x81468c68      4    137     17 UDP            192.168.128.128    2008-12-11 20:51:51
0x81477388   1876   1283      6 TCP            0.0.0.0            2009-02-18 06:55:25
0x81540e98   1876   1287      6 TCP            0.0.0.0            2009-02-18 06:55:25
0x81422e98   1876   1295      6 TCP            0.0.0.0            2009-02-18 06:55:29
0x81422e98   1320   1029      6 TCP            127.0.0.1          2008-09-18 05:33:29
0x8142ac38   1876   1299      6 TCP            0.0.0.0            2009-02-18 06:55:30
0x8142ac38   1064    123     17 UDP            127.0.0.1          2008-12-11 20:51:52
0x81709968    740      0    255 Reserved       0.0.0.0            2008-09-18 05:33:19
0x822d9e98   1112   1025     17 UDP            0.0.0.0            2008-09-18 05:33:28
0x81489e98   1876   1280      6 TCP            0.0.0.0            2009-02-18 06:55:24
0x81489e98   1112   1033     17 UDP            0.0.0.0            2008-09-18 05:42:19
0x8195a248   1876   1276      6 TCP            0.0.0.0            2009-02-18 06:55:23
0x81cef968   1876   1284      6 TCP            0.0.0.0            2009-02-18 06:55:25
0x81cef968      4    138     17 UDP            192.168.128.128    2008-12-11 20:51:51
0x8180da28   1876   1288      6 TCP            0.0.0.0            2009-02-18 06:55:25
0x8191d5f8   1112   1115     17 UDP            0.0.0.0            2008-12-11 18:54:24
0x8191d5f8   1064    123     17 UDP            192.168.128.128    2008-12-11 20:51:52
0x81480e98   1876   1277      6 TCP            0.0.0.0            2009-02-18 06:55:24
0x81480e98   1876   1281      6 TCP            0.0.0.0            2009-02-18 06:55:24
0x8148be98   1156   1900     17 UDP            127.0.0.1          2008-12-11 20:51:52
0x814c2a08   1876   1285      6 TCP            0.0.0.0            2009-02-18 06:55:25
0x814c2a08    740   4500     17 UDP            0.0.0.0            2008-09-18 05:33:19
0x821a3878   1876   1289      6 TCP            0.0.0.0            2009-02-18 06:55:25
0x8142b5b8   1876   1293      6 TCP            0.0.0.0            2009-02-18 06:55:28
0x8142b5b8      4    445     17 UDP            0.0.0.0            2008-09-18 05:32:51
0x8141c1d8   1876   1297      6 TCP            0.0.0.0            2009-02-18 06:55:30

$ python vol.py -f ~/Desktop/memory/silentbanker.vmem sockets
Volatile Systems Volatility Framework 2.1_alpha
 Offset(V)  PID    Port   Proto               Address        Create Time
---------- ------ ------ ------------------- -------------- --------------------------
0x816b0ba0   1876   1274      6 TCP            0.0.0.0            2009-02-18 06:55:23
0x81493e98   1876   1282      6 TCP            0.0.0.0            2009-02-18 06:55:24
0x814523a8   1876   1273     17 UDP            127.0.0.1          2009-02-18 06:55:20
0x814fb158      4    139      6 TCP            192.168.128.128    2008-12-11 20:51:51
0x8141fbc0   1876   1294      6 TCP            0.0.0.0            2009-02-18 06:55:29
0x82004610      4    445      6 TCP            0.0.0.0            2008-09-18 05:32:51
0x8215e008    972    135      6 TCP            0.0.0.0            2008-09-18 05:32:59
0x814a0af0   1876   1275      6 TCP            0.0.0.0            2009-02-18 06:55:23
0x81468c68   1876   1279      6 TCP            0.0.0.0            2009-02-18 06:55:24
0x81540e98   1876   1287      6 TCP            0.0.0.0            2009-02-18 06:55:25
0x81422e98   1876   1295      6 TCP            0.0.0.0            2009-02-18 06:55:29
0x8142ac38   1876   1299      6 TCP            0.0.0.0            2009-02-18 06:55:30
0x822d9e98   1112   1025     17 UDP            0.0.0.0            2008-09-18 05:33:28
0x81489e98   1876   1280      6 TCP            0.0.0.0            2009-02-18 06:55:24
0x81cef968   1876   1284      6 TCP            0.0.0.0            2009-02-18 06:55:25
0x8180da28   1876   1288      6 TCP            0.0.0.0            2009-02-18 06:55:25
0x8191d5f8   1112   1115     17 UDP            0.0.0.0            2008-12-11 18:54:24
0x81480e98   1876   1277      6 TCP            0.0.0.0            2009-02-18 06:55:24
0x814c2a08   1876   1285      6 TCP            0.0.0.0            2009-02-18 06:55:25
0x821a3878   1876   1289      6 TCP            0.0.0.0            2009-02-18 06:55:25
0x8142b5b8   1876   1293      6 TCP            0.0.0.0            2009-02-18 06:55:28
0x8141c1d8   1876   1297      6 TCP            0.0.0.0            2009-02-18 06:55:30

We might consider reverting our patch in r1620 until we can figure out a better 
way to prevent infinite loops without short-circuiting early. 

Original comment by michael.hale@gmail.com on 16 May 2012 at 1:39

GoogleCodeExporter commented 8 years ago

Original comment by michael.hale@gmail.com on 16 May 2012 at 1:46

GoogleCodeExporter commented 8 years ago
Hi Michael,
I think I also came across a similar issue, but I found out that the duplicates
are actually there because the scanner scans the virtual address space, and many
of these pages are mapped at several virtual addresses (but share the same
physical address). Can you add an additional column to the plugin to show the
physical address and check whether these are all just duplicates?
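
(For reference, a rough sketch of how such a column could be derived per entry in the Volatility 2.x object model; treating obj_vm.vtop() as the right translation call is an assumption here:)

# Sketch only: show each socket's virtual offset next to its physical one.
# vtop() may return None when the page is not mapped in the address space.
phys = sock.obj_vm.vtop(sock.obj_offset)
print("{0:#010x} -> physical {1}".format(
    sock.obj_offset, "{0:#010x}".format(phys) if phys is not None else "unmapped"))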

Original comment by scude...@gmail.com on 16 May 2012 at 1:51

GoogleCodeExporter commented 8 years ago
Hey Scudette. Hmm, I'm a little confused... we're talking about sockets, not 
sockscan, so there's no scanning involved. If you're talking about sockscan, 
that scans physical space. Also, there's already a --physical option to the 
sockets command, and that doesn't seem to be the issue. In comment #4 above I 
explained what was happening with this particular Win2003 x86 sample (one of 
the hash buckets or singly-linked lists points to an object that isn't really 
a TCPA). Do you have any details on your similar issue? Maybe we could compare 
and see. 

Original comment by michael.hale@gmail.com on 16 May 2012 at 2:09

GoogleCodeExporter commented 8 years ago

Original comment by michael.hale@gmail.com on 16 May 2012 at 2:14

GoogleCodeExporter commented 8 years ago
Ah, I'm sorry Michael, I must have got confused with sockscan.

The interesting thing in the volatility 2.0 dump you showed is that the 
duplicate lines do in fact have the same address, yet they are printing 
different content each time.

Closer examination of the code here

http://code.google.com/p/volatility/source/browse/trunk/volatility/win32/network.py#184

shows that in vol 2.0 we were using all of the tcpip.sys version tables 
indiscriminately (well, provided the size worked out to be > 0, which is pretty 
likely for random data). Hence the old code would print mostly rubbish in 
addition to the correct data.

The new code only prints the first random entry it finds which might well be 
rubbish as well because it was parsed using the wrong table.

As Gleeda pointed out elsewhere, tcpip.sys versions are not that specific to the 
OS version. So I think we need to tailor these tables to the tcpip.sys DLL 
version or GUID.

Original comment by scude...@gmail.com on 16 May 2012 at 2:41

GoogleCodeExporter commented 8 years ago
No problem! 

> The interesting thing in the volatility 2.0 dump you showed is that the duplicate
> lines do in fact have the same address, yet they are printing different content each time.

Yes, but that's due to a completely different reason. See Issue #258. It's 
because we're not dereferencing the sock.Next or conn.Next pointers. I'm about 
to fix that separate issue. 

True, I think we need to get rid of the hard-coded offsets into tcpip.sys 
altogether, but once again that's a different issue. In the case of this 
Win2003 x86 image the right offsets are used and the correct table is found - 
yet we still have an infinite loop. So even if we tailor the tables to the 
tcpip.sys version or GUID, the same infinite loop is still going to occur. 
I've long wanted to get rid of the hard-coded offsets, trust me ;-)

Original comment by michael.hale@gmail.com on 16 May 2012 at 2:46

GoogleCodeExporter commented 8 years ago
I am still not sure I understand this. Reading the code, we select a set of 
table definitions based on the profile (module_versions). Then for each of 
those we dereference the connection table according to the hard-coded address 
(table_addr, from a hard-coded TCBTableOff), and then we parse each member of 
those tables as the TCB table and yield each address as a valid connection.

There does not seem to be very much sanity checking that we are actually 
finding the right table, other than the size being > 0? If the table is 
incorrect we will be reading the wrong offsets for the TCB array and 
dereferencing the wrong thing?

In addition, at different iterations we can use different tables on the same 
offsets, returning rubbish offsets. Maybe this is why we get an infinite loop?
Original comment by scude...@gmail.com on 16 May 2012 at 3:05

GoogleCodeExporter commented 8 years ago
Ah, bling! I think the fix for Issue #258 is going to fix this too. Just to 
clarify, the Win2003 x86 infinite loop problem was properly fixed in r1620, and 
I no longer think we need to revert that patch. I thought there was a problem 
with it because my XPSP2 x86 image was missing sockets, but that's just because 
we're not dereferencing sock.Next, which leads to the objects appearing to be 
duplicates, so the command short-circuits thinking it has found an infinite 
loop. The objects appear to be duplicates because the Next member is at offset 
0 of _ADDRESS_OBJECT, thus sock.obj_offset and sock.Next.obj_offset are the 
same (but sock.obj_offset != sock.Next.dereference().obj_offset).
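
To make that concrete, here is a tiny toy model of the behaviour described above (plain Python, not the real Volatility object classes):

# A pointer member at offset 0 of a struct reports the struct's own offset,
# while dereferencing it reports the offset of the object it points to.
class Obj(object):
    def __init__(self, offset):
        self.obj_offset = offset

class Pointer(Obj):
    def __init__(self, offset, target):
        Obj.__init__(self, offset)
        self._target = target
    def dereference(self):
        return self._target

sock_b = Obj(0x2000)
sock_a = Obj(0x1000)
sock_a.Next = Pointer(0x1000, sock_b)   # Next lives at offset 0 of sock_a

assert sock_a.Next.obj_offset == sock_a.obj_offset                 # looks like a duplicate
assert sock_a.Next.dereference().obj_offset != sock_a.obj_offset   # the real next socket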

So I think the single patch to kill Issue #244 and Issue #258 is this:

$ svn diff
Index: volatility/win32/network.py
===================================================================
--- volatility/win32/network.py (revision 1707)
+++ volatility/win32/network.py (working copy)
@@ -168,7 +168,7 @@
                             while conn.is_valid() and conn.obj_offset not in seen:
                                 yield conn
                                 seen.add(conn.obj_offset)
-                                conn = conn.Next
+                                conn = conn.Next.dereference()

 def determine_sockets(addr_space):
     """Determines all sockets for each module"""
@@ -207,4 +207,4 @@
                             while sock.is_valid() and sock.obj_offset not in seen:
                                 yield sock
                                 seen.add(sock.obj_offset)
-                                sock = sock.Next
+                                sock = sock.Next.dereference()

Original comment by michael.hale@gmail.com on 16 May 2012 at 3:07

GoogleCodeExporter commented 8 years ago
> There does not seem to be very much sanity checking that we are actually finding
> the right table other than the size > 0?

Your understanding is correct. Well, table_size has to be > 0, table_addr has 
to be a valid address, and then each entry pointer is dereferenced and those 
must be valid as well. So it's a little more than just table_size > 0, but like 
I said before, you're absolutely right - these plugins are the only ones in our 
entire framework based on hard-coded RVAs per module. 

That said, I'm not aware of any memory dumps where the sockets or connections 
plugins choose the wrong table or print additional bogus information due to the 
RVAs being hard-coded. So while we could always benefit from a more reliable 
way of finding the table, the current implementation has not proven to be 
unreliable. 

> In addition at different iterations we can use different tables on the same offsets
> returning rubbish offsets. Maybe this is why we get an infinite loop?

In comment #4 above I explained what was happening with this particular Win2003 
x86 sample (one of the hash buckets or singly-linked lists points to an object 
that isn't really a TCPA). That is the reason we get an infinite loop for that 
sample. It's the only case of an infinite loop we've seen, with the exception 
of the similar issue you mentioned in comment #9. Were you able to get any 
details about that? 

Original comment by michael.hale@gmail.com on 16 May 2012 at 3:46

GoogleCodeExporter commented 8 years ago
OK guys, here's my plan: 

I'm going to apply the patch to change sock.Next to sock.Next.dereference(). 
The following is going to happen: 

* It's going to fix the missing XPSP2 sockets reported in comment #7
* It's going to fix the duplicate offsets with different info reported in Issue #258 and discussed by scudette and me in comments #12 and #13
* The status of the infinite loop stuff is not going to change; the patch in r1620 is still OK

I'll open a separate issue for us to discuss removing the hard-coded offsets or 
figuring out some other sanity checks to apply. 

Original comment by michael.hale@gmail.com on 16 May 2012 at 4:36

GoogleCodeExporter commented 8 years ago
This issue was closed by revision r1710.

Original comment by michael.hale@gmail.com on 16 May 2012 at 4:37

GoogleCodeExporter commented 8 years ago
This issue was closed by revision r1722.

Original comment by mike.auty@gmail.com on 18 May 2012 at 10:15