limifly / pypcap

Exported from
http://code.google.com/p/pypcap

readpkts returns packets whose data all occupy the same memory address #26

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. Prepare a capture file that contains more than one packet.
2. p=pcap.pcap(file);pkts=p.readpkts();print(pkts)
3. You will find that every packet's buffer ptr points to the same address.

What is the expected output? What do you see instead?
Expected every packet buffer to be distinct, but got this result:
[(1276937965.531224, <read-only buffer ptr 0x00F7004A, size 81 at 0x00C94E20>),
 (1276937972.8316729, <read-only buffer ptr 0x00F7004A, size 124 at 0x00C94F20>),
 (1276937973.2153289, <read-only buffer ptr 0x00F7004A, size 122 at 0x00C94F40>),
 (1276937973.2716949, <read-only buffer ptr 0x00F7004A, size 118 at 0x00C94F60>),
 (1276937973.3154581, <read-only buffer ptr 0x00F7004A, size 155 at 0x00C94F80>),
 (1276937973.3189869, <read-only buffer ptr 0x00F7004A, size 163 at 0x00C94FA0>),
 (1276937973.331707, <read-only buffer ptr 0x00F7004A, size 143 at 0x00C94FC0>),
 (1276937973.333725, <read-only buffer ptr 0x00F7004A, size 113 at 0x00C94FE0>),
 (1276937973.3350179, <read-only buffer ptr 0x00F7004A, size 120 at 0x00C99020>),
 (1276937973.338382, <read-only buffer ptr 0x00F7004A, size 123 at 0x00C99040>)]

What version of the product are you using? On what operating system?
Python 2.6 and the latest pypcap

Please provide any additional information below.

Original issue reported on code.google.com by xueyaos...@gmail.com on 21 Jun 2010 at 3:31

GoogleCodeExporter commented 9 years ago
I read the pyx file and found that if we take a copy of the pkt, the issue
may go away. dispatch always returns the same address, so we should copy
the data immediately; otherwise the next pkt will overwrite the previous one.

    def __add_pkts(self, ts, pkt, pkts):
        #should we make a copy of pkt?
        pkts.append((ts, pkt))

    def readpkts(self):
        """Return a list of (timestamp, packet) tuples received in one buffer."""
        pkts = []
        self.dispatch(-1, self.__add_pkts, pkts)
        return pkts
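
A minimal sketch of the aliasing described above and of the copy that avoids it. It does not use pypcap: fake_dispatch is a hypothetical stand-in that reuses one buffer for every packet, which is an assumption about how the capture buffer behaves, and the two callbacks are illustrative names. Python 3 syntax.

```python
def fake_dispatch(packets, callback, *args):
    """Hypothetical stand-in for dispatch: ONE buffer is reused for all packets."""
    shared = bytearray(max(len(p) for p in packets))
    for ts, data in enumerate(packets):
        shared[:len(data)] = data              # overwrite in place, same address
        view = memoryview(shared)[:len(data)]  # borrows shared, no copy
        callback(float(ts), view, *args)


def add_pkts_no_copy(ts, pkt, pkts):
    # Stores the borrowed view: later packets overwrite what it shows.
    pkts.append((ts, pkt))


def add_pkts_copy(ts, pkt, pkts):
    # bytes(pkt) copies the data before the buffer is reused.
    pkts.append((ts, bytes(pkt)))


packets = [b"AAAA", b"BBBBBB"]
aliased, copied = [], []
fake_dispatch(packets, add_pkts_no_copy, aliased)
fake_dispatch(packets, add_pkts_copy, copied)

print([bytes(p) for _, p in aliased])  # first entry no longer holds b"AAAA"
print([p for _, p in copied])
```

In this sketch only the copying callback keeps the first packet's original contents.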

Original comment by xueyaos...@gmail.com on 21 Jun 2010 at 3:35

GoogleCodeExporter commented 9 years ago

Original comment by getxs...@gmail.com on 16 Jul 2010 at 7:32

GoogleCodeExporter commented 9 years ago
Apparently, the problem is caused by incorrect usage of the
PyBuffer_FromMemory function. The documentation reads:

=doc
PyObject* PyBuffer_FromMemory(void *ptr, Py_ssize_t size)

Return a new read-only buffer object that reads from a specified location in 
memory, with a specified size. The caller is responsible for ensuring that the 
memory buffer, passed in as ptr, is not deallocated while the returned buffer 
object exists.
=cut

The created buffer uses the memory area pointed to by 'ptr' directly - however,
this memory belongs to libpcap and we have no guarantee that it won't get
overwritten, deallocated, etc.

A simple and efficient approach is to use PyString_FromStringAndSize
instead. This function makes its own copy of the memory pointed to by 'ptr' and
manages that copy, freeing the memory when the string object is destroyed.
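
The difference between borrowing memory and copying it can be illustrated from pure Python with ctypes. This is only an analogy, not the patch itself: the ctypes buffer stands in for memory owned by libpcap, from_address plays the role of a borrowing buffer object, and ctypes.string_at plays the role of a copying constructor. The variable names are illustrative. Python 3 syntax.

```python
import ctypes

# Stand-in for memory owned by another library (an assumption for this sketch).
buf = ctypes.create_string_buffer(b"packet-1")
size = 8
addr = ctypes.addressof(buf)

# Borrowing: an array object that reads directly from addr, no copy is made.
borrowed = (ctypes.c_char * size).from_address(addr)

# Copying: string_at builds a new bytes object from size bytes at addr.
owned = ctypes.string_at(addr, size)

# The owner reuses its memory for the next packet.
ctypes.memmove(addr, b"packet-2", size)

print(borrowed.raw)  # reflects the overwritten memory
print(owned)         # still the original bytes
```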

This patch changes the behavior of the library in that instead of returning
'buffer' objects, it returns 'str' objects. However, I don't see this as a
problem; my rationale is that:

1. Strings implement buffer interface, so they should be compatible.
2. Buffer interfaces are deprecated (replaced by memoryview in Python 3.0).
3. We are not using any special features of the buffer interface.
4. It's obvious for a user how to handle a string, while using buffers requires 
at least a little glimpse at the examples or documentation to figure out how 
they work.

I've also attached a test pcap file (ARP request+reply) and script to test the 
behavior. Without the patch, the program works as follows:

$ ./buf.py
[(1279381079.55707, <read-only buffer ptr 0x97d6c7a, ...),
 (1279381079.577106, <read-only buffer ptr 0x97d6c7a, ...)]

whereas applying the patch yields:

$ ./buf.py
[(1279381235.636596,  '\xff\xff\xff\xff\xff\xff'...),
 (1279381235.6368649, '\x00\x11\t\x81\xec\xdc'...)]

(Note: output truncated for clarity.)

Original comment by ko...@monitech.pl on 17 Jul 2010 at 3:48


rgom commented 7 years ago

Just spotted exactly this issue: I iterated over pcap.pcap while copying each packet, which in effect copied the same reference several times. I'm not sure if this is still applicable, but I'd rather not use a string interface for binary data. I'd rather use bytes (or possibly bytearray) there. That would be cleaner and more consistent with Python 3 behaviour. Is there a corresponding API for that?

Robert
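
Until the library itself copies, a caller-side workaround might look like the sketch below. iter_copied and fake_capture are hypothetical names: fake_capture only imitates an iterable that yields (timestamp, packet) pairs backed by one reused buffer, and nothing here is taken from pypcap. Python 3 syntax; bytes(pkt) produces an immutable copy of the packet data.

```python
def iter_copied(source):
    """Yield (ts, bytes) pairs, copying each packet as soon as it is produced."""
    for ts, pkt in source:
        yield ts, bytes(pkt)


def fake_capture(packets):
    """Hypothetical capture source that reuses a single buffer for every packet."""
    shared = bytearray(max(len(p) for p in packets))
    for ts, data in enumerate(packets):
        shared[:len(data)] = data
        yield float(ts), memoryview(shared)[:len(data)]


raw = [b"\xff\xff\xff", b"\x00\x11\x09"]
safe = list(iter_copied(fake_capture(raw)))
unsafe = [(ts, pkt) for ts, pkt in fake_capture(raw)]

print([p for _, p in safe])           # each packet keeps its own bytes
print([bytes(p) for _, p in unsafe])  # every view shows the last packet
```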

rgom commented 7 years ago

OK, it seems that for Python 2 this is a valid approach. The documentation says:

"Note

These functions have been renamed to PyBytes_* in Python 3.x. Unless otherwise noted, the PyBytes functions available in 3.x are aliased to their PyString_* equivalents to help porting."