dalibo / pgshark

Messing with PostgreSQL network traffic to make some usefull things
http://dalibo.github.io/pgshark
Other
85 stars 11 forks source link

pgs-replay won't parse pcap file #1

Open mrimbault opened 11 years ago

mrimbault commented 11 years ago

Hello !

I want to replay messages on a local PostgreSQL cluster from a previously generated pcap file, but it seems that the file is not parsed at all.

The pcap file is generated with tcpdump, capturing traffic from a local 9.1 PostgreSQL cluster The command used to generate the pcap file :

tcpdump -i lo port 5433 -w /home/postgres_data/tcpdump

Then I generate some writes with pgbench, and interrupt tcpdump when it's done :

pgbench -h 127.0.0.1 -p 5433 -i -F 100 -s 2 pgbench

At this point I want to replay the messages on a 9.2 local cluster. Thus, the command used is :

pgs-replay --rhost 127.0.0.1 --rport 5437 --role postgres --read /home/postgres_data/tcpdump

pg_hba acces mode is "trust" for localhost. The pgbench database exists on both clusters.

No data is inserted on the 9.2 cluster, and nothing appears on the PostgreSQL logs either (log_min_duration_statement set to 0).

Doing some perl debugging, I can see that everything seems going well until the call for "process_all" subroutine. At this point, I suppose that a call should be done to the "Replay" subroutine, but it doesn't happen.

The end of the debug trace :

339:    $shark->process_all();
pgShark::process_all(/usr/local/share/perl/5.10.1/pgShark.pm:355):
355:        my $self = shift;
pgShark::process_all(/usr/local/share/perl/5.10.1/pgShark.pm:360):
360:        dprint 2, "PCAP: start to loop over captured packet." if DEBUG;
pgShark::process_all(/usr/local/share/perl/5.10.1/pgShark.pm:363):
363:        Net::Pcap::pcap_loop( $pcaps{ $self->{'id'} },
364:            -1, \&process_packet, $self )
365:            if exists $pcaps{ $self->{'id'} };
main::(/usr/local/bin/pgs-replay:341):
341:    $shark->close();
pgShark::close(/usr/local/share/perl/5.10.1/pgShark.pm:307):
307:        my $self = shift;
pgShark::close(/usr/local/share/perl/5.10.1/pgShark.pm:309):
309:        Net::Pcap::pcap_close( $pcaps{ $self->{'id'} } )
310:            if exists $pcaps{ $self->{'id'} };
pgShark::close(/usr/local/share/perl/5.10.1/pgShark.pm:312):
312:        dprint 2, "PCAP: pcap stream %u closed.", $self->{'id'} if DEBUG;
pgShark::close(/usr/local/share/perl/5.10.1/pgShark.pm:314):
314:        delete $pcaps{ $self->{'id'} };
main::(/usr/local/bin/pgs-replay:343):
343:    exit 0;
pgShark::DESTROY(/usr/local/share/perl/5.10.1/pgShark.pm:847):
847:        my $self = shift;
pgShark::DESTROY(/usr/local/share/perl/5.10.1/pgShark.pm:849):
849:        if ( exists $pcaps{ $self->{'id'} } ) {
pgShark::DESTROY(/usr/local/share/perl/5.10.1/pgShark.pm:853):
853:        dprint 1, "pgShark: Total number of messages processed: %d.",
854:            $self->{'msg_count'}
855:            if DEBUG;

I also checked for the number of messages processed :

  DB<3> s
pgShark::DESTROY(/usr/local/share/perl/5.10.1/pgShark.pm:853):
853:        dprint 1, "pgShark: Total number of messages processed: %d.",
854:            $self->{'msg_count'}
855:            if DEBUG;

 DB<3> p $self->{'msg_count'}
0

I suppose I may be doing something wrong, but I can't figure what it is. I can upload the whole debug trace if that can be of any help, or any configuration file you may need.

Thanks !

Mael

ioguix commented 11 years ago

Hi,

You might be able to see debug messages if you start pgs-repay with "-d" (multiple 'd' to increase debug messages):

DEBUG=1 pgs-replay -d --rhost 127.0.0.1 --rport 5437 --role postgres --read /home/postgres_data/tcpdump

I think pgShark core filters out everything because it look for messages from the default host:port which is localhost:5432. In debug level 1 or 2, you might be able to see the pcap filter used.

Try with:

DEBUG=1 pgs-replay -d --rhost 127.0.0.1 --rport 5437 --role postgres --read /home/postgres_data/tcpdump --host 127.0.0.1 --port 5433

Cheers,

mrimbault commented 11 years ago

Hi,

Thanks for the quick answer !

I had no luck with the debug mode of pgs-replay, it just doesn't print anything on stdout. Thus I checked for the port filter from the perl debugger, and found that it indeed defaults to 5432:

pgShark::_setFilter(/usr/local/share/perl/5.10.1/pgShark.pm:231):
231:        my $filter   = sprintf PCAP_FILTER_TEMPLATE, $self->{'port'};
  DB<6> s
pgShark::_setFilter(/usr/local/share/perl/5.10.1/pgShark.pm:233):
233:        dprint 2, "PCAP: set filter to: %s", $filter if DEBUG;
  DB<6> p $filter
(tcp and port 5432) and ( (((ip[2:2] - ((ip[0]&0xf)<<2)) - ((tcp[12]&0xf0)>>2)) != 0) or (tcp[tcpflags] & (tcp-fin|tcp-rst) != 0) )

With explicit host and port, I now do have the correct port:

pgShark::_setFilter(/usr/local/share/perl/5.10.1/pgShark.pm:231):
231:        my $filter   = sprintf PCAP_FILTER_TEMPLATE, $self->{'port'};
  DB<2> s
pgShark::_setFilter(/usr/local/share/perl/5.10.1/pgShark.pm:233):
233:        dprint 2, "PCAP: set filter to: %s", $filter if DEBUG;
  DB<2> p $filter
(tcp and port 5433) and ( (((ip[2:2] - ((ip[0]&0xf)<<2)) - ((tcp[12]&0xf0)>>2)) != 0) or (tcp[tcpflags] & (tcp-fin|tcp-rst) != 0) )

I tested that filter on the pcap file with tcpdump -r , and it seems to be catching the messages now, so it was a good call.

Sadly, the messages are still not being replayed against the cluster, even with the correct filter, so there may be another problem.

Mael

mrimbault commented 11 years ago

Hi !

It seems that my current problem is due to the SSL encryption of the network messages. When I use pgs-debug, I have the following output:

/usr/local/bin/pgs-debug --read /home/postgres_data/tcpdump-2.out --host 127.0.0.1 --port 5433
Packet: t=1360833372.702776, session=213070643355785
PGSQL: type=SSLRequest, F -> B
SSL REQUEST

Is there any built-in workaround, like a way to provide pgShark with the ssl key ?

Mael

ioguix commented 11 years ago

Sorry, I hadn't time so far to answer here and make a demo...

SSLRequest means the frontend ask the backend if it accepts to use SSL. Please, make sure you have the latest available snapshot of pgShark from the master branch on github, and check the backend answer (Y == yes, N == no).

If the connection is encrypted, pgShark is not able to decrypt it presently and ignore all the SSL encrypted TCP dialog. Debug messages show this explicitly.

I'm still wondering why you cannot make debug messages work :/

mrimbault commented 11 years ago

Well, the debug mode seems to be working properly as soon as pgs-replay finds some messages to be processed.

For example, when I read from a non SSL pcap file, I have the following output:

postgres@mrimbault - PG 9.2 master ~
-> DEBUG=6 /usr/local/bin/pgs-replay -ddd --rhost 127.0.0.1 --rport 5437 --role postgres --read /home/postgres_data/tcpdump-3.out --host 127.0.0.1 --port 5433
StartupMessage: session 213070643333582. at /usr/local/bin/pgs-replay line 147.
openSocket: session 213070643333582 at /usr/local/bin/pgs-replay line 158.
  opening inet socket : «tcp://127.0.0.1:5437» at /usr/local/bin/pgs-replay line 173.
  socket for session 213070643333582. at /usr/local/bin/pgs-replay line 189.
session 213070643333582
readFromBackend: on session 213070643333582. at /usr/local/bin/pgs-replay line 51.
  324 bytes readed. at /usr/local/bin/pgs-replay line 72.
Seen a AuthenticationOk as answer to 213070643333582. at /usr/local/bin/pgs-replay line 105.
Seen a ParameterStatus as answer to 213070643333582. at /usr/local/bin/pgs-replay line 105.
Seen a ParameterStatus as answer to 213070643333582. at /usr/local/bin/pgs-replay line 105.
Seen a ParameterStatus as answer to 213070643333582. at /usr/local/bin/pgs-replay line 105.
Seen a ParameterStatus as answer to 213070643333582. at /usr/local/bin/pgs-replay line 105.
Seen a ParameterStatus as answer to 213070643333582. at /usr/local/bin/pgs-replay line 105.
Seen a ParameterStatus as answer to 213070643333582. at /usr/local/bin/pgs-replay line 105.
Seen a ParameterStatus as answer to 213070643333582. at /usr/local/bin/pgs-replay line 105.
Seen a ParameterStatus as answer to 213070643333582. at /usr/local/bin/pgs-replay line 105.
Seen a ParameterStatus as answer to 213070643333582. at /usr/local/bin/pgs-replay line 105.
Seen a ParameterStatus as answer to 213070643333582. at /usr/local/bin/pgs-replay line 105.
Seen a ParameterStatus as answer to 213070643333582. at /usr/local/bin/pgs-replay line 105.
Seen a BackendKeyData as answer to 213070643333582. at /usr/local/bin/pgs-replay line 105.
Seen a ReadyForQuery as answer to 213070643333582. at /usr/local/bin/pgs-replay line 105.
Replay: replaying a 'Query' for session 213070643333582. at /usr/local/bin/pgs-replay line 222.
Terminate: session 213070643333582. at /usr/local/bin/pgs-replay line 249.

And in this case, the replay works as well. Thus my issue was due to:

Now I just have to figure how to deal with the SSL encrypted pcap file. Thanks for your replies !

Mael

ioguix commented 11 years ago

Mh, weird, these debug messages comes from pgs-replay, I was expecting debug messages from pgShark module itself. I really need to have a closer look on pgs-replay, but I'm pretty sure it was in a good shape with my latest patch. Make sure to update your pgshark sources from github.

Sadly, I hadn't time to mess with SSL decryption and I guess it will kill performance a lot, forbidding any "live" processing for now. I am more concerned about capture/processing performances presently to loose as few packets as possible (thus, my pending patch on Net::Pcap).

cheers,

mrimbault commented 11 years ago

Weird indeed, as it seems that I have the latest available version:

$ git log -1 --pretty="format:%h %cd"
b56de7a Wed Jan 30 12:39:16 2013 +0100

Mael

ioguix commented 11 years ago

I'll try to find some time to make a demo.