sipcapture / heplify-server

HEP Capture Server for HOMER
https://sipcapture.org
GNU Affero General Public License v3.0
183 stars 85 forks source link

Tables partitioned incorrectly #28

Closed nevian427 closed 6 years ago

nevian427 commented 6 years ago

Hi,

Table rotations done by heplify-server sets incorrect time(local timezone?), e.g. p20180404_pnr0 1522789200 (Tue, 03 Apr 2018 21:00:00 GMT) must be 1522886400 (Thu, 05 Apr 2018 00:00:00 GMT)

And on day 2018-04-04 it can't write as driver error (1526): Table has no partition for value 1522886218

negbie commented 6 years ago

Hi, will look into it.

negbie commented 6 years ago

Changed to UTC for rotation https://github.com/sipcapture/heplify-server/commit/dc607a0f6dd63367621f19a13c8452265381da9b

I uploaded a new binary under releases. Could you please test it with -loglvl debug

nevian427 commented 6 years ago

Hi, Nothing changed p20180412_pnr0 1523480400

negbie commented 6 years ago

Please give -loglvl debug output from at least one whole table partition

nevian427 commented 6 years ago

2018-04-10T12:51:43+03:00 DBG [rotator] db query: CREATE TABLE IF NOT EXISTS logs_capture_all_20180410 ( id bigint(20) unsigned NOT NULL AUTO_INCREMENT, date timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP, micro_ts bigint(18) NOT NULL DEFAULT '0', correlation_id varchar(256) NOT NULL DEFAULT '', source_ip varchar(60) NOT NULL DEFAULT '', source_port int(10) NOT NULL DEFAULT 0, destination_ip varchar(60) NOT NULL DEFAULT '', destination_port int(10) NOT NULL DEFAULT 0, proto int(5) NOT NULL DEFAULT 0, family int(1) DEFAULT NULL, type int(5) NOT NULL DEFAULT 0, node varchar(125) NOT NULL DEFAULT '', msg varchar(1500) NOT NULL DEFAULT '', PRIMARY KEY (id,date), KEY date (date), KEY correlationid (correlation_id(255)) ) ENGINE=InnoDB DEFAULT CHARSET=utf8 ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=8 PARTITION BY RANGE ( UNIX_TIMESTAMP(date) ) ( PARTITION p20180410_pnr0 VALUES LESS THAN ( UNIX_TIMESTAMP('2018-04-10 00:00:00') ) );

2018-04-10T12:51:43+03:00 DBG [rotator] db query: ALTER TABLE logs_capture_all_20180410 ADD PARTITION (PARTITION p20180410_pnr0 VALUES LESS THAN ( UNIX_TIMESTAMP('2018-04-10 00:00:00') ));

negbie commented 6 years ago

Well from this output I see nothing strange. The table name suffix is from today 20180410 The first partition is for this day at midnight PARTITION p20180410_pnr0 VALUES LESS THAN ( UNIX_TIMESTAMP('2018-04-10 00:00:00') )

So far all good. So I need more lines!

nevian427 commented 6 years ago

IMO there should be next day, isn't it? And why result again timezone affected? Which lines do you need?

negbie commented 6 years ago

If you look more closely you will find the next day too. Again you provide no debug logs where I see something strange in the timezone.

nevian427 commented 6 years ago

I mean for day 10 apr we should have PARTITION p20180410_pnr0 VALUES LESS THAN ( UNIX_TIMESTAMP('2018-04-11 00:00:00') )

Old good homer_mysql_rotate.pl created like this

create data table: CREATE TABLE IF NOT EXISTS webrtc_capture_all_20180410 ( id bigint(20) unsigned NOT NULL AUTO_INCREMENT, date timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
micro_ts bigint(18) NOT NULL DEFAULT '0', method varchar(100) NOT NULL DEFAULT '', caller varchar(250) NOT NULL DEFAULT '', callee varchar(250) NOT NULL DEFAULT '', session_id varchar(256) NOT NULL DEFAULT '', correlation_id varchar(256) NOT NULL DEFAULT '', source_ip varchar(60) NOT NULL DEFAULT '', source_port int(10) NOT NULL DEFAULT 0, destination_ip varchar(60) NOT NULL DEFAULT '', destination_port int(10) NOT NULL DEFAULT 0, proto int(5) NOT NULL DEFAULT 0, family int(1) DEFAULT NULL, type int(5) NOT NULL DEFAULT 0, node varchar(125) NOT NULL DEFAULT '', msg varchar(3000) NOT NULL DEFAULT '',
PRIMARY KEY (id,date), KEY date (date), KEY sessionid (session_id), KEY correlationid (correlation_id) ) ENGINE=InnoDB DEFAULT CHARSET=utf8 ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=8 COMMENT='20180410' /!50100 PARTITION BY RANGE ( UNIX_TIMESTAMP(date)) (PARTITION p2018041000 VALUES LESS THAN (1523404800), PARTITION pmax VALUES LESS THAN MAXVALUE ENGINE = InnoDB) / ;

negbie commented 6 years ago

When you start heplify-server with enabled rotation it will call once createTables() https://github.com/sipcapture/heplify-server/blob/master/database/rotator.go#L216

So for example when you start heplify-server today(20180410) createTables() will create 20180410 20180411

When createTables() has finished a cronjob will be created which will execute every night at 03:15 r.CreateDataTables(1) r.CreateDataTables(2)

So on 20180411 at 03:15 a table 20180412 and 20180413 will be created.

nevian427 commented 6 years ago

I see.. Partition should end on NEXT midnight so for 10 apr we have to break at 00:00 11 apr

negbie commented 6 years ago

When you look more closely you will see I create PARTITION pmax too so this will handle it.

games130 commented 6 years ago

The rotator seems to create the partition value late by 1 hour. p20180420_pnr0 should be recording time less than 0100h p20180420_pnr1 should be recording time less than 0200h

but from the created partition, it is doing p20180420_pnr0 should be recording time less than 0000h p20180420_pnr1 should be recording time less than 0100h

Also, Just wondering this partition is created for homer5 or homer7? From what i understand homer5 store everything based on UTC time.. but the rotation is stored based on my server time which is GMT+8.

So the sql command send by HOMER5 would have problem finding some of the calls during specific time. like 19-04-2018 0AM-2AM GMT+8 as it would be trying to find in the file sip_capture_call_20180418 because of the UTC date.

This is the partition created by the rotator:

) ENGINE=InnoDB DEFAULT CHARSET=utf8 ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=8 /*!50100 PARTITION BY RANGE ( UNIX_TIMESTAMP(date)) (PARTITION p20180420_pnr0 VALUES LESS THAN (1524153600) ENGINE = InnoDB, PARTITION p20180420_pnr1 VALUES LESS THAN (1524157200) ENGINE = InnoDB, PARTITION p20180420_pnr2 VALUES LESS THAN (1524160800) ENGINE = InnoDB, PARTITION p20180420_pnr3 VALUES LESS THAN (1524164400) ENGINE = InnoDB, PARTITION p20180420_pnr4 VALUES LESS THAN (1524168000) ENGINE = InnoDB, PARTITION p20180420_pnr5 VALUES LESS THAN (1524171600) ENGINE = InnoDB, PARTITION p20180420_pnr6 VALUES LESS THAN (1524175200) ENGINE = InnoDB, PARTITION p20180420_pnr7 VALUES LESS THAN (1524178800) ENGINE = InnoDB, PARTITION p20180420_pnr8 VALUES LESS THAN (1524182400) ENGINE = InnoDB, PARTITION p20180420_pnr9 VALUES LESS THAN (1524186000) ENGINE = InnoDB, PARTITION p20180420_pnr10 VALUES LESS THAN (1524189600) ENGINE = InnoDB, PARTITION p20180420_pnr11 VALUES LESS THAN (1524193200) ENGINE = InnoDB, PARTITION p20180420_pnr12 VALUES LESS THAN (1524196800) ENGINE = InnoDB, PARTITION p20180420_pnr13 VALUES LESS THAN (1524200400) ENGINE = InnoDB, PARTITION p20180420_pnr14 VALUES LESS THAN (1524204000) ENGINE = InnoDB, PARTITION p20180420_pnr15 VALUES LESS THAN (1524207600) ENGINE = InnoDB, PARTITION p20180420_pnr16 VALUES LESS THAN (1524211200) ENGINE = InnoDB, PARTITION p20180420_pnr17 VALUES LESS THAN (1524214800) ENGINE = InnoDB, PARTITION p20180420_pnr18 VALUES LESS THAN (1524218400) ENGINE = InnoDB, PARTITION p20180420_pnr19 VALUES LESS THAN (1524222000) ENGINE = InnoDB, PARTITION p20180420_pnr20 VALUES LESS THAN (1524225600) ENGINE = InnoDB, PARTITION p20180420_pnr21 VALUES LESS THAN (1524229200) ENGINE = InnoDB, PARTITION p20180420_pnr22 VALUES LESS THAN (1524232800) ENGINE = InnoDB, PARTITION p20180420_pnr23 VALUES LESS THAN (1524236400) ENGINE = InnoDB,

games130 commented 6 years ago

This is rotation done by homer5 rotation script

) ENGINE=InnoDB AUTO_INCREMENT=13060828 DEFAULT CHARSET=utf8 ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=8 COMMENT='20180417' /*!50100 PARTITION BY RANGE ( UNIX_TIMESTAMP(date)) (PARTITION p2018041700 VALUES LESS THAN (1523926800) ENGINE = InnoDB, PARTITION p2018041701 VALUES LESS THAN (1523930400) ENGINE = InnoDB, PARTITION p2018041702 VALUES LESS THAN (1523934000) ENGINE = InnoDB, PARTITION p2018041703 VALUES LESS THAN (1523937600) ENGINE = InnoDB, PARTITION p2018041704 VALUES LESS THAN (1523941200) ENGINE = InnoDB, PARTITION p2018041705 VALUES LESS THAN (1523944800) ENGINE = InnoDB, PARTITION p2018041706 VALUES LESS THAN (1523948400) ENGINE = InnoDB, PARTITION p2018041707 VALUES LESS THAN (1523952000) ENGINE = InnoDB, PARTITION p2018041708 VALUES LESS THAN (1523955600) ENGINE = InnoDB, PARTITION p2018041709 VALUES LESS THAN (1523959200) ENGINE = InnoDB, PARTITION p2018041710 VALUES LESS THAN (1523962800) ENGINE = InnoDB, PARTITION p2018041711 VALUES LESS THAN (1523966400) ENGINE = InnoDB, PARTITION p2018041712 VALUES LESS THAN (1523970000) ENGINE = InnoDB, PARTITION p2018041713 VALUES LESS THAN (1523973600) ENGINE = InnoDB, PARTITION p2018041714 VALUES LESS THAN (1523977200) ENGINE = InnoDB, PARTITION p2018041715 VALUES LESS THAN (1523980800) ENGINE = InnoDB, PARTITION p2018041716 VALUES LESS THAN (1523984400) ENGINE = InnoDB, PARTITION p2018041717 VALUES LESS THAN (1523988000) ENGINE = InnoDB, PARTITION p2018041718 VALUES LESS THAN (1523991600) ENGINE = InnoDB, PARTITION p2018041719 VALUES LESS THAN (1523995200) ENGINE = InnoDB, PARTITION p2018041720 VALUES LESS THAN (1523998800) ENGINE = InnoDB, PARTITION p2018041721 VALUES LESS THAN (1524002400) ENGINE = InnoDB,

games130 commented 6 years ago

I tried to improve on the rotation script, please do help check #37

negbie commented 6 years ago

@nevian427 @games130 could you please test with latest master?

games130 commented 6 years ago

@negbie If compared with the original homer, it is still not working well. I tested with the latest source.

Problems:

  1. My heplify-server.toml was set to 1h partition, but it created the new data with 2h partition
  2. Value is also not correct,
    • 20180426_pnr0 should be VALUES LESS THAN (1524708000)
    • 20180426_pnr1 should be VALUES LESS THAN (1524715200)

I will update you again tomorrow to see how the rotation goes.

This is the result(same with the other tables):

/!50100 PARTITION BY RANGE ( UNIX_TIMESTAMP(date)) (PARTITION 20180426_pnr0 VALUES LESS THAN (1524700800) ENGINE = InnoDB, PARTITION 20180426_pnr1 VALUES LESS THAN (1524708000) ENGINE = InnoDB, PARTITION 20180426_pnr2 VALUES LESS THAN (1524715200) ENGINE = InnoDB, PARTITION 20180426_pnr3 VALUES LESS THAN (1524722400) ENGINE = InnoDB, PARTITION 20180426_pnr4 VALUES LESS THAN (1524729600) ENGINE = InnoDB, PARTITION 20180426_pnr5 VALUES LESS THAN (1524736800) ENGINE = InnoDB, PARTITION 20180426_pnr6 VALUES LESS THAN (1524744000) ENGINE = InnoDB, PARTITION 20180426_pnr7 VALUES LESS THAN (1524751200) ENGINE = InnoDB, PARTITION 20180426_pnr8 VALUES LESS THAN (1524758400) ENGINE = InnoDB, PARTITION 20180426_pnr9 VALUES LESS THAN (1524765600) ENGINE = InnoDB, PARTITION 20180426_pnr10 VALUES LESS THAN (1524772800) ENGINE = InnoDB, PARTITION 20180426_pnr11 VALUES LESS THAN (1524780000) ENGINE = InnoDB, PARTITION pmax VALUES LESS THAN MAXVALUE ENGINE = InnoDB) / |

negbie commented 6 years ago

Let me check

games130 commented 6 years ago

oops my bad on the first problem. I did not notice the toml file format have changed, was using the old file format. (default to the 2h partition when it cannot find the right parameter)

I tested with the new .toml file format and the 1h partition is working (for new create) but the timing is still wrong.

negbie commented 6 years ago

No problem. We need to discuss the timing. I still don't think its a problem with the first partition because the whole table is day based

negbie commented 6 years ago

I will spin up a container set the time to midnight and test it

games130 commented 6 years ago

lets say the partition is 1h. My understanding is that: pnr0 should store 00:00 - 01:00 pnr1 should store 01:00 - 02:00

Is that what you are trying to achieve as well?

negbie commented 6 years ago

What I do in mysql is

pnr0 less than 00:00 pnr1 less than 01:00

pnr0 is more or less a placeholder because less 00:00 would mean yesterday but since the table is based on daily timestamp it should go into the pmax partition from yesterday

negbie commented 6 years ago

For postgres I do exactly that what you describe but not for mysql

negbie commented 6 years ago

If you think I am wrong tell me ;)

games130 commented 6 years ago

Yes, i do hope mysql can be same with what you have done for postgres. well i am basing this on the original homer5 (with kamailio) script which generate the time that way.

games130 commented 6 years ago

one more question, the time 1524700800 and 1524708000 is suppose to be my local time (in my timezone) or should be the UTC time

(PARTITION 20180426_pnr0 VALUES LESS THAN (1524700800) ENGINE = InnoDB, PARTITION 20180426_pnr1 VALUES LESS THAN (1524708000) ENGINE = InnoDB,

Want to know based on your design, it should be UTC or localtimezone time

negbie commented 6 years ago

This should be 00:00 utc time and +x localtime. Check it here:

https://www.unixtimestamp.com/index.php

negbie commented 6 years ago

I made some tests with following simple dataset on a 2h partitioned table:

INSERT INTO sip_capture_rest_20180426(date,micro_ts,method) VALUES ('2018-04-26 21:59:59.999999', 1524772799999999, '200'); INSERT INTO sip_capture_rest_20180426(date,micro_ts,method) VALUES ('2018-04-26 22:59:59.999999', 1524776399999999, '200'); INSERT INTO sip_capture_rest_20180426(date,micro_ts,method) VALUES ('2018-04-26 23:59:59.999999', 1524779999999999, '200'); INSERT INTO sip_capture_rest_20180427(date,micro_ts,method) VALUES ('2018-04-27 00:00:01.111111', 1524780001111111, '200'); INSERT INTO sip_capture_rest_20180427(date,micro_ts,method) VALUES ('2018-04-27 01:00:01.111111', 1524783601111111, '200'); INSERT INTO sip_capture_rest_20180427(date,micro_ts,method) VALUES ('2018-04-27 02:00:01.111111', 1524787201111111, '200');

Then I checked the partitions

SELECT table_rows as 'count()' FROM information_schema.partitions WHERE table_schema = schema() and table_name ='sip_capture_rest_20180426' and partition_name = '20180426_pnr11'; ................................ count() 1

SELECT table_rows as 'count()' FROM information_schema.partitions WHERE table_schema = schema() and table_name ='sip_capture_rest_20180426' and partition_name = 'pmax'; ................................ count() 2

SELECT table_rows as 'count()' FROM information_schema.partitions WHERE table_schema = schema() and table_name ='sip_capture_rest_20180427' and partition_name = '20180427_pnr0'; ................................ count() 0

SELECT table_rows as 'count()' FROM information_schema.partitions WHERE table_schema = schema() and table_name ='sip_capture_rest_20180427' and partition_name = '20180427_pnr1'; ................................ count() 2

SELECT table_rows as 'count()' FROM information_schema.partitions WHERE table_schema = schema() and table_name ='sip_capture_rest_20180427' and partition_name = '20180427_pnr2'; ................................ count() 1

So as I said partition pnr0 won't be used and can be ignored. So there is nothing to do for me here.

games130 commented 6 years ago

@negbie Need to double confirm again on the partition time hahahahaha.

for a server in UTC (GMT+0), the table created will be for that day (example today 20180427). the first pnr0 will be UTC 00:00-01:00

for a server in GMT+8, the first pnr0 will be GMT+8 00:00-01:00 -- or should it be pnr0 UTC 00:00-01:00

negbie commented 6 years ago

It should be fine now. The important part is to use the right format (Format("2006-01-02 15:04:05-07:00")) which will send the timezone too.

With that mysql will use the right timezone. Otherwise it would assume its UTC.

Right now this is the pnr0 partition for my timezone CEST (UTC+2)

ALTER TABLE sip_capture_call_20180428 ADD PARTITION (PARTITION 20180428_pnr0 VALUES LESS THAN ( UNIX_TIMESTAMP('2018-04-28 00:00:00+02:00') ));

In unixtime 1524787200 Which is 27.04.2018 - 02:00:00 and the correct value for 2h based partition.

games130 commented 6 years ago

Okay what I notice using homer5 GUI is this. INSERT INTO sip_capture_rest_20180427(date,micro_ts,method) VALUES ('2018-04-27 00:30:01.111111', 1524780001111111, '200');

After you insert this data, use the homer GUI search for "today". You will be able to find the data.

But if you search for time 2018-04-27 00:00 to 2018-04-27 01:00. You will not be able to find it.

negbie commented 6 years ago

Let me spin up the mysql container to test this.

negbie commented 6 years ago

Do you use a dedicated GUI and the latest heplify-server version or from the docker container?

games130 commented 6 years ago

Homer webui and MySQL is docker (from your folders)

Heplify-server is the latest

games130 commented 6 years ago

@negbie Are you getting the same result as i describe?

negbie commented 6 years ago

@games130 sorry couldn't test it yet because of some other stuff I am working on. Will give you feedback asap.

negbie commented 6 years ago

@games130 the only things I noticed are that the packr file needs to be updated and the wrong unixtimestamp in your example insert statement

INSERT INTO sip_capture_rest_20180427(date,micro_ts,method) VALUES ('2018-04-27 00:30:01.111111', 1524780001111111, '200');

please use 1524789001111111

check 1524789001 here: https://www.unixtimestamp.com/index.php

Whith that I was able to find the 200 inside the timerange 2018-04-27 00:00 to 2018-04-27 01:00

negbie commented 6 years ago

Its best to test with real data. You can use heplify with the pcap from the example folder. Just run from the heplify folder:

./heplify -rf example/rtp_rtcp_sip.pcap

games130 commented 6 years ago

oops mistake when modifying.. you found the data in mysql or found it from the homer WebUI search? because the data will be inserted correctly in mysql, just cannot be search from homer webUI

negbie commented 6 years ago

I build a new graf-homer-hep-prom cointainer and tested it with ./heplify -rf example/rtp_rtcp_sip.pcap

I checked the call table in db

1 2018-04-27 12:58:57 1524833937690554 INVITE

Keep in mind its UTC. In the GUI I used GMT+2 because its my localtime and was able to find that INVITE with the range

From 2018-04-27 14:55:00 to 2018-04-27 15:00:00, TZ GMT+2 EET

games130 commented 6 years ago

please do correct me if i am wrong, my understanding is that (This is what i remember, i dont have access to my server now as i have gone back for the weekend):

  1. The current heplify-server will create the tables and partition based on server timzone --> converted to UTC. So in my case GMT+8

    table sip_capture_call_20180427 
    pnr0 will be UNIX_TIMESTAMP('2018-04-27 01:00:00+08:00')
    pnr1 will be UNIX_TIMESTAMP('2018-04-27 02:00:00+08:00')
    pnr2 will be UNIX_TIMESTAMP('2018-04-27 03:00:00+08:00')
  2. Homer WebUI will do the sql query after converting the time selected to UTC time. If i search for 2017-04-27 00:00-02:00 GMT+8, After convert to UTC:

    2017-04-27 00:00 GMT +8 = 2017-04-26 16:00 UTC
    2017-04-27 02:00 GMT +8 = 2017-04-26 18:00 UTC

    So because of the new UTC time, it will query to table sip_capture_call_20180426 date between 2017-04-26 16:00 and 2017-04-26 18:00

So actually the query has been send to the wrong table.

games130 commented 6 years ago

It is strange in your database the time stored is UTC, how did that happen? Based on the homer5.go the timestamp does not have any timezone information, so your mysql will stored based on mysql server timezone.

https://github.com/sipcapture/heplify-server/blob/01174edd961477e4397b534bd7492349c842b44a/database/sqlhomer5.go#L175

negbie commented 6 years ago

When you provide no timezone information mysql will use UTC not localtime

games130 commented 6 years ago

hmmm...okay let me check again on monday.

negbie commented 6 years ago

What I can suggest you is to change it back to

ts = pkt.Timestamp.UTC().Format("2006-01-02 15:04:05.999999")

This should not break anything

negbie commented 6 years ago

Will need to check how postgres is handling this.

games130 commented 6 years ago

could you help me check a few things on your server, so that first thing monday when i get back i can compare to what i have.

  1. Your exact time when executing the below with timezone
  2. After you have loaded latest live data, (i want to know what are the date and time you see here) select date from sip_capture_call_20180427 order by date DESC limit 10;
  3. show create table sip_capture_call_20180427

Why i am asking this is because, i recall that when i did number 2, the result i get from date is the current time. I dont see the UTC time.

thank you

negbie commented 6 years ago

MariaDB [homer_data]> select now(); +---------------------+ | now() | +---------------------+ | 2018-04-27 17:30:24 | +---------------------+ 1 row in set (0.00 sec)

insert will produce id: 16493868 date: 2018-04-27 17:30:35 micro_ts: 1524843035285325

Sorry its not per default utc it depends on your db configuration. As you can see select now() is the same like the ntptime ntptime Fri, Apr 27 2018 17:31:55.498, (.498500),

And the last partitions are:

PARTITION 20180428_pnr140 VALUES LESS THAN (1524951000) ENGINE = InnoDB, PARTITION 20180428_pnr141 VALUES LESS THAN (1524951600) ENGINE = InnoDB, PARTITION 20180428_pnr142 VALUES LESS THAN (1524952200) ENGINE = InnoDB, PARTITION 20180428_pnr143 VALUES LESS THAN (1524952800) ENGINE = InnoDB, PARTITION pmax VALUES LESS THAN MAXVALUE ENGINE = InnoDB)

Last partition timestamp 1524952800 is in my timezone 29.04.2018 - 00:00:00

Edit: fixed copy paste error

negbie commented 6 years ago

In the GUI you will see following under details:

id | 16493868 date | 2018-04-27 15:30:35 milli_ts | 1524843035285 micro_ts | 1524843035285325