Closed Rameen-Mahmood closed 6 months ago
Looks good. To be extra safe, make sure that the timestamps are sorted. Here's a toy example:
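import pandas as pd

# Toy data: (group label, timestamp), deliberately out of order
df = [
    ('a', 4),
    ('b', 20),
    ('a', 2),
    ('a', 1),
    ('b', 10),
    ('b', 100)
]
# Sort by group, then by time, so diff() compares consecutive timestamps
df = pd.DataFrame(df, columns=['Packet', 'Time']).sort_values(by=['Packet', 'Time'])
g = df.groupby('Packet')
# The first row of each group has no predecessor, so its value is NaN
df['Inter-Arrival-Time'] = g['Time'].diff()
df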
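To see why the sort matters, here is a small sketch that reuses the toy values above. It compares diff() on the unsorted rows with diff() after sorting, and adds a per-group check that timestamps never decrease. The names `rows`, `naive`, `gaps`, and `is_sorted` are only for illustration.

```python
import pandas as pd

# Same toy values as above, kept in their original (unsorted) order
rows = [('a', 4), ('b', 20), ('a', 2), ('a', 1), ('b', 10), ('b', 100)]
unsorted = pd.DataFrame(rows, columns=['Packet', 'Time'])

# Without sorting, diff() follows row order, so some gaps come out negative
naive = unsorted.groupby('Packet')['Time'].diff()

# Sorting by group and time first makes every within-group gap non-negative
sorted_df = unsorted.sort_values(by=['Packet', 'Time'])
gaps = sorted_df.groupby('Packet')['Time'].diff()

# Sanity check: within each group, timestamps never decrease
is_sorted = sorted_df.groupby('Packet')['Time'].apply(
    lambda s: s.is_monotonic_increasing
)

print(naive.tolist())
print(gaps.tolist())
print(is_sorted)
```

A negative inter-arrival time in real data is usually a hint that the rows were not sorted before calling diff().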
@crazyideas21 Could you review the method for calculating inter-arrival times between network packets? I'm using diff() on the timestamp column to compute the difference between each packet's timestamp and the previous packet's timestamp:
grouped = df.groupby(['ip.src', 'ip.dst', 'tcp.srcport', 'tcp.dstport', 'udp.srcport', 'udp.dstport', '_ws.col.Protocol'])
df['inter_arrival_time'] = df.groupby(['ip.src', 'ip.dst', 'tcp.srcport', 'tcp.dstport', 'udp.srcport', 'udp.dstport', '_ws.col.Protocol'])['frame.time_epoch'].diff().dt.total_seconds()
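A minimal sketch of the same per-flow computation, under a few assumptions that may not match the real capture. It assumes `frame.time_epoch` holds float seconds, in which case diff() already returns seconds and `.dt.total_seconds()` is not needed (the `.dt` accessor only works on datetime or timedelta columns). It also assumes the TCP port columns are NaN for UDP packets and vice versa. By default, pandas groupby leaves out rows whose key contains NaN, so grouping on all four port columns at once could leave most rows without a group. The sketch merges the ports into two hypothetical columns, `srcport` and `dstport`, with -1 as an arbitrary placeholder for packets that have no port. The sample rows are made up.

```python
import pandas as pd

NA = float('nan')

# Made-up capture rows; only the column names come from the snippet above
packets = pd.DataFrame({
    'ip.src': ['10.0.0.1', '10.0.0.1', '10.0.0.2', '10.0.0.1'],
    'ip.dst': ['10.0.0.9', '10.0.0.9', '10.0.0.9', '10.0.0.9'],
    'tcp.srcport': [5000, 5000, NA, 5000],
    'tcp.dstport': [443, 443, NA, 443],
    'udp.srcport': [NA, NA, 6000, NA],
    'udp.dstport': [NA, NA, 53, NA],
    '_ws.col.Protocol': ['TCP', 'TCP', 'DNS', 'TCP'],
    'frame.time_epoch': [3.0, 1.0, 2.0, 1.5],  # assumed: float seconds
})

# Merge TCP and UDP ports into one pair of key columns (hypothetical names);
# -1 is an arbitrary placeholder for packets with neither kind of port
packets['srcport'] = packets['tcp.srcport'].fillna(packets['udp.srcport']).fillna(-1)
packets['dstport'] = packets['tcp.dstport'].fillna(packets['udp.dstport']).fillna(-1)

flow_keys = ['ip.src', 'ip.dst', 'srcport', 'dstport', '_ws.col.Protocol']

# Sort by flow and then by time so diff() compares consecutive packets
packets = packets.sort_values(by=flow_keys + ['frame.time_epoch'])

# Float epoch seconds: the difference is already in seconds
packets['inter_arrival_time'] = packets.groupby(flow_keys)['frame.time_epoch'].diff()

print(packets[flow_keys + ['frame.time_epoch', 'inter_arrival_time']])
```

The first packet of each flow has no predecessor, so its inter-arrival time is NaN.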