This PR adds a section to the unique connection analyzer which sums up the connection count, total connection duration, and total (two-way IP) bytes for each unique connection across all of the available chunks and open connection data. The totals are stored in each unique connection document under the following fields:
count
tbytes
tdur
Indexes have been added for each of these fields.
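To make the roll-up concrete, here is a minimal Python sketch of the summation described above. It is not the analyzer's code: the `dat` and `open_conns` layouts, the per-entry field names, and the `roll_up` helper are assumptions chosen for illustration, and only the three output field names (`count`, `tbytes`, `tdur`) come from this PR.

```python
from typing import Dict, Iterable


def roll_up(dat: Iterable[dict], open_conns: Iterable[dict] = ()) -> Dict[str, float]:
    """Sum per-chunk and open-connection entries into count/tbytes/tdur totals.

    Hypothetical helper: the entry layout is assumed, not taken from the schema.
    """
    totals = {"count": 0, "tbytes": 0, "tdur": 0.0}
    for entry in list(dat) + list(open_conns):
        # Missing fields contribute nothing to the totals.
        totals["count"] += entry.get("count", 0)
        totals["tbytes"] += entry.get("tbytes", 0)
        totals["tdur"] += entry.get("tdur", 0.0)
    return totals


# Hypothetical unique connection document with two chunks and one open connection.
uconn = {
    "src": "10.0.0.5",
    "dst": "203.0.113.7",
    "dat": [
        {"cid": 0, "count": 3, "tbytes": 1200, "tdur": 45.5},
        {"cid": 1, "count": 1, "tbytes": 300, "tdur": 10.0},
    ],
    "open_conns": [{"count": 1, "tbytes": 50, "tdur": 4.5}],
}

summary = roll_up(uconn["dat"], uconn["open_conns"])
print(summary)
```

In the PR itself this summation is done by a MongoDB aggregation per unique connection rather than in application code; the sketch only shows which values end up in the three summary fields.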
Building on these schema changes, this PR also updates the show-long-connections command to display the unique connections which have been active for the longest total duration. Previously, this command only printed the longest individual connection for each unique connection.
The following fields have been added to the long connections display:
total duration
connection count
total (two-way IP) bytes
The open/closed connection state has also been added to the HTML report. Each connection time listed in the HTML report is now formatted the same way as in the human-readable CLI output.
I have tested this PR by running one-off imports as well as by using the following bash script:
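    rita_import_chunked() {
        local logdir="$1"    # path prefix of the log files to import
        local dataset="$2"   # name of the target dataset
        # Import one hour of logs per iteration.
        for i in {00..23}; do
            local files
            files=$(echo "${logdir}"*."${i}":*-*)
            # $files is left unquoted on purpose so each path becomes its own argument.
            ./rita import -R -NC 24 $files "${dataset}"
        done
    }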
I ensured that the summary fields for the rolling imports matched the entries in the dat subdocuments in the one-off imports.
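The comparison above could be automated along the following lines. This is only a sketch under assumptions: `summaries_match` is a hypothetical helper, the documents are assumed to have already been fetched from each dataset as plain dictionaries, and the per-entry field names inside `dat` are assumed to mirror the summary field names.

```python
import math

SUMMARY_FIELDS = ("count", "tbytes", "tdur")


def summaries_match(rolling_doc: dict, one_off_doc: dict, rel_tol: float = 1e-9) -> bool:
    """Check a rolling import's summary fields against a one-off import's dat entries.

    Hypothetical helper: both arguments are unique connection documents for the
    same pair of hosts, one from each dataset.
    """
    for field in SUMMARY_FIELDS:
        # Recompute the expected total from the one-off import's dat subdocuments.
        expected = sum(d.get(field, 0) for d in one_off_doc.get("dat", []))
        # Durations are floats, so compare with a tolerance instead of ==.
        if not math.isclose(rolling_doc.get(field, 0), expected, rel_tol=rel_tol):
            return False
    return True


# Hypothetical documents for the same unique connection in the two datasets.
one_off = {"dat": [{"count": 4, "tbytes": 1500, "tdur": 55.5}]}
rolling_ok = {"count": 4, "tbytes": 1500, "tdur": 55.5}
rolling_bad = {"count": 4, "tbytes": 1500, "tdur": 50.0}

print(summaries_match(rolling_ok, one_off))
print(summaries_match(rolling_bad, one_off))
```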
This change requires running a MongoDB aggregation for each unique connection. However, we cannot avoid doing so if we want to roll up all of the individual dat documents during each import. Still, we should pay extra attention to performance during the PR review. On small datasets, I have not seen an adverse change in performance.