cybergreen-net / pm

Tech project management repo (issue tracker only)

DNS and NTP columns are the same because the data in the DB is the same for both risks! #100

Open aaronkaplan opened 7 years ago

aaronkaplan commented 7 years ago
prod=> select sum(count) from count_by_country where date > '2017/1/1' and risk=2 and country='US' limit 10;
   sum
---------
 2585757
(1 row)

prod=> select sum(count) from count_by_country where date > '2017/1/1' and risk=1 and country='US' limit 10;
   sum
---------
 2585757
(1 row)

zelima commented 7 years ago

@aaronkaplan Small note: we no longer use the count_by_country table; we use agg_risk_country_week instead. The two are identical, though.

This is not an aggregation issue: the scanned data for the last 3 weeks is identical for DNS and NTP. The only difference between the two files is the risk IDs, so the aggregation results are necessarily identical.

To check the difference for the latest week:

aws --profile cg s3 cp  s3://private-bits-cybergreen-net/dev/clean/dns-scan/dns-scan.2017-W03.csv.gz .
aws --profile cg s3 cp  s3://private-bits-cybergreen-net/dev/clean/ntp-scan/ntp-scan.20170120.csv.gz .
gunzip ntp-scan.20170120.csv.gz
gunzip dns-scan.2017-W03.csv.gz

# strip timestamps and IPs and ASN for comparison
sed "s/\([^,]*,\)\{3\}//" ntp-scan.20170120.csv > ntp.stripped.csv
sed "s/\([^,]*,\)\{3\}//" dns-scan.2017-W03.csv > dns.stripped.csv
diff ntp.stripped.csv dns.stripped.csv -c | less

No output - files are identical!
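
The same check can be wrapped in a small reusable function. This is only a minimal sketch, assuming (as in the sed commands above) that the first three comma-separated fields are the ones to ignore. The names strip_leading_fields and compare_stripped are hypothetical and not part of any existing tooling; the function reads the .gz files directly, so no separate gunzip step is needed.

```sh
#!/bin/sh
# Hypothetical helpers: the names are illustrative only.

# Print a gzipped CSV with the first three comma-terminated fields removed,
# using the same sed expression as the manual check.
strip_leading_fields() {
    gunzip -c "$1" | sed 's/\([^,]*,\)\{3\}//'
}

# Print "identical" if two gzipped CSVs match after stripping, else "different".
compare_stripped() {
    left=$(mktemp) || return 2
    right=$(mktemp) || return 2
    strip_leading_fields "$1" > "$left"
    strip_leading_fields "$2" > "$right"
    if cmp -s "$left" "$right"; then
        echo "identical"
    else
        echo "different"
    fi
    rm -f "$left" "$right"
}
```

For the files downloaded above, it could be called as `compare_stripped ntp-scan.20170120.csv.gz dns-scan.2017-W03.csv.gz`.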

cc @chorsley and @kxyne.

aaronkaplan commented 7 years ago

Got it. Kayne, Chris, can you please check?



chorsley commented 7 years ago

Acknowledged, Cosive will investigate.

kxyne commented 7 years ago

I believe this came from a manual file-handling issue on the unprocessed files. I will rectify it and backprocess the last weeks' files.