Open aaronkaplan opened 7 years ago
@aaronkaplan Small note: we no longer use the table count_by_country; we use agg_risk_country_week instead. The two are identical, though.
This is not an aggregation issue: the scanned data for the last 3 weeks is identical for DNS and NTP. The only difference between the two files is the risk IDs, so the aggregation results are naturally identical as well.
To check the difference for the latest week:
aws --profile cg s3 cp s3://private-bits-cybergreen-net/dev/clean/dns-scan/dns-scan.2017-W03.csv.gz .
aws --profile cg s3 cp s3://private-bits-cybergreen-net/dev/clean/ntp-scan/ntp-scan.20170120.csv.gz .
gunzip ntp-scan.20170120.csv.gz
gunzip dns-scan.2017-W03.csv.gz
# strip timestamps and IPs and ASN for comparison
sed "s/\([^,]*,\)\{3\}//" ntp-scan.20170120.csv > ntp.stripped.csv
sed "s/\([^,]*,\)\{3\}//" dns-scan.2017-W03.csv > dns.stripped.csv
diff -c ntp.stripped.csv dns.stripped.csv | less
No output - files are identical!
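For repeating this check on other weeks, the steps above could be wrapped in a small helper. This is only a sketch: the function names `strip_fields` and `same_payload` are hypothetical, and it assumes, as in the commands above, that the first three comma-separated fields are the ones to ignore.

```shell
#!/bin/sh
# Hypothetical helper: drop the first three comma-separated fields of a CSV
# file, using the same sed expression as the manual check.
strip_fields() {
    sed 's/\([^,]*,\)\{3\}//' "$1"
}

# Hypothetical helper: same_payload FILE_A FILE_B
# Returns 0 when both files are identical after stripping, 1 when they
# differ, and 2 when the temporary files cannot be created.
same_payload() {
    tmp_a=$(mktemp) || return 2
    tmp_b=$(mktemp) || { rm -f "$tmp_a"; return 2; }
    strip_fields "$1" > "$tmp_a"
    strip_fields "$2" > "$tmp_b"
    cmp -s "$tmp_a" "$tmp_b"
    rc=$?
    rm -f "$tmp_a" "$tmp_b"
    return $rc
}
```

Example use, after the gunzip steps: `same_payload ntp-scan.20170120.csv dns-scan.2017-W03.csv && echo "identical after stripping"`.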
cc @chorsley and @kxyne.
Got it. Kayne, Chris, can you please check?
(In reply to Irakli Mchedlishvili's comment of 21 Mar 2017, 08:22.)
Acknowledged, Cosive will investigate.
I believe this was caused by a manual file-handling issue on the unprocessed files. I will rectify it and back-process the last weeks' files.