aws-greengrass / aws-greengrass-nucleus

The Greengrass nucleus component provides functionality for device side orchestration of deployments and lifecycle management for execution of Greengrass components and applications. This includes features such as starting, stopping, and monitoring execution of components and apps, interprocess communication server for communication between components, component installation and configuration management.
Apache License 2.0

Greengrass data usage #1663

Closed wills721 closed 3 days ago

wills721 commented 1 month ago

Feature Description: data usage

Use Case: understand mobile carrier data charges

Proposed Solution: unknown

Other: https://repost.aws/questions/QUf5la_hFMQsi2EwHuzckWNA/greengrass-carrier-data-charges

I have seen several tools for monitoring traffic between Greengrass and AWS, but none so far that really helps me understand data usage.

yitingb commented 1 month ago

Hi, on an RPi it's possible to capture packets with Wireshark; see https://www.wireshark.org/docs/wsug_html_chunked/AppToolstcpdump.html

We need some more information to understand what's going on with the traffic.

  1. What time period did you notice the high data charges?
  2. Are you comfortable with sharing the deployment document?
  3. What are the target components of your deployment?

wills721 commented 1 month ago

The carrier really doesn't get specific -- it's just a large monthly charge :)

So I'm wondering how to 'profile' what is going on, or whether Greengrass has some way or report to tell me the size of all data travelling back and forth, so that I can validate the carrier charge.

wills721 commented 1 month ago

So I am experimenting with Wireshark. So far no real luck -- I'm surprised no one else has run into this. We use GG on RPi with a mobile cell dongle.

wills721 commented 2 weeks ago

I am still trying to understand how to profile Greengrass. I am certain the large data transfer is coming from Greengrass. I have removed LogManager and also LocalDebugConsole. This leaves CLI, Diskspooler, Nucleus, clientdevices.Auth, clientdevices.IPDetector. Would any of these result in >40GB daily usage?

MikeDombo commented 2 weeks ago

IP detector updates the IP address whenever it changes, which can happen frequently on mobile connections. Check the Greengrass log file to see what is going on; every update is logged.

Use Wireshark to see how much data is transferred over ports 8883, 8443, and 443. These are the ports used by Greengrass.

MQTT has a keepalive packet which is sent every 30 seconds; this is configurable in the Nucleus configuration.
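
To put the keepalive traffic in perspective, here is a minimal Python sketch of the arithmetic. The 200 bytes per exchange is purely an assumed placeholder for the on-wire size of one ping plus its reply and acknowledgement (including TCP/IP and TLS overhead); measure the real value on your own link.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def keepalive_bytes_per_day(interval_s, bytes_per_exchange):
    """Estimate daily keepalive traffic for one MQTT connection.

    interval_s: seconds between keepalive pings.
    bytes_per_exchange: assumed on-wire bytes for one ping, its reply
        and the final ACK (hypothetical figure, not measured).
    """
    exchanges_per_day = SECONDS_PER_DAY / interval_s
    return exchanges_per_day * bytes_per_exchange


if __name__ == "__main__":
    # 200 bytes per exchange is an illustrative assumption.
    daily = keepalive_bytes_per_day(interval_s=30, bytes_per_exchange=200)
    print(f"{daily / 1e6:.2f} MB/day")
```

Under these assumptions the keepalive adds well under a megabyte per day, so on its own it could not account for tens of gigabytes.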

n9wxu commented 2 weeks ago

Alternatively, you can install tcpdump and filter its output with grep: sudo tcpdump | grep amazonaws.com. This will print ALL traffic that is exchanged with AWS, even if it does not originate from Greengrass. FYI: tcpdump does not have wildcard host filters, so you must use grep to find ALL communications headed to AWS rather than to a specific endpoint.

To reach 40GB/day the link has to average roughly 463 KB/s, so even a few seconds of tcpdump output should show a lot of large packets. You may want to just run tcpdump without any other parameters for a few seconds. You will probably see a pattern pretty quickly, considering the amount of unknown data you are seeing.
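
The average rate implied by a daily total can be checked with a few lines of Python. This is only a sketch of the arithmetic, and it assumes decimal units (1 GB = 10^9 bytes), which is how carriers usually bill.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86400


def average_rate_bytes_per_s(total_bytes_per_day):
    """Average throughput needed to move the given volume in one day."""
    return total_bytes_per_day / SECONDS_PER_DAY


if __name__ == "__main__":
    rate = average_rate_bytes_per_s(40 * 10**9)  # 40 GB/day, decimal GB assumed
    print(f"{rate / 1000:.0f} KB/s")
```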

In the sequence below you can see the MQTT keepalives, one per minute, between my basic Greengrass device and AWS. Note that my Greengrass device has no additional components deployed.

$ sudo tcpdump | grep amazonaws.com
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), snapshot length 262144 bytes
15:32:44.136454 IP sirius-1.localdomain.39424 > ec2-XXXXXXXXXX.us-east-2.compute.amazonaws.com.8883: Flags [P.], seq 3913067436:3913067467, ack 46105621, win 614, options [nop,nop,TS val 4093317550 ecr 2932339164], length 31
15:32:44.198069 IP ec2-XXXXXXXXXX.us-east-2.compute.amazonaws.com.8883 > sirius-1.localdomain.39424: Flags [P.], seq 1:32, ack 31, win 425, options [nop,nop,TS val 2932399222 ecr 4093317550], length 31
15:32:44.198133 IP sirius-1.localdomain.39424 > ec2-XXXXXXXXXX.us-east-2.compute.amazonaws.com.8883: Flags [.], ack 32, win 614, options [nop,nop,TS val 4093317612 ecr 2932399222], length 0
15:33:44.179906 IP sirius-1.localdomain.39424 > ec2-XXXXXXXXXX.us-east-2.compute.amazonaws.com.8883: Flags [P.], seq 31:62, ack 32, win 614, options [nop,nop,TS val 4093377594 ecr 2932399222], length 31
15:33:44.241463 IP ec2-XXXXXXXXXX.us-east-2.compute.amazonaws.com.8883 > sirius-1.localdomain.39424: Flags [P.], seq 32:63, ack 62, win 425, options [nop,nop,TS val 2932459266 ecr 4093377594], length 31
15:33:44.241526 IP sirius-1.localdomain.39424 > ec2-XXXXXXXXXX.us-east-2.compute.amazonaws.com.8883: Flags [.], ack 63, win 614, options [nop,nop,TS val 4093377656 ecr 2932459266], length 0
15:34:44.239989 IP sirius-1.localdomain.39424 > ec2-XXXXXXXXXX.us-east-2.compute.amazonaws.com.8883: Flags [P.], seq 62:93, ack 63, win 614, options [nop,nop,TS val 4093437654 ecr 2932459266], length 31
15:34:44.303297 IP ec2-XXXXXXXXXX.us-east-2.compute.amazonaws.com.8883 > sirius-1.localdomain.39424: Flags [P.], seq 63:94, ack 93, win 425, options [nop,nop,TS val 2932519327 ecr 4093437654], length 31
15:34:44.303362 IP sirius-1.localdomain.39424 > ec2-XXXXXXXXXX.us-east-2.compute.amazonaws.com.8883: Flags [.], ack 94, win 614, options [nop,nop,TS val 4093437717 ecr 2932519327], length 0
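
Output like the capture above can be totalled per port with a short script. The following Python sketch assumes the default one-line tcpdump format for IPv4 packets with numeric ports; the host names in the sample are hypothetical. Note that tcpdump's length field is the TCP payload only, so the real on-wire volume is somewhat higher.

```python
import re
from collections import defaultdict

# Matches lines such as:
# "15:32:44.136454 IP host.39424 > remote.8883: Flags [P.], ..., length 31"
LINE_RE = re.compile(r"^\S+ IP (\S+) > (\S+): .*\blength (\d+)\s*$")


def payload_bytes_by_port(lines, watched_ports=("8883", "8443", "443")):
    """Sum TCP payload lengths per watched port (either direction)."""
    totals = defaultdict(int)
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue  # skip banners and lines in other formats
        src, dst, length = match.group(1), match.group(2), int(match.group(3))
        for endpoint in (src, dst):
            port = endpoint.rsplit(".", 1)[-1]  # port is the last dotted field
            if port in watched_ports:
                totals[port] += length
                break
    return dict(totals)


# Hypothetical sample lines in the same shape as the capture.
SAMPLE = [
    "listening on eth0, link-type EN10MB (Ethernet), snapshot length 262144 bytes",
    "15:32:44.136454 IP device.local.39424 > broker.example.com.8883: Flags [P.], seq 1:32, ack 1, win 614, length 31",
    "15:32:44.198069 IP broker.example.com.8883 > device.local.39424: Flags [P.], seq 1:32, ack 32, win 425, length 31",
    "15:32:44.198133 IP device.local.39424 > broker.example.com.8883: Flags [.], ack 32, win 614, length 0",
]

if __name__ == "__main__":
    print(payload_bytes_by_port(SAMPLE))
```

In practice you could save a capture to a file with sudo tcpdump > capture.txt and pass the file's lines to the function.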

n9wxu commented 2 weeks ago

One additional thought: run ip -s link show eth0 (replace eth0 with your cellular link). This will show the byte and packet counts on the specified interface. You can quickly determine whether your interface traffic matches your cellular provider's numbers.
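
To compare two readings over time, the byte counters can be pulled out of the command's text output. This Python sketch assumes the usual layout in which an "RX:" or "TX:" header line is followed by a line of numbers whose first field is the byte count; the sample text and its numbers are made up for illustration.

```python
def parse_ip_link_bytes(text):
    """Return (rx_bytes, tx_bytes) from `ip -s link show <iface>` output.

    Assumes each "RX:" / "TX:" header line is followed by a line of
    counters whose first column is the byte count.
    """
    lines = text.splitlines()
    counters = {}
    for header, values in zip(lines, lines[1:]):
        stripped = header.strip()
        for label in ("RX", "TX"):
            if stripped.startswith(label + ":"):
                counters[label] = int(values.split()[0])
    return counters["RX"], counters["TX"]


# Illustrative output only; the interface details and numbers are invented.
SAMPLE = """\
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP mode DEFAULT group default qlen 1000
    link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff
    RX: bytes  packets  errors  dropped overrun mcast
    1234567    8901     0       0       0       12
    TX: bytes  packets  errors  dropped carrier collsns
    7654321    4567     0       0       0       0
"""

if __name__ == "__main__":
    rx, tx = parse_ip_link_bytes(SAMPLE)
    print(f"received {rx} bytes, sent {tx} bytes")
```

Taking one reading, waiting a known interval, and subtracting the counters from a second reading gives the traffic for that interval, which can then be scaled up and compared with the carrier's figure.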