nlpsandbox / nlpsandbox-infra

AWS CloudFormation templates for deploying the NLP Sandbox infrastructure
Apache License 2.0

Document how to push log files and container logs to AWS CloudWatch #33

Closed: tschaffter closed this 3 years ago

tschaffter commented 3 years ago

Fixes #32

tschaffter commented 3 years ago

The ENA module is already installed by default on Ubuntu as mentioned in the doc.

$ modinfo ena
filename:       /lib/modules/5.4.0-1049-aws/kernel/drivers/net/ethernet/amazon/ena/ena.ko
license:        GPL
description:    Elastic Network Adapter (ENA)
author:         Amazon.com, Inc. or its affiliates
srcversion:     A955949F161A3C6A4995411
alias:          pci:v00001D0Fd0000EC21sv*sd*bc*sc*i*
alias:          pci:v00001D0Fd0000EC20sv*sd*bc*sc*i*
alias:          pci:v00001D0Fd00001EC2sv*sd*bc*sc*i*
alias:          pci:v00001D0Fd00000EC2sv*sd*bc*sc*i*
alias:          pci:v00001D0Fd00000051sv*sd*bc*sc*i*
depends:
retpoline:      Y
intree:         Y
name:           ena
vermagic:       5.4.0-1049-aws SMP mod_unload modversions
signat:         PKCS#7
signer:
sig_key:
sig_hashalgo:   md4
parm:           debug:Debug level (0=none,...,16=all) (int)

However, the module is not active yet. As shown below, eth0 is still using the vif driver instead of ena:

$ ethtool -i eth0
driver: vif
version:
firmware-version:
expansion-rom-version:
bus-info: vif-0
supports-statistics: yes
supports-test: no
supports-eeprom-access: no
supports-register-dump: no
supports-priv-flags: no

Enabling ENA on the EC2 instance:

aws ec2 modify-instance-attribute --instance-id $(wget -q -O - http://169.254.169.254/latest/meta-data/instance-id) --ena-support

Currently failing:

An error occurred (UnauthorizedOperation) when calling the ModifyInstanceAttribute operation: You are not authorized to perform this operation. Encoded authorization failure message: 9VyE0UqN9mtAEh1SoNh1jaEJvO0n8KMOou5DcVVgIbq4GLD7Uf_t2XBo0PnQ8yRxYs1BfOBmJZbNp5OsWLMzSrzSu_XxdVCSu0yG8PBzgDASo65080vOAgnKZ1_SM6e58VgWla_cMeWqInVS0f6ujJaHcMR7lmepRjK2NwD8D16D242v8D4CyQFN6XsFvIwxY9qVzfAS8Rgh3R7dvxS-JGxs7V00cooGZIrzB-oXhqSpPU6S6_Jj5xJimZxxBeYIH5UtDDHJa4Tv_KUPXT0FDTDaZ5KySdQODteWTzE3nc8lNw9JHkp84ngz1V6USfjq4KKIPFh2u1O6DKDBqA5fgzoPHMSEpiEwNuoKZkbFwUbuMoSA_kCzbONgbF2lwtQXlWQbiRhSAGYcQN2nYiGYt00YXiI9ueXJ9hurOs4hRLS0nhSmkvHpi_Ubn8wmRRQUkEmU8x3noarAMqcxsNENV-nQBE4gH8X85FQQ1-E-Ow9TxftgkU3mIoHlpVwRwFtGuP408tU3-DyQLrLrXJD5D1qn-0CJXGSFFHiQlxNTnn40jkqRvg8sV_l2Tg6aV6aoialCtr7Qpl_nMpOuzk8Enx-iMBO0e4VTplYG-rc7MlPsC-vdIF0qiEuCSyRW0uayZKmViS-1oZ8FMzx9rjxS-J8dEsUOXJgDIdshnLcOSqiVJBLta9aZUTEhpUO0L7xDwZwOb3htH6eLkgjtbIL83JY_b-Rpa-wsFaLd-7mCRQ8f7MPHnxn79k0QwCDEAodIwCZ1SGPNfU_hZ-PYK5QYobxltW429fNS6mAkavwNxT4sy6Sr6DmE7jEg58eKq6DH_9QxB4cU3rjM66wMRP_v1T_8MOFGEs07R5sRjQP_kXc9gB0cNDLgobzMWK4ZMjjbkMfj_IAL_1rzWpWlAnLDfzl-XBWlWuny6kTsPy8CW3pAtmBrT2Ggku_c_H8N3BRf6NTLAduGWqqTOwpZfDhCNFpToyOOyGY7gUCegT2XqktboEJNArl0rX3IkuMVVfd0csGVzOcUo7YA8DVX6dWySd4jE91Lzl5fGi5rOA_cqXMV2OYeGDuxC4qWoWA36LFqmQc15384nbJ9gq5c2CfCYNng5vCkrQM2XmVBY3T2rm-vfbOZ_G7-fWk
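
The encoded message in this error can be decoded with `aws sts decode-authorization-message` to see which action and resource were denied. The sketch below assumes the caller has the `sts:DecodeAuthorizationMessage` permission; the function names are hypothetical.

```shell
# Extract the encoded message from an UnauthorizedOperation error read on stdin.
# Both function names are hypothetical helpers.
extract_encoded_message() {
  sed -n 's/^.*Encoded authorization failure message: *//p'
}

# Decode an encoded authorization failure message passed as the first argument.
# Assumes the caller is allowed to perform sts:DecodeAuthorizationMessage.
decode_auth_failure() {
  aws sts decode-authorization-message \
    --encoded-message "$1" \
    --query DecodedMessage \
    --output text
}

# Usage (error text saved in a file named error.txt, an assumed name):
#   decode_auth_failure "$(extract_encoded_message < error.txt)"
```

The decoded message is a JSON document describing the denied request, which should show which IAM permission the instance role is missing.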

tschaffter commented 3 years ago

I don't want to spend time at this point on enabling ENA, which is specific to AWS. Ultimately, I want metrics to be reported to ELK, independently of AWS technology. Instead, I added more generic information about "net" and "netstat".
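
For illustration, a "net" and "netstat" section of a CloudWatch agent configuration could look like the fragment printed below. This is an assumed sketch, not the actual content of cloudwatch-config.json in this branch; the function name, the eth0 interface, and the selected measurements are all illustrative choices.

```shell
# Print an illustrative CloudWatch agent config fragment that collects
# generic network metrics. Hypothetical example, not the file from this branch.
print_net_metrics_fragment() {
  cat <<'EOF'
{
  "metrics": {
    "metrics_collected": {
      "net": {
        "resources": ["eth0"],
        "measurement": ["bytes_sent", "bytes_recv", "drop_in", "drop_out"]
      },
      "netstat": {
        "measurement": ["tcp_established", "tcp_time_wait"]
      }
    }
  }
}
EOF
}

# Usage: print_net_metrics_fragment > /tmp/net-metrics-example.json
```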

I cloned this branch on the data node and ran the agent with the configuration from this branch:

sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -m ec2 -a stop
sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -a fetch-config -m ec2 -c file:/home/ubuntu/nlpsandbox-infra/cloudwatch-config.json -s
sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -m ec2 -a status
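
The status command can also be checked from a script. The sketch below assumes that the status action prints a JSON document containing a `"status": "running"` field when the agent is up; the helper name `agent_is_running` is hypothetical.

```shell
# Succeed if the agent status output read on stdin reports "running".
# Assumes the status action prints JSON with a top-level "status" field.
agent_is_running() {
  grep -Eq '"status"[[:space:]]*:[[:space:]]*"running"'
}

# Usage on the data node:
#   sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -m ec2 -a status \
#     | agent_is_running && echo "agent is running"
```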