Open lawrencegripper opened 5 years ago
I've updated this to include some more data. Looking at this work item now, I'm starting to feel like this is a rather large bit of work.
Also, does the following limitation present an issue for using Log Analytics with the solution, @marrobi?
A Log Analytics workspace is currently supported in the following regions:
West Central US
East US
West Europe
Southeast Asia
Portal disagrees.
https://docs.microsoft.com/en-us/azure/azure-monitor/insights/vminsights-onboard#log-analytics
Worth raising on the doc?
Currently dealing with timing issues while trying to install agents and other components. @lawrencegripper has suggested an approach that might get around this.
One approach would be to run the Log Agent from within a Docker container. Here is an example of doing this from Kubernetes, mounting the log files into the container so it can access them: https://github.com/lawrencegripper/azure-aks-terraform/blob/master/oms/oms.tf
So using:
wget https://raw.githubusercontent.com/Microsoft/OMS-Agent-for-Linux/master/installer/scripts/onboard_agent.sh && sh onboard_agent.sh -w <YOUR OMS WORKSPACE ID> -s <YOUR OMS WORKSPACE PRIMARY KEY>
In cloud-init logs I see:
82650K .......... .......... .......... .......... .......... 75% 119M 5s
82700K .......... .......... .......... .......... .......... 75% 214M 5s
82750K .......... .......... .......... .......... .......... 75% 157M 5s
82800K .......... .......... .......... .......... .......... 75% 167M 5s
82850K .......... .......... .......... .......... .......... 75% 189M 5s
82900K .......... .......... .......... .......... .......... 75% 25.8M 5s
82950K .......... .......... .......... .......... .......... 75% 50.6M 5s
83000K .......... .......... .......... .......... .......... 75% 3.13M 5s
83050K .......... .......... .......... .......... .......... 75% 27.4M 5s
83100K .......... .......... .......... .......... .......... 75% 29.5M 5s
83150K .......... .......... .......... .......... .......... 75% 40.4M 5s
83200K .......... .......... .......... .......... ... 75% 43.2M=14s
2019-01-22 23:01:38 (5.61 MB/s) - Read error at byte 85241040/112281798 (Connection reset by peer). Retrying.
--2019-01-22 23:01:39-- (try: 2) https://github-production-release-asset-2e65be.s3.amazonaws.com/43709699/4009ab00-e10d-11e8-9798-991dfd11b98b?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAIWNJYAX4CSVEH53A%2F20190122%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20190122T225622Z&X-Amz-Expires=300&X-Amz-Signature=2b4a640dabb7132620c23ab1ecb0732c8a8e40f6199705a50870369eb79ae6d7&X-Amz-SignedHeaders=host&actor_id=0&response-content-disposition=attachment%3B%20filename%3Domsagent-1.8.1-256.universal.x64.sh&response-content-type=application%2Foctet-stream
Connecting to github-production-release-asset-2e65be.s3.amazonaws.com (github-production-release-asset-2e65be.s3.amazonaws.com)|52.216.165.139|:443... connected.
HTTP request sent, awaiting response... 403 Forbidden
2019-01-22 23:01:39 ERROR 403: Forbidden.
When run interactively, it downloaded and installed without issue. Will troubleshoot.
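One possible reading of the log above, going only by what it shows: the redirected S3 URL carries X-Amz-Date=20190122T225622Z and X-Amz-Expires=300, so the signature would have lapsed at about 23:01:22, and the retry at 23:01:39 reused that same signed URL. If that is what happened, re-running the whole onboarding command (so that a fresh signed URL is requested) might get past the 403. A minimal sketch of that idea follows; `retry` is a hypothetical helper name, and the attempt count and delay in the usage line are arbitrary.

```shell
#!/bin/sh
# Hypothetical helper (not part of the OMS agent tooling):
# run a command up to $1 times, sleeping $2 seconds between failed attempts.
# Returns 0 on the first success, 1 if every attempt fails.
retry() {
    attempts=$1
    delay=$2
    shift 2
    n=1
    while [ "$n" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        echo "attempt $n of $attempts failed: $*" >&2
        # Only wait if another attempt is coming.
        if [ "$n" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        n=$((n + 1))
    done
    return 1
}

# Usage (placeholders as in the command above; not executed here):
#   retry 3 10 sh onboard_agent.sh -w <YOUR OMS WORKSPACE ID> -s <YOUR OMS WORKSPACE PRIMARY KEY>
```

Because each attempt starts the command from scratch, this does not depend on wget resuming a partial download against an expired URL.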
curl works, or
sudo docker run --privileged -d -v /var/run/docker.sock:/var/run/docker.sock -v /var/lib/docker/containers:/var/lib/docker/containers -e WSID="your workspace id" -e KEY="your key" -h=`hostname` -p 127.0.0.1:25225:25225 --name="omsagent" --restart=always microsoft/oms
might be better
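If the container route is taken from cloud-init, a small wrapper that refuses to start without a workspace ID and key may be easier to debug than a container that keeps restarting. This is a sketch only: `start_omsagent` and the `DOCKER` override variable are hypothetical names, the docker flags are copied unchanged from the command above, and `sudo` is dropped on the assumption that cloud-init already runs as root.

```shell
#!/bin/sh
# Hypothetical wrapper; the docker flags mirror the command quoted above.
# DOCKER can be overridden (for example DOCKER=echo) to print the command
# instead of running it.
start_omsagent() {
    wsid=$1
    key=$2
    # Fail fast if either credential is missing.
    if [ -z "$wsid" ] || [ -z "$key" ]; then
        echo "usage: start_omsagent <workspace id> <workspace key>" >&2
        return 2
    fi
    ${DOCKER:-docker} run --privileged -d \
        -v /var/run/docker.sock:/var/run/docker.sock \
        -v /var/lib/docker/containers:/var/lib/docker/containers \
        -e WSID="$wsid" -e KEY="$key" \
        -h="$(hostname)" \
        -p 127.0.0.1:25225:25225 \
        --name="omsagent" --restart=always \
        microsoft/oms
}

# Usage (placeholders, not executed here):
#   start_omsagent "your workspace id" "your key"
```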
Need to verify correct functionality including:
Have emailed people to ask about HDInsight. Ambari works, but it doesn't look like performance data is coming through to Azure.
Lack of HDInsight metrics confirmed as a known issue. Awaiting update from engineering.
Blocked until we get a fix or feedback from the product group.
This is now fixed as far as data showing goes; I witnessed it working today. A couple of performance counters are still missing, but I believe I should be able to fix that by adding the missing values to the Log Analytics data sources in the ARM template.
Nice, that's going to be useful during testing! Are we good to close this one out now?
Let me try to fix the missing counters in the image above. Will try to do that today.
Still missing data:
Awaiting response from product team.
As we start doing performance testing on the solution, insight into CPU, memory and network usage will be needed to tune and tweak it.
The easiest option looks to be enabling the container monitoring solution in Azure Monitor; however, we'll need to do some testing to make sure this fits the bill. We will also likely want to get stats from the HBase nodes, Solr and Mongo.
Links: