When performing an awslogs get operation on a log group with many streams, even if I specify the full name of the log stream I want to search, performance is either very slow or, if the group has enough streams, the command fails with a ThrottlingException error.
I've run into this when searching logs from AWS Batch. Batch uses the same log group for all jobs, /aws/batch/job, and puts the output from each job into its own stream. This means the /aws/batch/job log group ends up with a large number of streams if you use Batch a lot. However, this shouldn't be a problem if I know the log stream I want to search.
For example, if I ran
awslogs get -GS -s 1d /aws/batch/job my-job/default/309e41b6173e4bb98171fb3529a58092
where the last argument is the complete log stream name, I would have expected fast performance, since there is no need to search all log streams. However, what the code actually does is treat the stream name as a regex: it lists every log stream in the group and compares each name against the given pattern. For me this causes a ThrottlingException; in other cases where the group doesn't have quite so many streams, the problem manifests as just very slow performance.
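
For illustration, that client-side filtering amounts to something like the sketch below, using boto3 directly; the function and variable names are mine, not awslogs internals:

import re
import boto3

logs = boto3.client("logs")

def streams_matching_pattern(group, pattern):
    """Filter streams client-side: list every stream, then regex-match each name."""
    regex = re.compile(pattern)
    paginator = logs.get_paginator("describe_log_streams")
    # One DescribeLogStreams call per page of streams; a group with thousands
    # of streams needs many calls, which is where the throttling comes from.
    for page in paginator.paginate(logGroupName=group):
        for stream in page["logStreams"]:
            if regex.search(stream["logStreamName"]):
                yield stream["logStreamName"]
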
Here is the error I get when I run the above command:
Traceback (most recent call last):
  File "/Users/ajenkins/.local/bin/awslogs", line 8, in <module>
    sys.exit(main())
  File "/Users/ajenkins/.local/pipx/venvs/awslogs/lib/python3.6/site-packages/awslogs/bin.py", line 179, in main
    getattr(logs, options.func)()
  File "/Users/ajenkins/.local/pipx/venvs/awslogs/lib/python3.6/site-packages/awslogs/core.py", line 109, in list_logs
    streams = list(self._get_streams_from_pattern(self.log_group_name, self.log_stream_name))
  File "/Users/ajenkins/.local/pipx/venvs/awslogs/lib/python3.6/site-packages/awslogs/core.py", line 102, in _get_streams_from_pattern
    for stream in self.get_streams(group):
  File "/Users/ajenkins/.local/pipx/venvs/awslogs/lib/python3.6/site-packages/awslogs/core.py", line 261, in get_streams
    for page in paginator.paginate(**kwargs):
  File "/Users/ajenkins/.local/pipx/venvs/awslogs/lib/python3.6/site-packages/botocore/paginate.py", line 255, in __iter__
    response = self._make_request(current_kwargs)
  File "/Users/ajenkins/.local/pipx/venvs/awslogs/lib/python3.6/site-packages/botocore/paginate.py", line 332, in _make_request
    return self._method(**current_kwargs)
  File "/Users/ajenkins/.local/pipx/venvs/awslogs/lib/python3.6/site-packages/botocore/client.py", line 357, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/Users/ajenkins/.local/pipx/venvs/awslogs/lib/python3.6/site-packages/botocore/client.py", line 661, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.exceptions.ClientError: An error occurred (ThrottlingException) when calling the DescribeLogStreams operation (reached max retries: 4): Rate exceeded
I've just created a pull request to fix this. It adds a --stream-prefix option to awslogs get, which tells it to treat the log stream argument as a string prefix instead of a regex. The value can then be passed as the logStreamNamePrefix argument to describe_log_streams, which is much faster because the filtering happens on the AWS side. This completely fixes the problem for me.
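
Conceptually, the change lets CloudWatch Logs do the filtering. A minimal sketch of the prefix-based lookup with boto3 (again, my own function name rather than the actual PR code):

import boto3

logs = boto3.client("logs")

def streams_with_prefix(group, prefix):
    """Filter streams server-side via the logStreamNamePrefix parameter."""
    paginator = logs.get_paginator("describe_log_streams")
    # Passing the full stream name as the prefix typically returns a single
    # page containing just the matching stream, so there is no need to page
    # through the whole group and no throttling.
    for page in paginator.paginate(logGroupName=group,
                                   logStreamNamePrefix=prefix):
        for stream in page["logStreams"]:
            yield stream["logStreamName"]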