Closed nevosial closed 5 years ago
Is this the full traceback? Could you post the section of your code that is getting this?
Using the CLI
$ flowlogs_reader "my-LogGroup" findip
^CTraceback (most recent call last):
File "/Library/Frameworks/Python.framework/Versions/3.6/bin/flowlogs_reader", line 11, in <module>
sys.exit(main())
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/flowlogs_reader/__main__.py", line 235, in main
action_method(reader, *args.action[1:])
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/flowlogs_reader/__main__.py", line 69, in action_findip
for record in reader:
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/flowlogs_reader/flowlogs_reader.py", line 193, in __next__
return next(self.iterator)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/flowlogs_reader/flowlogs_reader.py", line 202, in _reader
for event in self._read_streams():
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/flowlogs_reader/flowlogs_reader.py", line 248, in _read_streams
for page in response_iterator:
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/botocore/paginate.py", line 255, in __iter__
response = self._make_request(current_kwargs)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/botocore/paginate.py", line 332, in _make_request
return self._method(**current_kwargs)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/botocore/client.py", line 320, in _api_call
return self._make_api_call(operation_name, kwargs)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/botocore/client.py", line 611, in _make_api_call
operation_model, request_dict)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/botocore/endpoint.py", line 102, in make_request
return self._send_request(request_dict, operation_model)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/botocore/endpoint.py", line 147, in _send_request
request, operation_model, attempts)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/botocore/endpoint.py", line 179, in _get_response
http_response = self._send(request)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/botocore/endpoint.py", line 223, in _send
return self.http_session.send(request)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/botocore/httpsession.py", line 272, in send
http_response.content
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/botocore/awsrequest.py", line 587, in content
self._content = bytes().join(self.raw.stream()) or bytes()
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/urllib3/response.py", line 494, in stream
data = self.read(amt=amt, decode_content=decode_content)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/urllib3/response.py", line 442, in read
data = self._fp.read(amt)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/http/client.py", line 449, in read
n = self.readinto(b)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/http/client.py", line 493, in readinto
n = self.fp.readinto(b)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/socket.py", line 586, in readinto
return self._sock.recv_into(b)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/ssl.py", line 1012, in recv_into
return self.read(nbytes, buffer)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/ssl.py", line 874, in read
return self._sslobj.read(len, buffer)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/ssl.py", line 631, in read
v = self._sslobj.read(len, buffer)
KeyboardInterrupt
Did the original exception occur after you pressed Ctrl+C? I expect that this is the actual cause of the error message shown.
Also, can you try using the --start-time
filter to limit the results?
I ran the example using the python library.
start = datetime.datetime(2018,12,3,21,10,0)
end = datetime.datetime(2018,12,3,21,30,0)
flow_logs_reader = FlowLogsReader(log_group_name="myLogGroup",boto_client_kwargs=my_map, start_time=start, end_time=end)
print(len(list(flow_logs_reader)))
I get the same error after waiting for more than 15 mins and then stopping the execution in pycharm.(have attached the full error log)
Thanks for the details.
What about a really short time window - like one minute. Can you get results to come back from that?
Thanks. I did try it for 1 and 2 minute intervals; i get the same error.
start = datetime.datetime(2018,12,3,21,11,0) end = datetime.datetime(2018,12,3,21,12,0)
Let's take this library out of the loop.
Install the AWS CLI: pip install awscli
Make the same query with it:
$ aws --region="some-region" logs filter-log-events --log-group-name="some-group" --start-time="1544040000" --end-time="1544040060"
How long does that take?
Thanks. Without the lib the response is quick. Above command gave an output within 5 seconds.
OK, that's interesting. Could you tell me what versions of botocore
and boto3
you are using? You can see that this library is using the very same API calls.
Yes i see that but I'm not sure if i'm missing something here,
boto3 - 1.9.59 botocore - 1.12.59
from flowlogs_reader import FlowLogsReader
import datetime, time
import boto3
client = boto3.client('logs')
my_map = {'aws_access_key_id':'mykey','aws_secret_access_key':'myaccess', 'region_name':'my-region-1'}
start = datetime.datetime(2018,12,3,21,10,0)
end = datetime.datetime(2018,12,3,21,11,0)
flow_logs_reader = FlowLogsReader(log_group_name="myLogGroup", boto_client_kwargs=my_map, start_time=start, end_time=end, boto_client=client)
print(list(flow_logs_reader))
Hmm, could you delete the client = boto3.client('logs')
bit? As you can see here, boto_client_kwargs
are not read if a client is supplied. So your key and such aren't being read.
Also: when you're pasting in code snippets, please use ``` instead of ` to make sure the lines are rendered properly.
Yes, Thank you. removing the client variable doesn't help. i see the same error.
I tried updating my
my_map = {'service_name':'logs', 'aws_access_key_id':'mykey','aws_secret_access_key':'myaccess', 'region_name':'my-region-1'}
this update doesn't help as well.
Let's take this library out of the loop again:
Can you run something like this?
import boto3
session = boto3.Session(
aws_access_key_id='...',
aws_secret_access_key='...',
aws_session_token='...',
)
client = session.client('logs', region_name='...')
resp = client.filter_log_events(logGroupName='...'):
print(len(resp['events']))
My suspicion is that your boto3
client isn't actually being initialized with the right information.
If i do not use the library, i can access and get back the desired result. As said before as well as trying the above code snippet gives me the correct result.
client = boto3.client('logs')
resp = client.filter_log_events(logGroupName="my_group", limit=10)
print(resp['events'])
Also gives me an output.
There must be an answer here. The library uses the get_paginator
instead of calling filter_log_events
directly (here).
So the next test would be to do:
paginator = client.get_paginator('filter_log_events')
response_iterator = paginator.paginate(
logGroupName='my_group',
interleaved=True,
)
for page in response_iterator:
for event in page['events']:
print(event)
assert False
If that doesn't work, try leaving the interleaved
bit out?
The get_paginator
approach (above) works too with and without the interleaved
bit !
We're missing something, I'm sure.
Start with the client that is working manually. Now do:
# Ensure the standard client is working
standard_flow = client.filter_log_events(logGroupName='flowlog_group')['events'][0]
print(standard_flow)
# Attach the standard client to the library and retrieve a flow
reader = FlowLogsReader('flowlog_group')
reader.boto_client = client
library_flow = next(iter(reader)).to_dict()
print(library_flow)
This attaches the working boto client to the reader. Then the reader does exactly what we just did.
What's that give?
The above snippet also works and provides the output. I added the start_time and end_time filter to the above reader; which works fine as well.
Well that's a relief.
This should do the same thing:
# Ensure the standard client is working
standard_flow = client.filter_log_events(logGroupName='flowlog_group')['events'][0]
print(standard_flow)
# Attach the standard client to the library and retrieve a flow
reader = FlowLogsReader('flowlog_group', boto_client=client)
library_flow = next(iter(reader)).to_dict()
print(library_flow)
Does it? If not, could you do print(id(client))
and print(id(reader.boto_client))
?
Actually, this is what i tried previously along with the start_time and end_time. This works and I see that the client id in both cases are the same.
OK, so when used as a library everything is working as expected?
Does the CLI tool, without the findip
, produce output?
$ flowlogs_reader --region="region-name" "flowlog_group"
The CLI tool without findip
param works fine.
Not sure if i understand right, the library doesn't work as expected. The following code still doesn't work (here i'm setting the session explicitly),
import boto3
session = boto3.Session(
aws_access_key_id='...',
aws_secret_access_key='...',
aws_session_token='...',
)
client = session.client('logs', region_name='my-region-1')
start = datetime.datetime(2018,12,3,21,10,0)
end = datetime.datetime(2018,12,3,21,11,0)
flow_logs_reader = FlowLogsReader(log_group_name, boto_client=client, start_time=start, end_time=end)
print(len(list(flow_logs_reader)))
The error is the same as seen in the error_log.txt
If you do what we said above, that works, right?
reader = FlowLogsReader('flowlog_group', boto_client=client)
library_flow = next(iter(reader)).to_dict()
print(library_flow)
That is retrieving one flow.
That's a lot different from print(len(list(flow_logs_reader)))
, which is retrieving every flow. If you change it to:
for flow in flow_logs_reader:
print(flow)
then that probably prints a ton of stuff and takes a long time, right?
For those finding this issue from Google: the No response due to getresponse() got an unexpected keyword argument 'buffering'
thing is actually being caused by another exception.
In this case it's a KeyboardInterrupt
or a connection timeout, but for other code it may be something else. Look to what's calling urllib3
!
For others having an issue pulling in large amounts of flow log data from AWS CloudTrail in a reasonable amount of time, (a) consider using the newer S3 destination (which you can read from with this library), or (b) consider setting up a Kinesis subscription to your CWL group and using this library.
The flow_logs_reader hangs while trying to retrieve large result dataset.
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/urllib3/connectionpool.py", line 377, in _make_request httplib_response = conn.getresponse(buffering=True) TypeError: getresponse() got an unexpected keyword argument 'buffering'