I'm having the same problem. I can access KMS and SSM properly, just not SQS.
I finally figured this out.
In order for the routes to work properly, you need to use a specific URL for the API calls, as noted in the docs. The SQS metadata hasn't been updated in a long time, so it does not have this updated URL scheme.
The solution was not clear to me originally, since the argument for the send_message method uses a URL, which I verified was in the proper format. But the URL in question is the one the API call is sent to; the queue URL is just part of the API call's params.
So the fix is to override the endpoint_url when creating your client/resource.
import boto3
import json

session = boto3.Session()
sqs_client = session.client(
    service_name='sqs',
    endpoint_url='https://sqs.us-east-1.amazonaws.com',
)
sqs_client.send_message(
    QueueUrl='https://sqs.us-east-1.amazonaws.com/...',
    MessageBody=json.dumps('my payload'),
)
So the reason we use the alternate endpoint style is to support Python 2.6, which does not support SNI; SNI is required for the new endpoints. We would need to drop support for Python 2.6-2.7.8. Even then it would still be a breaking change, because people have whitelists for particular URLs, and changing what we use would break them.
One possibility in the short term is to add a configuration setting to switch over to the new endpoints.
That makes sense. I don't necessarily think configuration would be better since the user would still need to know about the configuration options.
A warning in the docs would be a good start, perhaps at the top of the page and each relevant section. I looked at the docs several times for a clue when I was working through this - that would have likely resolved it quickly.
Another idea would be to log warnings if the user is on py2.7.8+, is using a new-style URL for the queue_url, and has not set the endpoint_url.
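Something like this hypothetical check; the names and the heuristic are purely illustrative, not anything botocore actually ships:
import re
import ssl
import warnings

# Matches the modern sqs.<region> queue URL style.
NEW_STYLE_QUEUE_URL = re.compile(r'^https://sqs\.[\w-]+\.amazonaws\.com/')

def warn_on_endpoint_mismatch(queue_url, endpoint_url):
    # Only meaningful on SNI-capable Pythons (2.7.9+/3.x), where the
    # modern endpoints would actually work if the client used them.
    if ssl.HAS_SNI and endpoint_url is None and NEW_STYLE_QUEUE_URL.match(queue_url):
        warnings.warn(
            'Queue URL uses the sqs.<region> host, but the request will be '
            'sent to the legacy <region>.queue host; consider passing '
            'endpoint_url explicitly when creating the client.'
        )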
Thanks for following up!
Any updates or plans for tackling this issue? We're stuck on older versions of boto3 so we can work with SQS inside our VPCs.
@SteveByerly thanks much for https://github.com/boto/boto3/issues/1900#issuecomment-471047309
And I think a warning in the docs/logs would be good.
@SteveByerly, you're my hero.
Second that. The docs absolutely do not cover this (seems to apply to sqs only) and I burned 8 hours trying to figure it out.
To add to the observation: it doesn't even seem to be consistent across regions. I had the same code with the same setup working in one region but failing in another, which sent me off investigating networking problems.
Overriding the endpoint URL works in both regions, but the default sqs_client = boto3.client('sqs') works in only one. Real head scratcher, imma tell you.
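If it helps anyone reproduce the region difference, here's a quick way to see which host boto3 will actually use per region (the regions below are just examples):
import boto3

# Print the endpoint each regional client resolves to; the legacy
# <region>.queue.amazonaws.com form is the one that breaks VPC endpoints.
for region in ('us-east-1', 'eu-central-1'):
    client = boto3.client('sqs', region_name=region)
    print(region, client.meta.endpoint_url)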
The proposed solution with the additional endpoint_url doesn't seem to solve the problem in our case. Just to be sure: it is the same hostname as the queue URL, without the path, etc.?
So given the QueueUrl https://sqs.eu-central-1.amazonaws.com/1234567/queue-name, the endpoint_url would be https://sqs.eu-central-1.amazonaws.com?
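(Assuming standard AWS queue URLs, one quick way to double-check is to derive the endpoint from the queue URL itself, i.e. scheme plus host with the path stripped:)
from urllib.parse import urlparse

queue_url = 'https://sqs.eu-central-1.amazonaws.com/1234567/queue-name'
parsed = urlparse(queue_url)
endpoint_url = f'{parsed.scheme}://{parsed.netloc}'
print(endpoint_url)  # -> https://sqs.eu-central-1.amazonaws.com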
To avoid confusion, a quick follow-up: our problem was related to the Lambda not having access rights to the public SQS endpoint. After fixing that, simply using sqs_client = boto3.client('sqs') worked as expected.
Any updates on this one?
I'm trying to run SQS and Celery in AWS with a VPC endpoint (no NAT gateways). Celery initializes the boto3 client with default parameters, and it's not possible to modify the boto3 client initialization code to set the endpoint_url parameter to the right URL.
I checked that sending a message directly with boto3 while setting endpoint_url works, but with Celery the connection times out, because it tries to connect using the default (legacy) endpoint, which is not supported with VPC endpoints.
AWS ref: https://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/sqs-sending-messages-from-vpc.html
@dt-kylecrayne I'm having the same issue, which boto3 version is working for you with SQS inside your VPCs? Thanks
I found the following workaround, overriding boto settings in endpoints.json:
1. Copy .venv/lib/python3.8/site-packages/botocore/data/endpoints.json to a known path inside a directory/ (your path may be different depending on where boto is installed).
2. Replace "queue.{dnsSuffix}" with "sqs.{region}.{dnsSuffix}". This will modify the endpoint URL format.
3. In "protocols" : [ "http", "https" ], remove "http". SQS VPC endpoints only work through https.
4. Set AWS_DATA_PATH=/directory/containing/your/file/ to tell boto to get settings from there first.
I hope this helps someone else until this gets fixed.
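A minimal sketch of scripting those steps, assuming a botocore version that still resolves endpoints from endpoints.json and honors AWS_DATA_PATH; the override directory is illustrative, and this variant drops sslCommonName entirely so the resolver falls back to the modern hostname:
import json
import os

import botocore

# Illustrative override directory; use any path you control.
data_dir = '/opt/boto-overrides'
os.makedirs(data_dir, exist_ok=True)

src = os.path.join(os.path.dirname(botocore.__file__), 'data', 'endpoints.json')
with open(src) as f:
    endpoints = json.load(f)

# Rewrite the SQS entry in every partition: drop the legacy
# sslCommonName (so the sqs.{region} hostname wins) and allow
# https only, mirroring steps 2 and 3 above.
for partition in endpoints.get('partitions', []):
    sqs = partition.get('services', {}).get('sqs')
    if not sqs:
        continue
    sqs.setdefault('defaults', {})['protocols'] = ['https']
    sqs['defaults'].pop('sslCommonName', None)
    for endpoint in sqs.get('endpoints', {}).values():
        endpoint.pop('sslCommonName', None)

with open(os.path.join(data_dir, 'endpoints.json'), 'w') as f:
    json.dump(endpoints, f)

# Must be set before the first Session/client is created (step 4).
os.environ['AWS_DATA_PATH'] = data_dir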
This would be quite simple to fix within botocore. The offending line is 467 in client.py. A simple check for the Python version, or for ssl.HAS_SNI, to choose either the sslCommonName or the hostname should do it. Currently this line simply chooses sslCommonName if it exists, and hostname otherwise. For SQS and a couple of other services, the sslCommonName always exists in current botocore.
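In other words, something along these lines (a sketch of the suggested check, not botocore's actual code):
import ssl

def choose_endpoint_host(endpoint_config):
    # Prefer the modern hostname whenever this Python supports SNI;
    # only fall back to sslCommonName on old, SNI-less builds.
    if ssl.HAS_SNI:
        return endpoint_config['hostname']
    return endpoint_config.get('sslCommonName', endpoint_config['hostname'])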
Until this gets fixed (as I said, should be simple), I've created a microlibrary that implements a variation of the solution that @marianobrc indicated directly above. You can find this here - https://pypi.org/project/awsserviceendpoints/
Any updates on a fix for this?
This also results in mismatched data between the CLI and boto3 API usage: the CLI somehow knows to use the correct endpoint (sqs.<region>), but boto3 doesn't and uses the legacy one. When querying the queue URL, the service returns it based on the host that was accessed, so now we have data inconsistencies as well because of this.
❯ aws sqs list-queues
{
"QueueUrls": [
"https://sqs.us-east-2.amazonaws.com/123456785098/assetdb-ftest-cvKP",
"https://sqs.us-east-2.amazonaws.com/123456785098/dev_policy_deploys",
"https://sqs.us-east-2.amazonaws.com/123456785098/dev_policy_deploys_dlq",
"https://sqs.us-east-2.amazonaws.com/123456785098/local-assetdb",
"https://sqs.us-east-2.amazonaws.com/123456785098/test",
"https://sqs.us-east-2.amazonaws.com/123456785098/test2",
"https://sqs.us-east-2.amazonaws.com/123456785098/test3",
"https://sqs.us-east-2.amazonaws.com/123456785098/test4",
"https://sqs.us-east-2.amazonaws.com/123456785098/test5"
]
}
❯ python
Python 3.10.0 (default, Oct 5 2021, 06:12:41) [GCC 9.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import boto3
>>> import pprint
>>> pprint.pprint(boto3.client('sqs').list_queues())
{'QueueUrls': ['https://us-east-2.queue.amazonaws.com/123456785098/assetdb-ftest-cvKP',
'https://us-east-2.queue.amazonaws.com/123456785098/dev_policy_deploys',
'https://us-east-2.queue.amazonaws.com/123456785098/dev_policy_deploys_dlq',
'https://us-east-2.queue.amazonaws.com/123456785098/local-assetdb',
'https://us-east-2.queue.amazonaws.com/123456785098/test',
'https://us-east-2.queue.amazonaws.com/123456785098/test2',
'https://us-east-2.queue.amazonaws.com/123456785098/test3',
'https://us-east-2.queue.amazonaws.com/123456785098/test4',
'https://us-east-2.queue.amazonaws.com/123456785098/test5'],
'ResponseMetadata': {'HTTPHeaders': {'content-length': '989',
'content-type': 'text/xml',
'date': 'Thu, 18 Nov 2021 13:04:54 GMT',
'x-amzn-requestid': '554b37b9-02bd-5e12-ad5a-6da9530bfb45'},
'HTTPStatusCode': 200,
'RequestId': '554b37b9-02bd-5e12-ad5a-6da9530bfb45',
'RetryAttempts': 0}}
It feels like madness to me that the SDK is forcing all its users to work around it.
Is there a sane default configuration that doesn't require manually passing in the endpoint? I.e., how is the awscli doing the right thing?
Can we get an environment flag similar to the STS regional endpoints one?
Resolved the issue by putting the Lambda function in a private subnet and allowing internet access using a NAT gateway.
VPC -> create private subnets -> create NAT gateway in public subnet -> attach private subnets to NAT gateway -> update the VPC setting in the Lambda configuration.
session = boto3.Session(region_name="ca-central-1")
sqs = session.client(service_name='sqs', endpoint_url='https://sqs.ca-central-1.amazonaws.com')
I have had a Lambda function sending messages to an SQS queue configured with a VPC; it had been working normally for several months, but now, out of nowhere, no messages are sent and the function times out. The Lambda function is in a private subnet.
Changing the security group ingress rules to allow all traffic works. Previously the configuration allowed access through ports 22 and 2049; which port should be added for SQS queues to work correctly?
Same thing happened to me. Lambda running with the VPC set up, and an endpoint created so the resources within the VPC can access SQS endpoints. All working fine for years. Suddenly the Lambdas started to time out and couldn't resolve SQS endpoints. Opened the doors as @sejr1996 mentioned as a last resort, and it worked for now.
This issue has been addressed — you can test by running:
import boto3
session = boto3.Session()
boto3.set_stream_logger('')
sqs_client = session.client(
    service_name='sqs',
    region_name='us-east-1'
)
response = sqs_client.list_queues()
print(response)
And see in the logs that it resolves to the correct:
Endpoint provider result: https://sqs.us-east-1.amazonaws.com
Please update to a newer version of Boto3 for access to the latest functionality. The most recent version is 1.34.125 per the CHANGELOG. And note that Python 3.8+ is required.
SQS endpoints for reference: https://docs.aws.amazon.com/general/latest/gr/sqs-service.html. If you want to use a custom or legacy endpoint, you could set the service-specific endpoint variable AWS_ENDPOINT_URL_SQS to the value you need.
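For example, a quick sanity check (assuming a botocore recent enough to honor the service-specific endpoint variables):
# export AWS_ENDPOINT_URL_SQS=https://sqs.us-east-1.amazonaws.com
import boto3

client = boto3.client('sqs', region_name='us-east-1')
# Reflects AWS_ENDPOINT_URL_SQS when the variable is set.
print(client.meta.endpoint_url)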
This issue is now closed. Comments on closed issues are hard for our team to see. If you need more assistance, please open a new issue that references this one.
When I try to send an SQS message from a Lambda in a VPC, I get a timeout. I tried using the VPC Link, but it doesn't work. { "errorMessage": "2019-03-07T13:45:11.739Z 7cb1fd0f-7b84-4fcd-8775-01f0f374a0a9 Task timed out after 15.01 seconds" }
SG outbound is all open, and the NACL too. I already created the VPC Link.
Function Logs: [INFO] 2019-03-07T13:44:56.744Z 7cb1fd0f-7b84-4fcd-8775-01f0f374a0a9 Start with Hash: 1111114502ff8532d063b9d988e2406a [INFO] 2019-03-07T13:44:56.744Z 7cb1fd0f-7b84-4fcd-8775-01f0f374a0a9 msgData: {'msgBody': 'Howdy @ 2019-03-07 13:44:56', 'msgAttributes': {'hash': {'StringValue': '1111114502ff8532d063b9d988e2406a', 'DataType': 'String'}}} 2019-03-07 13:45:11.739 7cb1fd0f-7b84-4fcd-8775-01f0f374a0a9 Task timed out after 15.01 seconds
If I remove the VPC, everything works fine... But I need this function working inside a VPC. Anyone help me, please T_T