Closed thesuhas closed 3 months ago
Thanks for reporting this, @thesuhas.
The spin_sdk.wit
code in the spin_sdk
package is not actually used at runtime -- that code is just stub bindings generated for the purposes of documentation and IDE integration. The actual code is generated by componentize-py
at build time such that each imported function calls into the host instead of raising a NotImplementedError
. That also affects the line numbers, since the function bodies are different. Sorry for the confusion.
In this case, I'm not surprised the /bookings/dwight_schrute
path doesn't work, since wasi:http
needs a full URL including a protocol and an authority. Have you tried following https://developer.fermyon.com/spin/v2/http-outbound#making-http-requests-within-an-application ? I think that's the official way to do this, but @itowlson could probably correct me if I'm mistaken.
Oh, but I just noticed that http://self
should work according to https://developer.fermyon.com/spin/v2/http-outbound#intra-application-http-requests-by-route. Hopefully @itowlson can tell us what we're missing here.
I think I have figured out what the issue is. In the rust sdk:
impl OutboundHttp {
/// Check if guest module is allowed to send request to URL, based on the list of
/// allowed hosts defined by the runtime. If the url passed in is a relative path,
/// only allow if allowed_hosts contains `self`. If the list of allowed hosts contains
/// `insecure:allow-all`, then all hosts are allowed.
/// If `None` is passed, the guest module is not allowed to send the request.
fn is_allowed(&mut self, url: &str) -> Result<bool, HttpError> {
if url.starts_with('/') {
return Ok(self.allowed_hosts.allows_relative_url(&["http", "https"]));
}
Ok(OutboundUrl::parse(url, "https")
.map(|u| self.allowed_hosts.allows(&u))
.unwrap_or_default())
}
}
That check is not done here.
I believe it can be an easy fix. Something like:
url_parsed = parse.urlparse(request.uri)
if url_parsed.netloc == '' and url_parsed.path.startswith('/'):
# If not allowed, throw error
# Else set scheme to http/https and netloc to 127.0.0.1:port
I am unable to see where in the python sdk you have the list of allowed hosts to implement this. I've changed outbound hosts to:
allowed_outbound_hosts = ["http://self", "https://self"]
and tried sending a request to http://self/users/{username}/bookings
.
The backtrace is:
From what I gather, this exception is not thrown in any python code.
@dicej Paths are meant to work. The Spin runtime translates them into self-requests (provided self
is an allowed host) by prepending the host and protocol of the original inbound request.
@thesuhas I don't think http://self
works as part of the URL - it is a magic identifier for allowed_outbound_hosts
. As you note, https://developer.fermyon.com/spin/v2/http-outbound#intra-application-http-requests-by-route describes this. The expected pseudo-URL is /api/customers
.
However, as Joel notes, if your target environment supports it, it's more efficient to use local service chaining (https://developer.fermyon.com/spin/v2/http-outbound#local-service-chaining), which does not travel over the network.
@thesuhas regarding where the error is created, it is in the Spin runtime - the OutboundHttp::is_allowed
fragment you quote is part of the Spin host, not part of the Rust SDK.
Thanks @itowlson !
But, having allowed_outbound_hosts = ["http://self", "https://self"]
and just using a url such as /api/customers
seems to be throwing an error for now.
If I would not want to switch to local service chaining, how could I fix this?
I am not sure @thesuhas. I believe it ought to work, and unless I am missing something it's a bug that it doesn't. I will have a look but I don't know much about the Python side.
Sounds like we need to add the equivalent relative URL handling @thesuhas found in the Rust SDK to the Python SDK, if I'm understanding correctly.
Actually, where did you find that code in the Rust SDK, @thesuhas ? I'm not seeing it.
It's not in the Rust SDK though @dicej. That's the Spin Outbound HTTP host component.
If I would want to add the equivalent relative URL handling and make a contribution, how would I gain access to both the allowed outbound hosts , the url and the port that spin is deploying the application on? If one could access them, this check could be added into the python sdk itself.
Actually, where did you find that code in the Rust SDK, @thesuhas ? I'm not seeing it.
I found it here: https://github.com/fermyon/spin/blob/9f1d9dbc43ae0bde8ffe3729ac9b545b7d571e15/crates/outbound-http/src/host_impl.rs#L23
Oh, actually I think this needs to be ported to the Python SDK: https://github.com/fermyon/spin-rust-sdk/blob/184946fdd3cb48de3c3af60f8e053b42947c791b/src/http/conversions.rs#L557-L560
It's not in the Rust SDK though @dicej. That's the Spin Outbound HTTP host component.
Does this mean that this is where the check happens even while using the python sdk? And does this mean spin composes an Outbound HTTP component to the spin component it builds and the flow of the request is Spin Component Created by User -> Outbound HTTP Component -> Request is sent
?
The problem is that with the current Python SDK, the relative URL requests aren't even making it to the host, since set_authority
is choking. We need to port the above code I linked to from spin-rust-sdk
to fix that.
@thesuhas It's not quite composition, but conceptually similar. The HTTP service is provided by the Spin host, and the Spin host understands the special self-request syntax and service chaining URLs. So when you make a HTTP request through any SDK, that gets passed through to the Spin host which, after validating, performs the actual request.
The outbouind hosts validation etc. has to be in the runtime rather than in the SDK, because the SDK is strictly optional. You can write a Spin component using raw component tooling and the Spin WITs. We cannot depend on the SDK for security checking.
The problem is that with the current Python SDK, the relative URL requests aren't even making it to the host, since
set_authority
is choking. We need to port the above code I linked to fromspin-rust-sdk
to fix that.
How would one port that? Currently most of the code in wit/types.py
is placeholders for componentize-py
right?
Considering https://github.com/fermyon/spin-rust-sdk/commit/9f59a5ea6d4575618bc4bbc63b6825ec63c2c341, I would assume you would check the authority in the python sdk and if not, prepend the URL?
@thesuhas It's not quite composition, but conceptually similar. The HTTP service is provided by the Spin host, and the Spin host understands the special self-request syntax and service chaining URLs. So when you make a HTTP request through any SDK, that gets passed through to the Spin host which, after validating, performs the actual request.
The outbouind hosts validation etc. has to be in the runtime rather than in the SDK, because the SDK is strictly optional. You can write a Spin component using raw component tooling and the Spin WITs. We cannot depend on the SDK for security checking.
So this means that the only fix in this case would be setting the authority correctly as then the spin host would have all the information it needs to validate and perform the request?
And in this case, authority
would be the domain name and port right? And domain name would be the URL at which the spin application is running (localhost if running locally) and port would just be the port that the spin application is listening at?
@thesuhas Looking at the Rust SDK code I think the idea is not to prepend a full URL but to inject a dummy authority consisting only of the port.
Can you try this, please?
diff --git a/src/spin_sdk/http/__init__.py b/src/spin_sdk/http/__init__.py
index dd98107..094d94a 100644
--- a/src/spin_sdk/http/__init__.py
+++ b/src/spin_sdk/http/__init__.py
@@ -174,7 +174,14 @@ async def send_async(request: Request) -> Response:
outgoing_request = OutgoingRequest(Fields.from_list(headers))
outgoing_request.set_method(method)
outgoing_request.set_scheme(scheme)
- outgoing_request.set_authority(url_parsed.netloc)
+ if url_parsed.netloc is None:
+ if scheme == "http":
+ authority = ":80"
+ else:
+ authority = ":443"
+ else:
+ authority = url_parsed.netloc
+ outgoing_request.set_authority(authority)
path_and_query = url_parsed.path
if url_parsed.query:
path_and_query += '?' + url_parsed.query
BTW, if that doesn't work we might need to check if url_parsed.netloc
is non-None
and non-empty. And I guess I should probably just test it myself :)
Can you try this, please?
diff --git a/src/spin_sdk/http/__init__.py b/src/spin_sdk/http/__init__.py index dd98107..094d94a 100644 --- a/src/spin_sdk/http/__init__.py +++ b/src/spin_sdk/http/__init__.py @@ -174,7 +174,14 @@ async def send_async(request: Request) -> Response: outgoing_request = OutgoingRequest(Fields.from_list(headers)) outgoing_request.set_method(method) outgoing_request.set_scheme(scheme) - outgoing_request.set_authority(url_parsed.netloc) + if url_parsed.netloc is None: + if scheme == "http": + authority = ":80" + else: + authority = ":443" + else: + authority = url_parsed.netloc + outgoing_request.set_authority(authority) path_and_query = url_parsed.path if url_parsed.query: path_and_query += '?' + url_parsed.query
That did not work. I got the following error:
Authority is still nothing.
On changing it to:
if url_parsed.netloc == '':
if scheme == "http":
authority = ":80"
else:
authority = ":443"
else:
authority = url_parsed.netloc
I get:
The authority is now being set but a HttpProtocolError
Ok, thanks for trying. I'll start debugging this on my side.
Oh, I wonder if scheme
is empty or None
also. I'll check it out.
Oh, I wonder if
scheme
is empty orNone
also. I'll check it out.
scheme
is empty as well. url_parsed
is : ParseResult(scheme='', netloc='', path='/bookings/dwight_schrute', params='', query='', fragment='')
Yeah, so we'll need to force scheme
to "http"
in that case.
match url_parsed.scheme:
case "http":
scheme: Scheme = Scheme_Http()
case "https":
scheme = Scheme_Https()
case "":
scheme = Scheme_Http()
case _:
scheme = Scheme_Other(url_parsed.scheme)
if request.headers.get('content-length') is None:
content_length = len(request.body) if request.body is not None else 0
request.headers['content-length'] = str(content_length)
headers = list(map(
lambda pair: (pair[0], bytes(pair[1], "utf-8")),
request.headers.items()
))
outgoing_request = OutgoingRequest(Fields.from_list(headers))
outgoing_request.set_method(method)
outgoing_request.set_scheme(scheme)
if url_parsed.netloc == '':
if scheme == "http":
authority = ":80"
else:
authority = ":443"
else:
authority = url_parsed.netloc
outgoing_request.set_authority(authority)
This works!
I have tried to create a new branch to make a PR but it seems like I do not have permissions to create a new branch. Can I make this contribution? How could I go about doing that?
Sorry, I was in a meeting. You should be able to fork the repo, push your change to a branch of that fork, and then PR to the main repo from there. If that's a pain, I can do it it myself.
Will do, thanks!
Hi, I have a function in a router that receives a request and sends a request to another service.
I get the following error: It seems like
set_authority
is throwing an error. On looking at the source code of that function, it has not been implemented.Another thing is the error says line
883
and this is not at line 883. at line 883, the function is:On adding a few logs into
spin-python-sdk
, I see the following differences between this example and my code: Example:Output of my component:
I see that both the
scheme
andnetloc
fields are empty. On changing the outbound hosts to:from:
and changing the url to point to
http://127.0.0.1:3000/users/{username}/bookings
(the default url that spin exposes the application to), it seems to be working. On changing the url tohttp://self/users/{username}/bookings
it throws a DNS not found error, ashttps://self
is not a website that exists.