missing input validation leads to potential data exposure or unintended data manipulation

domcyrus commented 1 month ago

The go-fastly library doesn't validate most inputs, allowing a bad actor to misuse the library and potentially gain access to data that you do not want to expose.

This simple example demonstrates how the GetService call can be altered into a GetServiceDetails call by manipulating the input.

package main

import (
    "fmt"

    "github.com/fastly/go-fastly/v9/fastly"
)

func main() {
    fc, err := fastly.NewClient("$MY_FASTLY_API_KEY")
    if err != nil {
        panic(err)
    }

    // maliscous service id
    serviceID := "$MY_FASTLY_SERVICE_ID/details?"

    service, err := fc.GetService(&fastly.GetServiceInput{
        ServiceID: serviceID,
    })
    if err != nil {
        panic(err)
    }
    fmt.Printf("Service: %+v\n", service)
}

Note that the output of the code does indeed show that GetServiceDetails is called but errors on the JSON unmarshal. Still, the error does contain most of the service details.

This vulnerability could allow attackers to craft inputs that invoke unintended API calls, leading to potential data exposure or manipulation.

domcyrus commented 1 month ago

I believe the best solution would be to refactor the calls to RawRequest in a way that makes sure that path segments are safely escaped using url.PathEscape. This could be done by introducing a new function called SafeRawRequest with a different signature:

// SafeRawRequest accepts a verb, URL, and RequestOptions struct and returns the
// constructed http.Request and any errors that occurred.
func (c *Client) SafeRawRequest(verb string, pathSegments []string, ro *RequestOptions) (*http.Request, error) {
    // Ensure we have request options.
    if ro == nil {
        ro = new(RequestOptions)
    }

    safePaths := make([]string, len(pathSegments))
    for i, pathSegment := range pathSegments {
        safePaths[i] = url.PathEscape(pathSegment)
    }
    u := url.JoinPath(c.url.String(), safePaths...)

    // Existing logic for building and returning the request
    // ...
}

kpfleming commented 1 month ago

We're struggling a bit to understand the nature of the potential exposure here. In the first sentence you've used the terms 'bad actor' and 'you', and based on my reading the 'bad actor' is someone who has received a Fastly API token and 'you' is the owner of that token.

If that's the case, then the 'bad actor' already has access to all content that is available to that token, regardless of whether they use go-fastly properly or improperly.

Can you describe a real-world attack that is possible using this mechanism, which exposes data that isn't already visible to the attacker?

domcyrus commented 1 month ago

Thanks for your response. I understand that a valid API token already gives access to data, but the main concern is when inputs come from third-party sources or external systems. In such cases, untrusted inputs can be manipulated to change the API call, exposing more data than intended.

For example, if a service ID comes from a third party, they could alter it to switch from GetService to GetServiceDetails, potentially leaking sensitive information. This is only possible because inputs are not validated or sanitized.

I noticed that the library already uses url.PathEscape in some places but not consistently. Making this validation consistent across all input handling would reduce the risk of these issues.

I hope this explains the potential issue, otherwise please let me know.

kpfleming commented 3 weeks ago

If the 'service ID comes from a third party' but the API token does not come from the same third party, then there is already a significant risk of information disclosure because the third party could provide any SID they wish and if the API token has access to that SID then the third party will be able to obtain information about it.

The API library assumes that the user of the library either owns all of the data provided to it, or has done its own sanity checking before providing the data to the library. The uses of url.pathEscape that you see in the code are not for sanitization to protect against information disclosure, they are there because the data supplied may not be usable in a URL without escaping (for example the name of an item may contain whitespace or punctuation). SIDs do not need go to through url.pathEscape because they can never contain characters which require encoding before being included in a URL.

While we would certainly consider a PR to provide code that validates SIDs to be of the proper format, there's a bit of a 'slippery slope' issue, because as soon as we start down the road of input validation we're essentially committed to reviewing every code path and implementing validation everywhere. I think it's more prudent for us to just document (if necessary) that users of this API library, and all of our other API libraries, should not pass through untrusted input directly to the library. This is the same posture that SQL interface libraries generally take, and many other libraries as well: if the source of the data is not trusted, then the receiver of the data is obligated to validate it before passing it on.

domcyrus commented 3 weeks ago

I understand your point, but I still think input validation in the library is important. Here's why:

Extra Safety: Checking inputs at multiple levels (user code and library) gives better protection against mistakes and attacks.
Prevent Errors: You said SIDs shouldn't have certain characters. Checking this in the library stops problems before they start.
Easier for Developers: Built-in checks help developers find and fix issues quicker.
Consistent Handling: If the library does the checking, all users benefit from the same level of safety.
Good Security Practice: Many security experts suggest checking inputs at all levels of a program.
Low Cost: These checks need to get implemented once and everyone gets the benefits forever.

I understand you might be worried about having to add checks everywhere. Maybe we could start with just handling what we know is better, e.g., ensuring SIDs are alphanumeric. For example, we recently ran into an issue because we accidentally had a newline in the SID. This resulted in calling the wrong API endpoint and produced a confusing golang net.http error. If the library had checked for this, it would have caught the problem early and provided a clearer error message. Would you consider a targeted improvement like this as a starting point?

kpfleming commented 3 weeks ago

Absolutely, it would be quite welcome.

domcyrus commented 2 weeks ago

@kpfleming just a friendly ping that I've added pr #543

kpfleming commented 2 weeks ago

Yep, it's known... we've just been quite busy :-)

kpfleming commented 2 weeks ago

There should be a go-fastly release tomorrow, and a CLI release tomorrow or Monday.

kpfleming commented 2 weeks ago

After further discussion prompted by some additional API endpoints being added to the package, we've decided to revert #543 and implement a different solution in #544. That solution is not specific to service IDs and won't generate explicit errors when service IDs are not in a valid format.

Instead, what will happen if "1234/details" is passed as a service ID is that the resulting URL will look like GET /service/1234%2Fdetails. The Fastly API gateway will accept this URL, separate the components, perform URL decoding, and then attempt to find a service with the ID of "1234/details". Since that service cannot exist, a 4xx-type error will be returned.

domcyrus commented 2 weeks ago

Yes I think this is better which is what I initially proposed ;)

fastly / go-fastly

missing input validation leads to potential data exposure or unintended data manipulation #541