Closed — BramComyn closed this 1 month ago
Currently, I have remembered/thought of the following:
My suggestion would be to allow setting an expectation on the size of the response, or to check via a HEAD request whether a `Content-Length` header is present. For streaming data, I would suggest reading chunk by chunk and checking after each chunk whether it is still worthwhile to keep fetching from that resource. This could likewise be driven by an expectation or limit on the resource size.
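The two ideas above could be sketched roughly as follows. This is only an illustration in Python, assuming a dict-like headers object and a file-like body stream; the function names and the `max_bytes` parameter are my own, not part of any existing API:

```python
class ResponseTooLarge(Exception):
    """Raised when a body exceeds the caller's size expectation."""

def content_length_ok(headers, max_bytes):
    """Pre-flight check (e.g. on a HEAD response): reject up front when a
    Content-Length header is present and already over the limit."""
    value = headers.get("Content-Length")
    return value is None or int(value) <= max_bytes

def read_capped(stream, max_bytes, chunk_size=8192):
    """Read a (possibly streaming) body chunk by chunk and abort as soon
    as the limit is exceeded, instead of buffering the whole response."""
    chunks, total = [], 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        total += len(chunk)
        if total > max_bytes:
            raise ResponseTooLarge(f"body exceeded {max_bytes} bytes")
        chunks.append(chunk)
    return b"".join(chunks)
```

Note that a `Content-Length` header is advisory at best, so the chunk-level cap would still be needed even when the pre-flight check passes.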
This is, in general, the problem I have been focusing on. I am confident this is not a problem for non-streaming data over HTTP/1.1, and I suspect it cannot pose a large problem for non-streaming data over HTTP/2 either.
My intuition says that we could make use of the built-in time-outs, but those only count time during which the session or stream is inactive. This means there is no built-in time-out for a stream or session over which a 1 GB file is being transferred at 1 bit per second.
If we interpret "taking too long to reach the client" as the first bit taking too long to arrive, then the built-in time-outs already handle this for us.
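The missing piece, then, is a wall-clock budget over the whole transfer rather than over inactivity. A minimal sketch in Python (the function name and `deadline_s` parameter are assumptions for illustration, not an existing API):

```python
import time

class TransferDeadlineExceeded(Exception):
    """Raised when the whole transfer takes longer than its budget."""

def read_with_deadline(stream, deadline_s, chunk_size=8192):
    """Enforce a wall-clock budget over the *entire* transfer.
    Inactivity time-outs never fire while bytes keep trickling in, so a
    1 GB body arriving at 1 bit per second would otherwise run forever."""
    start = time.monotonic()
    chunks = []
    while True:
        # Check the total elapsed time before every read, not just idle time.
        if time.monotonic() - start > deadline_s:
            raise TransferDeadlineExceeded(
                f"transfer exceeded {deadline_s} seconds")
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        chunks.append(chunk)
    return b"".join(chunks)
```

In practice this would be combined with a per-read inactivity time-out, since a single blocking `read` can itself stall indefinitely.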
One possible option I see is to limit the scope of the followed redirects. For example, we could choose not to follow a redirect if it leads to a different domain, or to an IP address not associated with the current domain. This does, however, pose a problem for discovering additional contexts in RDF data.
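A same-origin redirect policy of this kind could look like the sketch below. The function name is hypothetical, and the IP-association check mentioned above is left out, since it would require a DNS lookup:

```python
from urllib.parse import urlparse

def redirect_allowed(current_url, redirect_url):
    """Sketch of a conservative redirect policy: only follow a redirect
    that stays on the same scheme and host. As noted above, this is too
    strict for RDF context discovery, so a real deployment would likely
    relax it with an allow-list of trusted domains."""
    cur, new = urlparse(current_url), urlparse(redirect_url)
    return (cur.scheme, cur.hostname) == (new.scheme, new.hostname)
```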
Closing for now. Will probably come back later.
I will start looking for more possible security vulnerabilities that can occur when performing ordinary GET requests during resource querying.