httpwg / http-extensions

HTTP Extensions in progress
https://httpwg.org/http-extensions/

Divide and conquer with the existing methods is a clean alternative to the proposed HTTP QUERY method #2904

Closed rafageist closed 1 month ago

rafageist commented 1 month ago

Instead of introducing the QUERY method for complex queries, I propose a divide-and-conquer strategy using existing methods. This allows handling complex payloads without introducing new HTTP methods.

[!IMPORTANT] Complex queries are intended for sophisticated backend systems capable of processing and storing such requests. Both the proposed QUERY method and my example are meant for advanced scenarios, not simple use cases. So we are talking about the same context, the same problem and different solutions.

Here is the approach with a hypothetical example:

[!WARNING] You don't need to make two requests for each query; registering the query once is enough if you get creative. The example sends a query template that dynamically generates a URL for subsequent queries, allowing you to reuse the query results efficiently.

Tell the server that you need to make a complex query and receive a response that your query was registered.

You can optionally tell it which path you want to consume or return a UUID.

Request:

POST /query
{
  "desiredUrl": "/productos/stock/dell-apple-hp/laptops-500-1500/ratings-4plus-ram-8GB-16GB/{{page}}",
  "page": "{{page}}",
  "filters": {
    "category": "electronics",
    "subCategory": "laptops",
    "priceRange": {
      "min": 500,
      "max": 1500
    },
    "availability": "in_stock",
    "brands": ["Dell", "Apple", "HP"],
    "sort": {
      "field": "ratings",
      "order": "desc"
    },
    "attributes": {
      "screenSize": ["13-inch", "15-inch"],
      "processorType": ["Intel", "AMD"],
      "ram": ["8GB", "16GB"]
    }
  }
}

Response:

Content-type: application/json

{
  "query_uuid": "89fbf974-4565-4bf6-8e9e-e1fd8585a0dc"
}

Ask the server about the results of your request

Request:

GET /productos/stock/dell-apple-hp/laptops-500-1500/ratings-4plus-ram-8GB-16GB/1

Response:

Content-type: application/json

{
  "totalPages": 100,
  "page": 1,
  "products": [
    {
      "id": 101,
      "name": "Dell XPS 13",
      "category": "laptops",
      "price": 1200,
      "availability": "in_stock",
      "brand": "Dell",
      "rating": 4.5,
      "attributes": {
        "screenSize": "13-inch",
        "processorType": "Intel Core i7",
        "ram": "16GB",
        "storage": "512GB SSD"
      }
    },
    {
      "id": 102,
      "name": "Apple MacBook Air",
      "category": "laptops",
      "price": 1500,
      "availability": "in_stock",
      "brand": "Apple",
      "rating": 4.7,
      "attributes": {
        "screenSize": "13-inch",
        "processorType": "M1",
        "ram": "16GB",
        "storage": "512GB SSD"
      }
    },
    {
      "id": 103,
      "name": "HP Spectre x360",
      "category": "laptops",
      "price": 1400,
      "availability": "in_stock",
      "brand": "HP",
      "rating": 4.3,
      "attributes": {
        "screenSize": "15-inch",
        "processorType": "Intel Core i5",
        "ram": "8GB",
        "storage": "256GB SSD"
      }
    }
  ]
}

I hope these examples have clarified how complex queries can be handled effectively using existing methods. This approach offers the flexibility to handle sophisticated queries on advanced backends: registered queries can be pre-processed, optimized, cached, reused, rejected early with an error, and so on.
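The two-step flow described above can be sketched as a small in-memory registry. This is only an illustration under stated assumptions: the `QueryRegistry` class and its `register`/`resolve` methods are hypothetical names, no real HTTP server is involved, and the `{{page}}` placeholder is assumed to match a single path segment.

```python
import re
import uuid


class QueryRegistry:
    """Hypothetical in-memory stand-in for the server side of the POST-then-GET flow."""

    def __init__(self):
        self._routes = []  # list of (compiled pattern, filters, query_uuid)

    def register(self, body):
        """Handle `POST /query`: store the filters under the desired URL template."""
        # re.split with a capturing group yields literal text at even indices
        # and placeholder names at odd indices.
        parts = re.split(r"\{\{(\w+)\}\}", body["desiredUrl"])
        regex = "".join(
            re.escape(part) if i % 2 == 0 else f"(?P<{part}>[^/]+)"
            for i, part in enumerate(parts)
        )
        query_uuid = str(uuid.uuid4())
        self._routes.append((re.compile(f"^{regex}$"), body["filters"], query_uuid))
        return {"query_uuid": query_uuid}

    def resolve(self, path):
        """Handle `GET <path>`: return the stored filters plus the URL parameters."""
        for pattern, filters, _ in self._routes:
            match = pattern.match(path)
            if match:
                return {"filters": filters, "params": match.groupdict()}
        return None  # a real server would answer 404 here


registry = QueryRegistry()
receipt = registry.register({
    "desiredUrl": "/productos/stock/dell-apple-hp/laptops-500-1500/ratings-4plus-ram-8GB-16GB/{{page}}",
    "filters": {"category": "electronics", "brands": ["Dell", "Apple", "HP"]},
})
result = registry.resolve(
    "/productos/stock/dell-apple-hp/laptops-500-1500/ratings-4plus-ram-8GB-16GB/1"
)
```

In this sketch the `POST` happens once, and every later page is a plain `GET` against the registered path, which is the part of the idea that lets ordinary URL-keyed caching apply.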

I welcome any comments and feedback from the community. Thank you for considering my idea and I hope it has contributed to the ongoing discussion.

reschke commented 1 month ago

From a quick glance, this misses the point of QUERY - having a method that can pass data in the request body which is "safe".

If "safeness" of the request is irrelevant, then yes, you can POST.

rafageist commented 1 month ago

@reschke Thanks for the feedback! I understand the idea of "safe" in QUERY, as in GET. However, I think complex queries in the body of a request could be a symptom of poor design. HTTP methods are designed for simple, modular tasks, allowing them to be combined to solve complex problems. Creating a new method like QUERY introduces unnecessary complexity and can dilute the clarity of current methods.

If approved, we could see bad practices emerge and servers overloaded with queries that should instead be optimized at the architecture level. It could also open the door to hundreds of new methods, turning the protocol into a full application layer. This risks breaking the balance and simplicity that HTTP was built on.

Existing HTTP methods conform to the Unix principle of doing one thing well, while QUERY risks breaking that principle by being too flexible.

rafageist commented 1 month ago

> From a quick glance, this misses the point of QUERY - having a method that can pass data in the request body which is "safe".
>
> If "safeness" of the request is irrelevant, then yes, you can POST.

@reschke Re-reading your comment made me realize something important.

The "point of" QUERY cannot simply be the creation of the method itself. There must be a stronger, more meaningful reason behind its existence, something that provides unique value that cannot be achieved with current HTTP methods or through architectural improvements. Creating a new method just because you want to combine features of GET and POST is not a sufficient reason. The purpose of QUERY should clearly answer why it is needed and what its real value is.

When designing something in software engineering, or any other technology, we typically ask ourselves three questions: what, how, and why.

In this case, while QUERY answers the “what” and the “how,” I think the “why” is still unclear. What real benefit does QUERY provide that can’t be solved by better architecture or existing methods like POST or GET?

If the purpose of QUERY is to allow more complex queries by sending a body in the request (it is possible now, but not official), a simpler solution would be to simply allow bodies in GET within the HTTP specification, rather than introducing a new method. This would keep the semantics clear, avoiding the need to create a new method that could cause confusion and bad practices. In the end, the question is whether we really need a new method or whether we can adjust the existing rules to achieve the same goal.

Now ask yourself: why has no body been specified for GET?

The designers of the HTTP protocol originally decided not to allow bodies in GET because its purpose is to request resources without altering the state of the server, and bodies were considered unnecessary for that. By keeping GET bodyless, the simplicity and clarity of the protocol is preserved, avoiding complications with caching semantics and idempotence. However, it is unclear whether, in the current context, this decision could be revisited, as allowing bodies in GET could solve many problems.
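The caching complication mentioned here can be illustrated with a toy example. It assumes a deliberately naive cache (the hypothetical `NaiveCache` below) that keys entries on method and URL only and ignores any request body; real caches differ in their details, so treat this as a sketch of the concern rather than a description of any particular implementation.

```python
class NaiveCache:
    """Toy cache keyed on method and URL only; the request body is ignored."""

    def __init__(self):
        self._store = {}

    def fetch(self, method, url, body, origin):
        key = (method, url)  # assumption: the body is not part of the cache key
        if key not in self._store:
            self._store[key] = origin(body)
        return self._store[key]


def origin(body):
    """Hypothetical origin server that answers based on the request body."""
    return f"results for {body}"


cache = NaiveCache()
first = cache.fetch("GET", "/search", '{"brand": "Dell"}', origin)
second = cache.fetch("GET", "/search", '{"brand": "HP"}', origin)
# `second` is served from the entry created for the Dell query,
# even though the request asked for HP.
```

Two GET requests to the same URL with different bodies collide on the same cache entry, which is one way a body on GET can clash with existing infrastructure.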

reschke commented 1 month ago

That topic has been discussed multiple times in the past, and the result always was: don't. See https://www.rfc-editor.org/rfc/rfc9110.html#section-9.3.1-6.

As stated earlier, that's the reason why QUERY is being defined.

rafageist commented 1 month ago

> That topic has been discussed multiple times in the past, and the result always was: don't. See https://www.rfc-editor.org/rfc/rfc9110.html#section-9.3.1-6.
>
> As stated earlier, that's the reason why QUERY is being defined.

@reschke The HTTP protocol (RFC 9110) specifies that the GET method refers to a URI, and the URI specification (RFC 3986) does not impose a size limit on these identifiers. Therefore, any limitation on the length of parameters in the URL is not a deficiency of the protocol, but of the intermediate implementations (browsers, servers, proxies). Rather than changing the protocol with methods like QUERY, it would be more logical to optimize the ecosystem to handle longer query parameters or implement better solutions as this issue proposes.
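To make the URL-length point concrete, here is a rough sketch that packs the filter object from the original example into a query string and measures the result. The `/productos` path and the `filters` parameter name are assumptions for illustration. RFC 9110 recommends that senders and recipients support URIs of at least 8000 octets, but the limits actually enforced along a request path vary by implementation.

```python
import json
from urllib.parse import parse_qs, urlencode

# Filter object taken from the example request earlier in this issue.
filters = {
    "category": "electronics",
    "subCategory": "laptops",
    "priceRange": {"min": 500, "max": 1500},
    "availability": "in_stock",
    "brands": ["Dell", "Apple", "HP"],
    "sort": {"field": "ratings", "order": "desc"},
    "attributes": {
        "screenSize": ["13-inch", "15-inch"],
        "processorType": ["Intel", "AMD"],
        "ram": ["8GB", "16GB"],
    },
}

compact_json = json.dumps(filters, separators=(",", ":"))
query_string = urlencode({"filters": compact_json})  # percent-encodes the JSON
url = "/productos?" + query_string  # hypothetical endpoint

print(len(compact_json), len(url))

# The server side can recover the original structure from the query string.
decoded = json.loads(parse_qs(query_string)["filters"][0])
```

Percent-encoding makes the URL noticeably longer than the raw JSON, so nested queries approach implementation limits sooner than their body size would suggest.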

[!IMPORTANT] The HTTP QUERY proposal attempts to solve, inside the protocol, a problem that does not belong to the protocol. It is an ecosystem problem and must be solved in the ecosystem.

Therefore, the topic you reference reinforces my point even more. The reason for not allowing a body in GET is well-established, and creating a method like QUERY seems forced. You've dismissed the rest of the arguments and focused only on the lack of a body in GET, but you haven't provided a solid reason why QUERY is needed, and the proposal is vague in its rationale.

The only justification seems to be the risk of URL truncation, which is not a deficiency of the HTTP protocol itself, but of the points along the request path and the ingenuity of developers in structuring URLs and optimizing the systems they implement.

The HTTP QUERY proposal brings more problems than solutions to this world:

An article will be written and published recommending the solution given in this issue and the concerns regarding the HTTP QUERY proposal. All those possible problems will be detailed.

Thanks for your time.

martinthomson commented 1 month ago

FWIW, I disagree with this characterization in its entirety. Making the safe-ness of a request visible to intermediaries is justification enough to use QUERY over a POST, even if a POST could carry most of the same semantics if a resource were configured that way.
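What "visible to intermediaries" means in practice can be sketched as a method-based policy. The safe-method set below comes from RFC 9110, section 9.2.1; the `may_retry_automatically` function is a hypothetical intermediary rule, and the inclusion of QUERY assumes the draft's definition of the method as safe.

```python
# Safe methods as listed in RFC 9110, section 9.2.1.
SAFE_METHODS = frozenset({"GET", "HEAD", "OPTIONS", "TRACE"})

# Assumption: an intermediary that implements the QUERY draft treats it as safe.
SAFE_METHODS_WITH_QUERY = SAFE_METHODS | {"QUERY"}


def may_retry_automatically(method, safe_methods=SAFE_METHODS):
    """Hypothetical intermediary policy: only replay requests known to be safe.

    The decision uses the method token alone, because an intermediary cannot
    know how a particular resource is configured to treat POST.
    """
    return method.upper() in safe_methods


decisions = {
    "GET": may_retry_automatically("GET"),
    "POST": may_retry_automatically("POST"),
    "QUERY": may_retry_automatically("QUERY", SAFE_METHODS_WITH_QUERY),
}
```

A POST that happens to be safe on one server looks identical on the wire to any other POST, so a generic policy like this one has to treat it as unsafe.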

MikeBishop commented 1 month ago

@rafageist, I don't believe QUERY brings any of those things -- they already exist in the world. QUERY fills a gap between GET and POST for which GET-with-body and POST-but-actually-safe are currently being used as workarounds in various situations.

Your argument that everyone could just support infinitely long URIs for complex query params is true, but doesn't make the definition of this method less useful. They could have done that anyway -- instead, they've used POST, because sending structured bodies is useful.

Might I suggest that since you disagree with the premise of this work, the proper venue is the mailing list rather than an issue on the document? You're essentially arguing that the working group should un-adopt the current draft and perhaps adopt an alternative that is not yet published.

rafageist commented 1 month ago

Thanks for your feedback.

While structured bodies can be useful, I believe the current issue can still be addressed by using existing methods like POST & GET combined with multiple requests if needed. We've seen 'gaps' between methods like PUT and PATCH, or HEAD and GET, and rather than creating new methods, we combine or layer requests as necessary. Solving everything in a single request isn’t always essential, especially if a combination of methods can achieve the same outcome more effectively.

For example, GraphQL uses POST for everything because it needs to send the query in the body, but that doesn't mean that the whole body needs to be sent multiple times. Maybe you're only interested in changing small parts.


GraphQL could use the HTTP protocol well, sending a POST with a query-template first (or every time you need to change the template) to prepare future requests and then use GET for subsequent responses. It occurs to me that templates can be reused between different clients.
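One way templates could be shared between clients is to derive the template's identifier from its content, so that identical templates map to the same URL regardless of who registered them. This is only a sketch under assumptions: the `template_id` helper, the `/graphql/<id>` path layout, and the template shape are hypothetical and are not part of GraphQL or of this proposal.

```python
import hashlib
import json


def template_id(template):
    """Derive a stable identifier from the canonical JSON form of a template."""
    # Sorting keys and fixing separators makes the serialization independent
    # of key order and whitespace.
    canonical = json.dumps(template, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Two clients register the same template with different key order.
client_a = template_id({
    "query": "query($id: ID!) { product(id: $id) { name price } }",
    "variables": ["id"],
})
client_b = template_id({
    "variables": ["id"],
    "query": "query($id: ID!) { product(id: $id) { name price } }",
})

# Hypothetical URL that later GET requests would use; only the small,
# changing part travels in the query string.
get_url = f"/graphql/{client_a}?id=101"
```

Because both clients compute the same identifier, the second registration could be a no-op, and the responses to subsequent GET requests could be shared by ordinary caches.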


This approach takes advantage of the flexibility of multiple requests instead of relying on a single new method like QUERY. The focus of this issue is to break the pattern of wanting to solve the above in a single request.

And this is just one of the solutions I can think of; there may be more that are neither a GET with a body nor a "safe" POST but, quite the opposite, a GET and a POST working together and respecting HTTP.

In a general sense, I see a proposal that claims there is no existing solution for the problem it aims to solve, when in fact the problem can be solved while respecting HTTP.

If the goal is to solve it in a single request with the existing methods, then maybe that is not possible right now. But the fact that it cannot be solved in a single request today does not mean that your architecture is correct and the protocol is incomplete.

It is possible that if it cannot be solved that way, then something is wrong with your architecture. And that is the central point of this issue.