spec-first / connexion

Connexion is a modern Python web framework that makes spec-first and api-first development easy.
https://connexion.readthedocs.io/en/latest/
Apache License 2.0

Array-parameters with string-items are falsely split on every comma #1905

Open microrache opened 7 months ago

microrache commented 7 months ago

Description

When I specify an array parameter with string items using the default serialization options (style=form, explode=true), the submitted strings are split on every occurrence of a comma before being passed into my operation.

Actual behaviour

A submitted array ["a, b", "c"] is passed as ["a", "b", "c"] into the handler-method.

Expected behaviour

The strings remain untouched and the array is passed as a list of length 2 containing the values "a, b" and "c".

Steps to reproduce

Create a Connexion-app with v3.0.6 and load the following spec:

openapi: 3.0.3
info:
  title: Array-Split-Bug
  version: "1.0"

paths:
  /foo:
    get:
      parameters:
      - name: mylist
        in: query
        schema:
          type: array
          items:
            type: string
        example:
        - Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua.
        - Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet.
        - Duis autem vel eum iriure dolor in hendrerit in vulputate velit esse molestie consequat, vel illum dolore eu feugiat nulla facilisis at vero eros et accumsan et iusto odio dignissim qui blandit praesent luptatum zzril delenit augue duis dolore te feugait nulla facilisi.
      responses:
        200:
          description: successful operation

Create a method which simply returns the submitted list mylist. Start Swagger UI and submit the 3 provided example values. The response will be an array consisting of 8 entries instead of the 3 submitted values.
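The handler for this step could be as small as the following sketch. The function name get_foo is hypothetical, and the spec above would additionally need an operationId pointing at it, which is not part of the original report.

```python
def get_foo(mylist):
    """Echo the query parameter back unchanged (hypothetical handler name).

    The parsed `mylist` query parameter arrives as a keyword argument;
    returning it as-is makes any unwanted splitting visible in the response.
    """
    return mylist
```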

Additional info:

microrache commented 6 months ago

In the meantime I did a little debugging, which led me to an experiment: I changed the value of QUERY_STRING_DELIMITERS['form'] in the file uri_parsing.py from "," (comma) to a less likely string such as "|||" (three pipes) and observed that the erroneous splitting disappeared. To be honest, though, I am not very experienced with elaborate testing or with all the possible query formats, so I cannot foresee the consequences of such a modification.

Digging through the code also raised a question for me: for the form request style, I don't really understand why the URI parsing first joins the parameter array into a string, only to explode it back into an array a little later. Could anyone explain the reason behind this?
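To make the round trip concrete, here is a deliberately simplified model of it. This is a sketch for illustration only, not Connexion's actual code, and the function name join_then_split is made up. It shows why a comma delimiter splits the values, why "|||" appears to fix it, and that a value containing the new delimiter would still be split.

```python
def join_then_split(values, delimiter=","):
    """Simplified model (not Connexion's actual code) of the round trip.

    `values` holds the already-decoded occurrences of one query parameter.
    """
    joined = delimiter.join(values)   # array boundaries are lost here
    return joined.split(delimiter)    # every delimiter occurrence now splits


submitted = ["a, b", "c"]
print(join_then_split(submitted))              # ['a', ' b', 'c']
print(join_then_split(submitted, "|||"))       # ['a, b', 'c']
print(join_then_split(["a|||b", "c"], "|||"))  # ['a', 'b', 'c']
```

Note that in this model the second item keeps its leading space (' b'); only the number of items matters for the comparison with the reported behaviour.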

microrache commented 6 months ago

For anyone running into this issue, here is a workaround that defines a custom URI parser class within the app. We subclass OpenAPIURIParser and redefine the delimiter by reassigning a value in the delimiter dictionary (which is very bad practice, as this is supposed to be a constant):

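from connexion.uri_parsing import OpenAPIURIParser, QUERY_STRING_DELIMITERS

class OpenAPIURIParserMod(OpenAPIURIParser):

  def __init__(self, param_defns, body_defn):
    super().__init__(param_defns, body_defn)
    # Mutates the shared module-level dictionary, so the new delimiter
    # applies to every parser that reads it, not only to this subclass.
    QUERY_STRING_DELIMITERS['form'] = '|||'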

and in app.py:

from connexion import AsyncApp

from .parsers.OpenAPIURIParserMod import OpenAPIURIParserMod

app = AsyncApp(
  __name__,
  # any other options
  uri_parser_class=OpenAPIURIParserMod,
)

nmoreaud commented 2 months ago

We also have this problem.

nmoreaud commented 1 month ago

Do you plan to fix this bug?

cipold commented 6 days ago

We stumbled upon the same problem. It is not possible to correctly parse strings that may contain commas without modifying the delimiter. This is true even if the commas in the strings are properly encoded.

The reason is that, in _resolve_param_duplicates, OpenAPIURIParser takes the already URL-decoded values and joins them with the delimiter (here, a comma) into a new string, which is then processed further. The array information that was available up to that point is discarded.

Here is an example for the URL http://127.0.0.1:55155/myapi/demo?my_array=a&my_array=b&my_array=c%2Cd with the following endpoint definition:

  /demo:
    get:
      tags:
        - "demo"
      parameters:
        - name: my_array
          in: query
          required: true
          schema:
            type: array
            items:
              type: string
      responses:
        '204':
          description: OK
      operationId: demo_endpoint

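The effect on this URL can be sketched with the standard library alone. The join-and-split line below is a simplified stand-in for the step described above, an assumption made for illustration and not the library's actual code.

```python
from urllib.parse import urlsplit, parse_qs

url = "http://127.0.0.1:55155/myapi/demo?my_array=a&my_array=b&my_array=c%2Cd"

# The standard library decodes each occurrence separately, so the
# array structure is still intact at this point.
decoded = parse_qs(urlsplit(url).query)["my_array"]
print(decoded)  # ['a', 'b', 'c,d']

# Simplified stand-in for joining the decoded values with the delimiter
# and splitting them again (illustration only, not the library's code).
rejoined = ",".join(decoded).split(",")
print(rejoined)  # ['a', 'b', 'c', 'd']
```

Three values go in and four come out, because the encoded comma has already been decoded by the time the values are joined.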