OpenAPITools / openapi-generator

OpenAPI Generator allows generation of API client libraries (SDK generation), server stubs, documentation and configuration automatically given an OpenAPI Spec (v2, v3)
https://openapi-generator.tech
Apache License 2.0
21.94k stars 6.59k forks source link

[BUG][Rust][client][reqwest] Wrong representation of binary body (octet-stream) in request/response #18117

Open DDtKey opened 8 months ago

DDtKey commented 8 months ago

Bug Report Checklist

Description

Currently any binaries in generate client for rust (with reqwest library at least) uses std::path::PathBuf. And several issues here:

E.g for request it generates the following code:

local_var_req_builder.json(&body);   // assumes it's json content

And for response it's even worse:

let local_var_content = local_var_resp.text().await?; // reads all content to `String`
// ... omitted 
serde_json::from_str(&local_var_content).map_err(Error::from) // it tries to de-serialize `PathBuf` from `String`
// moreover, headers not accessible, but it can be useful to check them before starting to consume the body
openapi-generator version

7.4.0/7.3.0 and likely earlier versions

OpenAPI declaration file content or url
openapi: 3.0.1
info:
  title: My title
  description: My description
  version: 1.0.0
paths:
  /upload:
    post:
      requestBody:
        required: true
        content:
          application/octet-stream:
            schema:
              type: string
              format: binary
      responses:
        '200':
          description: 'OK'
  /download:
    get:
      responses:
        '200':
          description: 'OK'
          content:
            application/octet-stream:
              schema:
                type: string
                format: binary
          headers: 
            Content-Length: 
              schema: 
                type: integer
            Content-Range:
              required: true
              schema:
                type: string
Generation Details

No special configs, just generate -i ./test.yaml -g rust --package-name stream-test -o ./out --library reqwest

Steps to reproduce
  1. Take provided openapi schema
  2. generate a client for rust using reqwest library
Related issues/PRs
Suggest a fix

First thought could be to use reqwest::Body - but it doesn't cover streaming of response (at least for reqwest 0.11.x). Only Response itself can be converted with bytes_stream. As well as into bytes.

But it's possible to wrap a Stream to reqwest::Body.

So I think it would be correct to go with this way:

Gaelik-git commented 6 months ago

What do you think about first making a MVP that would work with Vec<u8> ?

We could have the generated code looking like for the download ?

pub async fn download_get(configuration: &configuration::Configuration, ) -> Result<Vec<u8>, Error<DownloadGetError>> {
    let local_var_configuration = configuration;

    let local_var_client = &local_var_configuration.client;

    let local_var_uri_str = format!("{}/download", local_var_configuration.base_path);
    let mut local_var_req_builder = local_var_client.request(reqwest::Method::GET, local_var_uri_str.as_str());

    if let Some(ref local_var_user_agent) = local_var_configuration.user_agent {
        local_var_req_builder = local_var_req_builder.header(reqwest::header::USER_AGENT, local_var_user_agent.clone());
    }

    let local_var_req = local_var_req_builder.build()?;
    let local_var_resp = local_var_client.execute(local_var_req).await?;

    let local_var_status = local_var_resp.status();

    if !local_var_status.is_client_error() && !local_var_status.is_server_error() {
        let bytes = local_var_resp.bytes().await?;
        Ok(bytes.into_iter().collect())
    } else {
        let local_var_content = local_var_resp.text().await?;
        let local_var_entity: Option<DownloadGetError> = serde_json::from_str(&local_var_content).ok();
        let local_var_error = ResponseContent { status: local_var_status, content: local_var_content, entity: local_var_entity };
        Err(Error::ResponseError(local_var_error))
    }
}
DDtKey commented 6 months ago

That's definitely much better than current implementation - it will work at least. So it could be a first iteration, but doesn't solve the issue.

It worth mentioning, that it generates not really secure client - .bytes().await?; and .text().await?; may cause OOM if stream is large enough.

Also, starting with Vec<u8> will tie us to this for a while, and in order to support streaming we will need to either provide a configuration flag (to preserve compatibility) or claim as a braking change and release only with the next major version of openapi-generator

Gaelik-git commented 6 months ago

Wouldn't the text() return an error if the content of the stream is not valid utf-8 ?

Should we try to keep the same types for reqwest and hyper ? I think neither support this now

Also we have to make sure it also work when supportAsync is disabled ?

DDtKey commented 6 months ago

Wouldn't the text() return an error if the content of the stream is not valid utf-8 ?

Yes, but that could be just a valid utf-8 file(e.g CSV) or even compromised data. I'm just saying both methods (bytes & text) could cause OOM if they're called automatically.

Should we try to keep the same types for reqwest and hyper ? I think neither support this now

That's a good question, I do not think we have to, because these clients are in general operate with different types.

Also we have to make sure it also work when supportAsync is disabled ?

Yes, that's also fair point. And here we don't have many options, that's why returning reqwest::Response might be a good solution (user can decide what to use, Vec, Stream or copy_to). And, in general, similar approach is applicable for hyper.

I just find this more flexible, however it can be confusing why Response is returned after handling its possible error codes and etc. So probably some custom wrapper still makes sense

Gaelik-git commented 6 months ago

I get why we would like to return reqwest::Response, I tried it and it looks ok as it gives the user the ability to do what they want. I not very familiar with the project though, I had to change the typeMapping for the type file and binary to reqwest::Response

typeMapping.put("file", "reqwest::Response");
typeMapping.put("binary", "reqwest::Response");

The thing is that now this type is also expected when you want to send file or binary data

This is the produced code

pub async fn upload_post(configuration: &configuration::Configuration, body: reqwest::Response) -> Result<(), Error<UploadPostError>> {
    let local_var_configuration = configuration;

    let local_var_client = &local_var_configuration.client;

    let local_var_uri_str = format!("{}/upload", local_var_configuration.base_path);
    let mut local_var_req_builder = local_var_client.request(reqwest::Method::POST, local_var_uri_str.as_str());

    if let Some(ref local_var_user_agent) = local_var_configuration.user_agent {
        local_var_req_builder = local_var_req_builder.header(reqwest::header::USER_AGENT, local_var_user_agent.clone());
    }
    local_var_req_builder = local_var_req_builder.body(body);

    let local_var_req = local_var_req_builder.build()?;
    let local_var_resp = local_var_client.execute(local_var_req).await?;

    let local_var_status = local_var_resp.status();

    if !local_var_status.is_client_error() && !local_var_status.is_server_error() {
        Ok(())
    } else {
        let local_var_content = local_var_resp.text().await?;
        let local_var_entity: Option<UploadPostError> = serde_json::from_str(&local_var_content).ok();
        let local_var_error = ResponseContent { status: local_var_status, content: local_var_content, entity: local_var_entity };
        Err(Error::ResponseError(local_var_error))
    }
}

Response implements Into<Body> so it compiles but I am not sure how would a user create a response from an existing file on FS for example.

It probably means we need a custom type inside the crate (or an existing one), what do you think ?

DDtKey commented 6 months ago

The thing is that now this type is also expected when you want to send file or binary data

Yes, this one is tricky, for client side we need to have another type here, and I believe it should be Body. We could hardcode this in mustache by condition isFile

In fact, I'd prefer to return something like Body for responses too, but reqwest doesn't provide anything for that on Body level - I mean body can't be directly converted into stream.

But I think it's pretty good middle ground to go with Body for requests and Response for returning values

Gaelik-git commented 6 months ago

I think doing this modification will break the current multipart implementation as it uses the file method with the PathBuf input type

DDtKey commented 6 months ago

Yes, we also need to distinguish application/octet-stream from multipart/form-data

Example in the issue is about application/octet-stream

And AFAIK there is already a check for that: {{#isMultipart}}/{{^isMultipart}}: https://github.com/OpenAPITools/openapi-generator/blob/d7290b3b7b1d227a0010444d27daa616dbd3e364/modules/openapi-generator/src/main/resources/rust/reqwest/api.mustache#L245C8-L293

skrcka commented 3 weeks ago

Any progress regarding this issue?

It would be good to just return bytes of the incoming file or something

ThomasVille commented 2 weeks ago

@Gaelik-git I see you already started a PR. Is there any chance you finish it? Otherwise I can look into it.

Gaelik-git commented 2 weeks ago

@Gaelik-git I see you already started a PR. Is there any chance you finish it? Otherwise I can look into it.

Sorry I don't have time to work on this right now

ThomasVille commented 2 weeks ago

@DDtKey Hi, I created a new PR with the same code as @Gaelik-git to reproduce the errors he got and from what I can see, the main issue now is that, when setting the typeMapping of file to reqwest::Body, it updates both the upload types and the download types.

Do you have any tips regarding how to use a different type for the upload and download types?

DDtKey commented 2 weeks ago

I think we have an option to control this on the level of mustache templates at least 🤔 , with checks like isResponseFile or isFile (don't remember exactly what do we have) For now it sounds like a quick possible solution

Probably there is a better way, but I will need to recall & double check the current structure

ThomasVille commented 2 weeks ago

@DDtKey Thanks for the quick reply!

Actually, I just noticed that that's what @Gaelik-git did and it worked in petstore but not in other instances like in petstore-async and in petstore-avoid-box.

When it fails, it looks like we enter the case where supportMultipleResponses is true but the second response is UnknownValue(serde_json::Value).

I'll investigate these differences.

ThomasVille commented 2 weeks ago

supportMultipleResponses is false by default but can be enabled depending on the sample like here.

So we need to handle the situation where supportMultipleResponses is true.

However, putting reqwest::Response in the response enum has two issues:

error[E0277]: the trait bound reqwest::Response: Clone is not satisfied --> reqwest/petstore-async/src/apis/testing_api.rs:22:15 19 #[derive(Debug, Clone, Serialize, Deserialize)] ----- in this derive macro expansion ... 22 Status200(reqwest::Response), ^^^^^^^^^^^^^^^^^ the trait Clone is not implemented for reqwest::Response

= note: this error originates in the derive macro Clone (in Nightly builds, run with -Z macro-backtrace for more info)



So I'm wondering if we really want to support returning a [response enum](https://github.com/OpenAPITools/openapi-generator/blob/69fb82ce1d9eee609a8681ee06917ec39af653d8/modules/openapi-generator/src/main/resources/rust/reqwest/api.mustache#L36-L46) when the endpoint returns a file, even if `supportMultipleResponses` is `true`. Maybe it doesn't make sense at all?
ThomasVille commented 2 weeks ago

@DDtKey For the moment I have implemented the following changes in https://github.com/OpenAPITools/openapi-generator/pull/20031:

Do you think this approach is okay?

DDtKey commented 4 days ago

Hi @ThomasVille, Sorry for the delay with reply.

I find this reasonable at this stage, because currently it doesn't work at all and we have some working compromise. I guess it's still open question for hyper-based clients, but it can be implemented separately and I see no reasons to block these changes.

Personally, I vote for accepting this approach as first step forward.

Thank you!