juhaku / utoipa

Simple, Fast, Code first and Compile time generated OpenAPI documentation for Rust
Apache License 2.0

The default / example schema for `Vec<u8>` is `"string"` #570

Closed johnperry-math closed 3 weeks ago

johnperry-math commented 1 year ago

Version

latest

Description

If a field has type Vec<u8> then #[derive(ToSchema)] produces a description of "string" instead of an array of integers.

Expected behavior

The generated schema should be an array of integers.

MWE

Add the field icon: Vec<u8>, to struct Todo. Everything runs and compiles happily, but the generated schema is

icon*     string($binary)

and the example gives

    "icon": "string",

Workaround

Provide a default and example schema for icon as json!(vec![0_u8, 255]).

juhaku commented 1 year ago

This is actually by design because a list of u8s is the way raw bytes are represented, so utoipa assumes that the user wants to return binary data, e.g. octet-stream.

We could add an attribute that users can put on their field to treat the type strictly as a vec of numbers, e.g.

#[derive(ToSchema)]
struct Foo {
    #[schema(strict)]
    value: Vec<u8>,
}

johnperry-math commented 1 year ago

I don't know if we understand each other. I do in fact want binary data (vec of u8). The problem is that the schema represents it as a string.

If I ignore the first six words of your first sentence, it sounds like we have the same goal... but those first six words imply we don't. Can you elaborate?

juhaku commented 1 year ago

Oh sorry, to be clearer: the behavior of interpreting Vec<u8> (and byte slices) as a string is based on this: https://swagger.io/docs/specification/describing-request-body/file-upload/

jayvdb commented 1 year ago

Adding a bit more detail, the following is the best OpenAPI has for a binary stream:

type: string
format: binary

It isn't valid JSON Schema as far as I know. This was added in https://github.com/juhaku/utoipa/issues/197
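To make the difference concrete, here is a minimal sketch of the two schema shapes under discussion: the binary-string form quoted above, and the integer-array form the issue asks for. The `ByteSchema` enum and its `to_yaml` method are hypothetical names for illustration only, not utoipa types, and the array fragment is an assumption about what a "strict" schema could look like.

```rust
/// Two ways an OpenAPI document could describe a `Vec<u8>` field.
/// Hypothetical type for illustration; not part of utoipa.
enum ByteSchema {
    /// Raw bytes, e.g. a file upload or octet-stream body.
    BinaryString,
    /// A JSON array of integers (assumed shape of a "strict" schema).
    IntegerArray,
}

impl ByteSchema {
    /// Returns the schema fragment as YAML text.
    fn to_yaml(&self) -> &'static str {
        match self {
            ByteSchema::BinaryString => "type: string\nformat: binary",
            ByteSchema::IntegerArray => "type: array\nitems:\n  type: integer",
        }
    }
}

fn main() {
    // Print both fragments, separated by a blank line.
    println!("{}\n", ByteSchema::BinaryString.to_yaml());
    println!("{}", ByteSchema::IntegerArray.to_yaml());
}
```

The first fragment suits file uploads; the second matches a JSON body in which the bytes appear as a list of numbers.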

johnperry-math commented 1 year ago

From the two comments above, I gather that the current behavior is the expected behavior, so the workaround where I set an example works fine, at least for now. A strict option seems cleaner than the current workaround, especially if it would be useful in other circumstances.

My unfamiliarity with this may be getting in the way. To be clear: I need the schema both for a response and a request body in JSON format, and that's where I stumbled.

juhaku commented 1 year ago

@jayvdb Thanks for adding context

@johnperry-math Yes, this is the expected behavior.

Yeah, perhaps it is good to have such an attribute. Though I need to check whether you could already work around this issue with a value_type = ... attribute declaration.

#[derive(ToSchema)]
struct Foo {
    #[schema(value_type = Vec<u8>)] // <-- This might be able to make the schema behave 
    value: Vec<u8>,                 // as Vec of bytes (numbers)
}

johnperry-math commented 1 year ago

@juhaku Alas, value_type = Vec<u8> doesn't seem to work for me.

julius-boettger commented 11 months ago

I am also experiencing this issue.

leelhn2345 commented 4 months ago

#[derive(ToSchema, TryFromMultipart)]
pub struct MediaUpload {
    name: String,
    #[schema(value_type = Vec<Vec<u8>>)]
    media: Vec<FieldData<Bytes>>,
}

#[utoipa::path(
    post,
    path = "/media",
    request_body(content_type = "multipart/form-data", content = MediaUpload),
    responses(
        (status = 200, description = "media uploaded")
    )
)]
#[tracing::instrument(skip_all)]
pub async fn send_tele_media(TypedMultipart(data): TypedMultipart<MediaUpload>) {
    let media = &data.media.first().unwrap().metadata;
    println!("{media:#?}");
    let file_size = data.media.first().unwrap().contents.len();
    println!("size is {file_size}");
}

This works for me. The TryFromMultipart trait and the FieldData struct are from axum_typed_multipart. I like it better than the default axum multipart extractor.

I'm guessing you weren't able to upload binary because you didn't specify content_type = "multipart/form-data" under your request_body. I could be wrong about this.

SZenglein commented 3 months ago

This behavior is wrong when not using serde_bytes.

The struct as actually serialized contains a list of numbers, not a byte string. So the documentation is wrong. When using serde_bytes, the string representation is better.

The serde_bytes documentation even mentions that, without specialization, serde cannot treat [u8] any differently from other slices.
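As a rough illustration of that point, the sketch below hand-rolls the JSON a generic sequence serializer produces for a byte vector: an array of numbers, not a string. It uses only the standard library; `bytes_as_json_array` is a hypothetical helper written for this example, not a serde or utoipa function.

```rust
/// Renders bytes the way a generic sequence serializer would:
/// as a compact JSON array of numbers.
/// Hypothetical helper for illustration; not a serde API.
fn bytes_as_json_array(bytes: &[u8]) -> String {
    let items: Vec<String> = bytes.iter().map(|b| b.to_string()).collect();
    format!("[{}]", items.join(","))
}

fn main() {
    let icon: Vec<u8> = vec![0, 255];
    // The body for a plain Vec<u8> field carries numbers, not a "string" value.
    println!("{{\"icon\":{}}}", bytes_as_json_array(&icon));
}
```

A client following a `"icon": "string"` example would therefore send a body that does not match this shape.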

juhaku commented 3 weeks ago

Since utoipa 5.0.0, the Vec<u8>-as-String behavior is no longer there. PR #1113 adds some examples for file uploads in utoipa 5.0.0, which will only use OpenAPI 3.1.0.

From utoipa 5.0.0 onwards, Vec<u8> is treated as Vec<u8>, which is an array of ints.