juhaku / utoipa

Simple, Fast, Code first and Compile time generated OpenAPI documentation for Rust
Apache License 2.0
2.23k stars 173 forks source link

Idea: A "strict mode" for component schemas #890

Open ranger-ross opened 5 months ago

ranger-ross commented 5 months ago

Hello, I have been using this crate in my Actix project and its an awesome tool. One thing that I have been having issues with is forgetting to put my request/response structs in the components(schemas()) section of the matcro.

If I forget to add them, my project compiles just fine but I get errors in Swagger like this:

Resolver error at paths./auth/create-reset-password-link.post.requestBody.content.application/json.schema.$ref
Could not resolve reference: Could not resolve pointer: /components/schemas/CreatePasswordResetLink does not exist in document

I am wondering if perhaps we can introduce a strict mode that fails to compile if a schema is missing. :thinking:

Let me know what you think. If you think its a good idea, I could probably try opening a PR.

ranger-ross commented 5 months ago

So I was able to create a unit test verify this. Its probably missing some edge cases when following references but works for simple use-cases (mine)

#[derive(OpenApi)]
#[openapi(
    paths(
        example,
    ),
    components(
        schemas(
            ExampleResponse, 
        )
    )
)]
pub struct ApiDoc;

#[test]
fn require_all_schemas_to_be_added_to_openapi_spec() {
    let openapi = ApiDoc::openapi();

    let binding = openapi.components.unwrap_or_default();
    let schemas = binding.schemas;
    let paths = openapi.paths.paths;

    for (path, path_item) in paths {
        for (_path_type, operation) in path_item.operations {
            if let Some(request_body) = operation.request_body {
                for (_content_type, content) in request_body.content {
                    if let RefOr::Ref(r) = content.schema {
                        verify_ref_location(&r.ref_location, &path, &schemas);
                    }
                }
            }

            for (_status, response) in operation.responses.responses {
                if let RefOr::T(res) = response {
                    for (_content_type, content) in res.content {
                        if let RefOr::Ref(r) = content.schema {
                            verify_ref_location(&r.ref_location, &path, &schemas);
                        }
                    }
                }
            }
        }
    }
}

fn fail_missing_schema(schema_name: &str, path: &str) {
    panic!("Could not find `{schema_name}` in #/components/schemas for `{path}`. Did you forget to add it to #[openapi(...)] macro?");
}

fn verify_ref_location(
    ref_location: &str,
    path: &str,
    schemas: &BTreeMap<String, RefOr<Schema>>,
) {
    let prefix = "#/components/schemas/";

    if ref_location.starts_with(prefix) {
        let schema_name = &ref_location[prefix.len()..];
        if !schemas.contains_key(schema_name) {
            fail_missing_schema(schema_name, &path);
        }
    } else {
        eprintln!(
            "[WARN] Unexpected location {ref_location} (not part of #/components/schemas)"
        );
    }
}
juhaku commented 4 months ago

Yes the idea is good, and probably should be somehow addressed in future.

arifd commented 1 month ago

Good stuff @ranger-ross

I took your code and improved it to try to catch all cases:

#[test]
fn verify_openapi_schema_declarations() {
    use std::collections::{BTreeMap, HashSet};

    use utoipa::openapi::{schema::AnyOf, AllOf, Array, OneOf, Ref, RefOr, Schema};

    fn verify_schema(
        path: &str,
        schemas: &BTreeMap<String, RefOr<Schema>>,
        schema: &RefOr<Schema>,
        visited: &mut HashSet<String>,
    ) {
        /// Visits a RefOr<Schema> and applies a visitor to any underlying
        /// &Refs.
        fn visit_schema_refs(schema: &RefOr<Schema>, f: &mut impl FnMut(&Ref)) {
            match schema {
                RefOr::Ref(r) => f(r),
                RefOr::T(Schema::AllOf(AllOf { items, .. }))
                | RefOr::T(Schema::AnyOf(AnyOf { items, .. }))
                | RefOr::T(Schema::OneOf(OneOf { items, .. })) => {
                    for item in items.iter().rev() {
                        visit_schema_refs(item, f);
                    }
                }
                RefOr::T(Schema::Array(Array { items, .. })) => visit_schema_refs(items, f),
                RefOr::T(Schema::Object(object)) => {
                    for property in object.properties.values().rev() {
                        visit_schema_refs(property, f);
                    }
                }
                RefOr::T(_) => unreachable!("all variants expressed"),
            }
        }

        fn verify_schema_ref(
            ref_location: &str,
            path: &str,
            schemas: &BTreeMap<String, RefOr<Schema>>,
            visited: &mut HashSet<String>,
        ) {
            let prefix = "#/components/schemas/";
            let schema = ref_location
                .strip_prefix(prefix)
                .expect("unknown schema prefix");

            if !visited.insert(schema.to_string()) {
                return;
            }

            match schemas.get(schema) {
                // Ensure any references inside the schema are also valid:
                Some(schema) => visit_schema_refs(schema, &mut |r| {
                    verify_schema_ref(&r.ref_location, path, schemas, visited);
                }),
                None => {
                    panic!("`{path}` depends on `{schema}`, but it is not in the schema")
                }
            }
        }

        visit_schema_refs(schema, &mut |r| {
            verify_schema_ref(&r.ref_location, path, schemas, visited)
        });
    }

    let openapi = // Your OpenApi struct goes here.
    let schemas = openapi.components.unwrap().schemas;
    let paths = openapi.paths.paths;
    let mut visited = HashSet::new();

    for (path, path_item) in paths {
        for (_, operation) in path_item.operations {
            if let Some(request_body) = operation.request_body {
                for (_, content) in request_body.content {
                    verify_schema(&path, &schemas, &content.schema, &mut visited);
                }
            }

            for (_, response) in operation.responses.responses {
                if let RefOr::T(res) = response {
                    for (_, content) in res.content {
                        verify_schema(&path, &schemas, &content.schema, &mut visited);
                    }
                }
            }
        }
    }

    // These are not referenced by any paths:
    dbg!(schemas
        .into_keys()
        .filter(|schema| !visited.contains(schema))
        .collect::<Vec<_>>());
}
leebenson commented 20 hours ago

This is a great idea and honestly, I think it's likely to surprise people that this isn't how utoipa currently works. This, for example, compiles just fine but blows up at runtime:

#[utoipa::path(
    method(get),
    path = "/test",
    responses(
        (status = OK, description = "Hello", body = DoesNotExist)
    )
)]
async fn test() -> &'static str {
    "test"
}

My naive assumption would be that body references a ToSchema type in scope, and would panic at compile time if not found. In practice, this appears to be an arbitrary string that happily generates a non-existent definition:

{
  "openapi": "3.1.0",
  "info": {
    "title": "server",
    "description": "",
    "license": {
      "name": ""
    },
    "version": "0.1.0"
  },
  "paths": {
    "/test": {
      "get": {
        "operationId": "test",
        "responses": {
          "200": {
            "description": "Hello",
            "content": {
              "application/json": {
                "schema": {
                  "$ref": "#/components/schemas/DoesNotExist"
                }
              }
            }
          }
        }
      }
    }
  }
}

I know there's work in progress to auto-generate schema types and reduce some of the boilerplate (particularly with the Axum integration, which is great!).

I'm wondering if you're considering extending this to a full proc macro that could wrap (in my case, Axum) functions and parse parameters and return types, and do full end-to-end schema generation? That's effectively what Poem OpenAPI does and it seems like a plausible way to avoid any/all boilerplate.

Is something like that on the cards? Anything we can do to eliminate runtime errors and stick with the Rust mantra of compile time checks, the better!

juhaku commented 1 minute ago
#[utoipa::path(
   method(get),
   path = "/test",
   responses(
       (status = OK, description = "Hello", body = DoesNotExist)
   )
)]
async fn test() -> &'static str {
   "test"
}

In 5.0.0 this will fail with compile error, since the bodies for requests and responses are correctly spanned. Thus also they will be renameable for example if the is renamed it will get renamed to the #[utoipa::path(...)] as well.

I'm wondering if you're considering extending this to a full proc macro that could wrap (in my case, Axum) functions and parse parameters and return types, and do full end-to-end schema generation?

That is a quite hard thing to do and would really require heavy understanding of a specific framework if I understand this question correctly. I actually tried to do something like that but it ended up being nearly impossible before I went with the current style of utoipa-axum OpenApiRouter and routes! macro. I got it working for some cases but because of how axum implements the Handler trait for all the tuples and other types as well, it would have been necessary for me to also implement them to all the same types as axum has implemented in order to fully integrate the axum Handler trait with utoipa specific logic.

Nonetheless not sure whether this strict mode is actually necessary since in the coming release missing types are not possible. But it is still possible to have completely missing responses as return types are not inferred.