eclipse-apoapsis / ort-server

A scalable server implementation of the OSS Review Toolkit.
https://eclipse-apoapsis.github.io/ort-server/
Apache License 2.0
19 stars 8 forks source link

Having parenthesis in product name leads to Bad Request with PATCH #19

Closed Etsija closed 6 months ago

Etsija commented 7 months ago

ort-server-core-1 | 2024-04-09 07:05:38.888 [eventLoopGroupProxy-4-14] level=INFO Application - 400 Bad Request: PATCH - /api/v1/products/1 in 8ms

oheger-bosch commented 7 months ago

This behavior is caused by the validation implemented for entity names, which is rather strict at the moment. The constraints enforced for entity names can be found here: https://github.com/eclipse-apoapsis/ort-server/blob/main/api/v1/model/src/commonMain/kotlin/validation/Constraints.kt

The error is not restricted to PATCH requests; you should not be able to create a new product with characters considered illegal in its name either.

Etsija commented 7 months ago

Thank you for the explanation. Maybe there could be something in the Swagger UI about the constraints? We probably need to do the same validation in the ORT Server UI.

mmurto commented 7 months ago

The validation would be great to be visible in the OpenAPI spec as patterns.

sschuberth commented 7 months ago

you should not be able to create a new product with characters considered illegal in its name either.

@oheger-bosch, what's the rationale for those characters being illegal? In seems inconvenient from a user perspective to not even be allowed to use "normal special characters" that could appear in a regular sentence, like parentheses, dots, slashes etc.

oheger-bosch commented 7 months ago

@sschuberth, the need to have some kind of validation came up with some endpoints that use entity names as path parameters, e.g. to manipulate secrets or infrastructure services. The guy who implemented the task was then a bit over-eager. Personally, I would have no objections to make the checks less restrictive to allow at least a wider range of special characters.

sschuberth commented 7 months ago

I would have no objections to make the checks less restrictive

Would it also be an option to basically allow any special character but percent-encode these for usage in paths similar to like ORT core does?

sschuberth commented 7 months ago

@Etsija I believe it makes sense to reopen this until we have a better solution to the problem implemented, either by making the constraints more lenient as @oheger-bosch mentioned, or by not having these constraints at all by using some encoding / escaping like I've proposed.

sschuberth commented 7 months ago

The guy who implemented the task was then a bit over-eager.

Heh 😉

@yarosevych, could you maybe explain where the motivation for the restrictions implemented in 9478d39e830df109e0a6a04c1c0cf90f68ad19b6 and following comes from? Was there a concrete problem with allowing more / arbitrary characters?

oheger-bosch commented 7 months ago

Would it also be an option to basically allow any special character but percent-encode these for usage in paths similar to like ORT core does?

I am not sure whether this would require changes on our side. Since we only provide a REST API currently, users would have to do proper URL encoding when calling our API. I assume that Ktor then returns the correct, de-coded values when queried for call parameters (but would have to check this).

Nevertheless, a certain level of validation would probably still be good, e.g. to prevent stuff like leading or trailing whitespace or really suspicious characters (like used in SQL injection attacks). Also, our firewall is sometimes quite picky.

Fun fact: When we did a PEN test, we received a minor finding from our (Bosch-internal) testers that is was possible to use SQL keywords in entity names. In their opinion, an organization should not be called "Drop Table Users".

mnonnenmacher commented 7 months ago

@yarosevych, could you maybe explain where the motivation for the restrictions implemented in 9478d39 and following comes from? Was there a concrete problem with allowing more / arbitrary characters?

One idea was to redesign the API to use entity names instead of numeric IDs and have a structure similar to the one of GitHub. This imposes stricter requirements to ensure proper URLs (basically something like [\w\.-]*). For example, instead of api/v1/organizations/1 or api/v1/products/2 the endpoints would look like api/v1/organization-name or api/v1/organization-name/product-name.

Potential benefits would be that it would be harder to guess the endpoints for entities you have no access to, and that it is easier to remember the names of your entities than their numeric ids. However, there is no agreement yet if this change will be implemented or not. But we plan to make a review of the API as there are also other inconsistencies currently that need to be fixed before it can be considered stable, so any input is welcome.

mmurto commented 7 months ago

@yarosevych, could you maybe explain where the motivation for the restrictions implemented in 9478d39 and following comes from? Was there a concrete problem with allowing more / arbitrary characters?

One idea was to redesign the API to use entity names instead of numeric IDs and have a structure similar to the one of GitHub. This imposes stricter requirements to ensure proper URLs (basically something like [\w\.-]*). For example, instead of api/v1/organizations/1 or api/v1/products/2 the endpoints would look like api/v1/organization-name or api/v1/organization-name/product-name.

Does it make sense to add these in API, if the API is practically always accessed programmatically? It could make sense for frontend URLs, but IMO it should be done as a separate slug. That's the way GitHub for example does it, Theres an organization with a name and slug of Apoapsis project/eclipse-apoapsis, or OSS Review Toolkit/oss-review-toolkit.

Using names as accessor IDs would also be problematic if running single ORT Server for multiple tenants. Unlike GitHub as a social platform, in ORT Server it probably isn't required to prevent organizations with duplicate names, so accessing shouldn't be done with names.

sschuberth commented 7 months ago

One idea was to redesign the API to use entity names instead of numeric IDs

Hmm. That means entry point wouldn't be stable when entities get renamed. That do not seem to be a good idea to me.

Potential benefits would be that it would be harder to guess the endpoints for entities you have no access to

To account for that (and have stable API endpoints when renaming entities) I guess we should create and maintain UUIDs for entities once they are created, or?

mnonnenmacher commented 7 months ago

Does it make sense to add these in API, if the API is practically always accessed programmatically?

For programmatic use that is not relevant, but during development and testing I have done a lot a manual requests and found it very annoying to have to juggle with numeric ids all the time when I could have remembered the names instead.

It could make sense for frontend URLs, but IMO it should be done as a separate slug.

I totally agree with that.

Using names as accessor IDs would also be problematic if running single ORT Server for multiple tenants. Unlike GitHub as a social platform, in ORT Server it probably isn't required to prevent organizations with duplicate names, so accessing shouldn't be done with names.

From our perspective the organizations represent the different tenants, having separate tenants with the same name was not a use case we considered.

That means entry point wouldn't be stable when entities get renamed.

That could be mitigated with redirects like GitHub does.

To account for that (and have stable API endpoints when renaming entities) I guess we should create and maintain UUIDs for entities once they are created, or?

UUIDs make it even harder to use the API manually.

mmurto commented 7 months ago

Using names as accessor IDs would also be problematic if running single ORT Server for multiple tenants. Unlike GitHub as a social platform, in ORT Server it probably isn't required to prevent organizations with duplicate names, so accessing shouldn't be done with names.

From our perspective the organizations represent the different tenants, having separate tenants with the same name was not a use case we considered.

I agree that the organizations represent the different tenants. And as ORT Server probably isn't meant for sharing things between these tenants/organizations, unlike GitHub, it IMO makes no sense to block someone from creating an organization even if an organization with the same name exists.

sschuberth commented 6 months ago

For whatever reason auto-closing again did not work, but this can be closed now @Etsija as https://github.com/eclipse-apoapsis/ort-server/pull/84 has relaxed the situation.

Etsija commented 6 months ago

I'm reluctant to close it, until I can properly verify the issue is really solved:

  1. I did a complete reinstall of ORT Server dev environment:
    • rebased our feature branch to the newest main in upstream (this commit: aaed6a9c (origin/main, origin/HEAD) feat(api): Allow more special charaters in organization and product names)
    • rebuilt all base images
    • did a complete Jib build
    • started the containers with docker compose
  2. I even rebuilt the OpenAPI.ts with pnpm codegen in the /ui directory -> it introduced only a minor VERSION bump in OpenAPI structure
  3. Connected to the UI
  4. Tried to rename "Double Open Server" product into "Double Open Server (DOS)"

Result: Capture

Same happens when trying to PATCH organization name.

It is very probable that I have omitted a crucial step from the above, but I feel we're probably yet missing some info/guidelines/policies on how to properly map changes in Server API with front-end.

Etsija commented 6 months ago

I just verified that relaxed naming validation works with the POST requests for organizations, products - but not with PATCH requests.

The reason seems to be that NAME_PATTERN_REGEX used in UpdateOrganization() class is ^(?!\s)[A-Za-z0-9- ]*(?<!\s)$ which doesn't match eg. "Test (test2)", while the regex used for CreateOrganization() class differs from the former, and matches. Is there a reason why we don't have identical regex for validating POST and PATCH requests?

sschuberth commented 6 months ago

I just verified that relaxed naming validation works with the POST requests for organizations, products - but not with PATCH requests.

Thanks for checking. This should get fixed with https://github.com/eclipse-apoapsis/ort-server/pull/96.

sschuberth commented 6 months ago

Thanks for checking. This should get fixed with #96.

It was merged, please try again @Etsija.

Etsija commented 6 months ago

Tested, works now, will close. Thanks for all for fixing this!