NatLibFi / Annif

Annif is a multi-algorithm automated subject indexing tool for libraries, archives and museums.
https://annif.org

Make `/v1/detect-language` REST API method usable Cross-Origin / CORS Preflight Request Failing #817

Open juhoinkinen opened 2 weeks ago

juhoinkinen commented 2 weeks ago

While implementing the language detection feature for FintoAI (https://github.com/NatLibFi/FintoAI/issues/9, https://github.com/NatLibFi/FintoAI/pull/21), I encountered an issue with the detect-language endpoint of Annif. When making a request to it, a local Annif instance logs the following error:

OPTIONS /v1/detect-language HTTP/1.1" 400 Bad Request

This seems to be because the POST request is not a "simple request": its Content-Type is application/json, so the browser first sends a CORS preflight OPTIONS request. In contrast, the suggest endpoint uses application/x-www-form-urlencoded, which keeps it a simple request with no preflight (see documentation).
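To illustrate the distinction, here is a rough sketch of the rule a browser applies. It is a simplification: it only looks at the method and the Content-Type, whereas real browsers also check every other request header. The function name `needs_preflight` is made up for this example.

```python
# Simplified check for whether a cross-origin request triggers a CORS preflight.
# Assumption: only the method and Content-Type are inspected here; browsers
# also examine all other request headers before deciding.

SIMPLE_METHODS = {"GET", "HEAD", "POST"}
SIMPLE_CONTENT_TYPES = {
    "application/x-www-form-urlencoded",
    "multipart/form-data",
    "text/plain",
}


def needs_preflight(method, content_type=None):
    """Return True if a browser would send an OPTIONS preflight first."""
    if method.upper() not in SIMPLE_METHODS:
        return True
    if content_type is None:
        return False
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type not in SIMPLE_CONTENT_TYPES


# detect-language (JSON body) vs. suggest (form-encoded body)
print(needs_preflight("POST", "application/json"))
print(needs_preflight("POST", "application/x-www-form-urlencoded"))
```

Under these assumptions the first call reports that a preflight is needed and the second that it is not, which matches the behaviour seen with the two endpoints.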

Steps to Reproduce

  1. Make a POST request to the detect-language endpoint with Content-Type: application/json from JavaScript code running in a browser (not from the Annif instance itself).
  2. Observe the 400 Bad Request error in the Annif logs and a CORS error in the browser.
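The preflight can probably also be reproduced without a browser by sending the OPTIONS request by hand. The sketch below uses only the Python standard library. The host, port 5000 and the origin `http://localhost:8080` are assumptions for a local setup, and the helper names are made up for this example.

```python
import http.client


def preflight_headers(origin, method="POST", request_headers=("content-type",)):
    """Build the headers a browser attaches to a CORS preflight request."""
    return {
        "Origin": origin,
        "Access-Control-Request-Method": method,
        "Access-Control-Request-Headers": ", ".join(request_headers),
    }


def send_preflight(host, port, path, origin):
    """Send an OPTIONS preflight and return (status, response headers)."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request("OPTIONS", path, headers=preflight_headers(origin))
        response = conn.getresponse()
        response.read()  # drain the body so the connection closes cleanly
        return response.status, dict(response.getheaders())
    finally:
        conn.close()


# Example call against a locally running instance (assumed address):
# status, headers = send_preflight(
#     "localhost", 5000, "/v1/detect-language", "http://localhost:8080"
# )
# print(status, headers.get("Access-Control-Allow-Origin"))
```

If the status is 400, or the Access-Control-Allow-* headers are missing from the response, the browser would refuse to send the actual POST.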

Expected Behavior

The server should handle the OPTIONS preflight request correctly and allow the POST request to proceed.

Additional Context

I consulted Claude.ai and received the following insights:

When a web page makes a cross-origin request that is not a "simple request", the browser first sends an OPTIONS preflight request to check whether the resource may be accessed. The server must respond with appropriate CORS headers. Here the server answers the preflight with "400 Bad Request" instead, so the browser blocks the POST request and reports a CORS error.

The API endpoint should define the OPTIONS method and respond with the necessary CORS headers. This can be specified in the OpenAPI v3 spec.

Example OpenAPI v3 spec for OPTIONS method:

paths:
  /v1/detect-language:
    options:
      summary: Preflight request
      responses:
        '204':
          description: Successful response
          headers:
            Access-Control-Allow-Origin:
              schema:
                type: string
            Access-Control-Allow-Methods:
              schema:
                type: string
            Access-Control-Allow-Headers:
              schema:
                type: string
    post:
      summary: Language detection endpoint
      # other POST request details

By defining the OPTIONS method and setting the appropriate CORS headers, the server should handle the preflight request correctly, allowing the POST request to proceed without errors.

An example of an OPTIONS method definition in OpenAPI 3.0 is available in the AWS documentation.

Note that when Annif is deployed and served from a web domain, the request to Annif comes from the same origin, so CORS is not needed. This issue concerns only using the language detection feature from other websites, and local development, where CORS protection can be disabled, e.g. in Chromium by starting it with chromium --disable-web-security --user-data-dir=~/tmp.

Resolution options

We update the Annif API to include the OPTIONS method for the detect-language endpoint and ensure it responds with the necessary CORS headers.
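As a rough illustration of what this option means on the server side, here is a minimal standalone WSGI sketch that answers the preflight with 204 and the three CORS headers. It is not how Annif is implemented. The wildcard origin, the port 8000 and the placeholder POST response body are assumptions made for this example only.

```python
from wsgiref.simple_server import make_server

ALLOWED_ORIGIN = "*"  # assumption: any origin may call the endpoint


def cors_headers():
    """Headers that tell the browser the cross-origin POST is allowed."""
    return [
        ("Access-Control-Allow-Origin", ALLOWED_ORIGIN),
        ("Access-Control-Allow-Methods", "POST, OPTIONS"),
        ("Access-Control-Allow-Headers", "Content-Type"),
    ]


def app(environ, start_response):
    method = environ["REQUEST_METHOD"]
    if environ.get("PATH_INFO") != "/v1/detect-language":
        start_response("404 Not Found", [("Content-Length", "0")])
        return [b""]
    if method == "OPTIONS":
        # Preflight: no body, only the CORS headers.
        start_response("204 No Content", cors_headers())
        return [b""]
    if method == "POST":
        body = b'{"results": []}'  # placeholder, not Annif's real output
        headers = cors_headers() + [
            ("Content-Type", "application/json"),
            ("Content-Length", str(len(body))),
        ]
        start_response("200 OK", headers)
        return [body]
    start_response(
        "405 Method Not Allowed",
        [("Allow", "POST, OPTIONS"), ("Content-Length", "0")],
    )
    return [b""]


def serve(port=8000):
    """Run the sketch locally; the port is an arbitrary choice."""
    make_server("localhost", port, app).serve_forever()
```

Note that the actual POST response carries the CORS headers too, since the browser checks them again on the real request and not only on the preflight.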

Alternatively, we could make the detect-language method accept the application/x-www-form-urlencoded content type, which would make the POST a simple request that needs no preflight.
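For this alternative, the request body would be form-encoded rather than JSON. A small sketch of what that could look like follows. The field names `text` and `languages` are assumed to mirror the JSON body, and encoding the language list as a repeated key is just one possible convention.

```python
from urllib.parse import parse_qs, urlencode

# Assumed fields, mirroring what the JSON body would carry.
payload = {"text": "A quick brown fox", "languages": ["en", "fi", "sv"]}

# doseq=True encodes the list as a repeated "languages" key.
body = urlencode(payload, doseq=True)
print(body)  # text=A+quick+brown+fox&languages=en&languages=fi&languages=sv

# What the server side would get back after decoding the form body.
parsed = parse_qs(body)
print(parsed["text"][0])
print(parsed["languages"])
```

A body like this can be sent with Content-Type: application/x-www-form-urlencoded, so the browser would not need a preflight.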

juhoinkinen commented 2 weeks ago

One question is whether we want the language detection functionality of Annif instances to be directly usable by other websites. The current situation does not restrict use by means other than direct browser requests.