processone / ejabberd

Robust, Ubiquitous and Massively Scalable Messaging Platform (XMPP, MQTT, SIP Server)
https://www.process-one.net/en/ejabberd/
Other
6.01k stars 1.5k forks source link

HTTP upload broken when public PUT URL path differs from request_handler definition #4060

Open sideeffect42 opened 1 year ago

sideeffect42 commented 1 year ago

Environment

Configuration (only if needed): grep -Ev '^$|^\s*#' ejabberd.yml

listen:
  -
    port: 5280
    ip: "::1"
    transport: tcp
    module: ejabberd_http
    request_handlers:
      /upload: mod_http_upload
      /.well-known/host-meta: mod_host_meta
      /.well-known/host-meta.json: mod_host_meta
modules:
  mod_http_upload:
    hosts:
      - upload.@HOST@
    access: local
    get_url: "https://upload.@HOST@/"
    put_url: "https://upload.@HOST@/"
    docroot: "@HOME@/upload"
    dir_mode: "0750"
    file_mode: "0644"
    rm_on_unregister: true
    jid_in_url: sha1
    thumbnail: false
    custom_headers:
      "Access-Control-Allow-Origin": "https://upload.@HOST@/"
      "Access-Control-Allow-Methods": "GET,HEAD,PUT,OPTIONS"
      "Access-Control-Allow-Headers": "Content-Type"

Errors from error.log/crash.log

2023-07-05 18:09:22.480526+02:00 [info] <0.390.0>@ejabberd_listener:accept/7:273 (<0.576.0>) Accepted connection hidden_by_ejabberd -> [::1]:5280
2023-07-05 18:09:22.481919+02:00 [debug] <0.576.0>@ejabberd_http:init/3:129 S: [{[<<"upload">>],mod_http_upload},
    {[<<".well-known">>,<<"host-meta">>],mod_host_meta},
    {[<<".well-known">>,<<"host-meta.json">>],mod_host_meta}]

2023-07-05 18:09:22.483200+02:00 [debug] <0.576.0>@ejabberd_http:process_header/2:290 (#Port<0.44>) http query: 'PUT' <<"/upload/01a66df340fbde62dc61c6746f33783061170e87/ZR7WpuNlK22WQhj3GVWoTouTBF2btSE9QnxEdl9E/20221221_201620_20221223230532.jpg">>

2023-07-05 18:09:22.484977+02:00 [debug] <0.576.0>@ejabberd_http:process/1:362 [<<"upload">>,<<"01a66df340fbde62dc61c6746f33783061170e87">>,
 <<"ZR7WpuNlK22WQhj3GVWoTouTBF2btSE9QnxEdl9E">>,
 <<"20221221_201620_20221223230532.jpg">>] matches [<<"upload">>]
2023-07-05 18:09:22.486212+02:00 [warning] <0.576.0>@mod_http_upload:process/2:570 Cannot handle PUT request from hidden_by_ejabberd for upload.example.com: Upload not configured for this host

Bug description

To have beautiful HTTP URLs for the HTTP upload file transfer module I have installed a web server in front of ejabberd which takes requests on https://upload.example.com/ and forwards them to ejabberd's local port (http://[::1]:5280/upload/).

ejabberd prints the misleading error message Upload not configured for this host when the path of the request does not match the one configured in mod_http_upload's put_url.

Creating a separate local port whose only request_handler is /: mod_http_upload HTTP uploads work as expected.

badlop commented 1 year ago

mod_http_upload starts an erlang process for each module instance, taking into account the put_url option. For example, for this configuration:

hosts:
 - example.com
 - test.org

listen:
  -
    port: 5443
    module: ejabberd_http
    request_handlers:
      /uppy: mod_http_upload

modules:
  mod_http_upload:
    put_url: https://upper.@HOST@:5443/uppy

at module start, two erlang processes are started, with those process names:

Later, the client sends a query like this to upload the file:

$ curl -X PUT http://upper.example.com:5443/uppy/9fcef0....txt -d "content"

ejabberd determines the erlang process name by taking into account the URL provided by the client: mod_http_upload + Host + URLpath


The put_url option exists because ejabberd cannot determine itself the proper PUT URL to tell the XMPP client to use.

Ideally, ejabberd should be able to determine automatically the PUT URL, based in the listen and the modules configuration.

For example, captcha_url detects automatically the correct value, and is not needed nowadays anymore.


In your setup, you are using the put_url option outside the boundaries of its design: you put a path in put_url, and then sending it to a different path. You tell URL path = "/" in the put_url option, but later the clients are providing an URL path that includes "/upload".

If you plan to use the put_url to set some custom and unsupported forwarding, then it is your duty to ensure that it works correctly.

When ejabberd searches for the erlang process, it cannot be found, and complains saying that upload is not configured for upload.example.com. In reality it IS configured... it is wrongly configured.

Later, when you configure the same URL path in put_url than the ones used by the HTTP client, the erlang process name is correctly determined.


Now let's see what can be improved in ejabberd.

The error message should be more precise. Or preferably less precise, because who knows what the server admin may have set, and what exactly is the problem in the configuration? In this case, the problem is not the host, but the path, and the error message is misleading.

The put_url option could accept an auto value, and try to determine the proper URL based in the existing configuration. Of course, still allow the admin to setup manually an URL, for cases like yours that you want to setup some customization. But in that case, it's the admin's duty to setup the proper URL and not break things :)

And another possible feature request: is it expected that ejabberd should be able to determine magically what URL forwardings, port forwardings, domain forwardings... that the server administrator may setup?

weiss commented 1 year ago

@badlop, everything you say is correct, and your ideas make sense. However, the solution I have in mind would be to get rid of the URL<->process mapping altogether by having the PUT handler accept queries based on the external_secret mechanism (or actually, a newer revision of that protocol, as that would allow the PUT handler to check for the JID, among other things). This would avoid the need of a process for storing "upload slots", it would play nicer with clustering, and would solve another few issues, including this one. I have this on my list.