caddyserver / caddy

Fast and extensible multi-platform HTTP/1-2-3 web server with automatic HTTPS
https://caddyserver.com
Apache License 2.0

[Question] Host modules #6494

Closed Z3NTL3 closed 2 months ago

Z3NTL3 commented 2 months ago

The inline_key= part is only used if the module's name will be found inline with the module itself; this implies that the value is an object where one of the keys is the inline key, and its value is the name of the module. If omitted, then the field type must be a caddy.ModuleMap or []caddy.ModuleMap, where the map key is the module name.

Goal: I am trying to understand exactly what the sentence above means. I'll explain step by step what I've understood from it, and I ask a Caddy maintainer to correct any misunderstanding.

Trying to work out what the complete sentence means, I found it very confusing. We provide the namespace, and Caddy looks in that namespace for the specific module or modules. I get that part. However, providing inline_key seems to tell Caddy to include only modules within the namespace that have the inline key, with its value being the module name. We know that in Caddy everything is standardized JSON. So if I have a host module with a raw struct field tagged:

caddy:"namespace=http.handlers inline_key=handler", does that mean Caddy will load all handlers in the given namespace only if they have a handler key?

For example


type SomeMwHandler struct {
          Handler string `json:"handler"`
          ...
}

In Caddy config it's translated to:


{"handler": "module_name"}

On the other hand, if inline_key is not used, will Caddy import everything that satisfies the namespace, provided in the raw JSON as a caddy.ModuleMap containing all modules that conform to the namespace?

I'm just trying to understand; am I correct? I've gone through the Caddy source, and I think the plugin architecture is pretty cool. However, for host and app modules there could be more documentation and examples that explain it more comprehensively.

Z3NTL3 commented 2 months ago

And another thing I found in the documentation of context.go:

// To load modules, a "namespace" key is required. For example, to load modules
// in the "http.handlers" namespace, you'd put: namespace=http.handlers in the
// Caddy struct tag.
//
// The module name must also be available. If the field type is a map or slice of maps,
// then key is assumed to be the module name if an "inline_key" is NOT specified in the
// caddy struct tag. In this case, the module name does NOT need to be specified in-line
// with the module itself.

So I think I was partly correct, but taken together with the last sentence it is still vague: it says the Raw field passed to LoadModule must already be filled. Doesn't Caddy fill this before Provision? I would assume it puts all modules under the given namespace into the field (type caddy.ModuleMap). With the http.handlers host module it works: it was enough to register my module under the namespace, and the host just magically had its HandlersRaw filled. For http.authentication.providers, on the other hand, I have to provide my providers explicitly through ProvidersRaw in Authenticator, which is very unclear... One more thing to note: the docs state that Caddy module behavior is implicit, so assuming those Raw fields are filled in seems reasonable.

Again another part of the doc:

If a guest module must explicitly be set by the user, you should return an error if the Raw field is nil or empty before trying to load it.

Some of the namespaces require their Raw to be explicitly loaded by the user? pff...

francislavoie commented 2 months ago

If the host module declares that the guest has an inline key, then users writing JSON config must have the inline key inside the JSON object to say which exact module this guest is. The module struct itself should not contain the inline key as a field, it gets stripped out when unmarshaling (i.e. in your SomeMwHandler example, remove the Handler field, it never gets used).

If a ModuleMap type is used instead by the host module as a field in its struct, then you don't have the module name inside the guest module object as an inline key, and instead the guest module's name is the key in the module map. In JSON, the ModuleMap is an object, and every key in that object is going to be a module name, and the values are the modules themselves. A module does not have to be an object type, for example look at HTTP matchers where the path matcher (MatchPath module) is a slice of strings. ModuleMap allows for this because the map keys are the module names so there's no necessity for an inline key to declare the name of the module.

ModuleMap only makes sense in a few key scenarios though, like when the order of configured modules does not matter (order would be preserved if it was an array of objects in JSON instead of a map), and when there should be only a single copy of a given module type. In practice, this means that you can't have two path matchers applied to one handler (because the key in the module map is unique, can only be used once), but since the path matcher is a slice of strings, you can have a single path matcher module with multiple path values; the path matcher will OR the values (so the first matching path will short circuit the matcher and return true for a match).

The Raw fields are basically meant to hold the raw JSON bytes temporarily while the config is being loaded, and once Provision is called then guest modules get loaded into another struct field in the host module, usually using the interface type for that namespace to assert that any guest modules actually implement the interface the host expects (HTTP handlers must have ServeHTTP as per the caddyhttp.MiddlewareHandler interface). See here:

https://github.com/caddyserver/caddy/blob/840094ac65c2c27dbf0be63478d36969a57ce7e0/modules/caddyhttp/routes.go#L90

You can see that inside a route, you have HandlersRaw which has an inline key, and its type is a slice of json.RawMessage, meaning that the config must be an array of objects, and those objects must have a handler key whose value is the name of the module. Then if you read Provision, you see it calls ProvisionHandlers, which uses LoadModule to transform the raw JSON into a slice of module objects. LoadModule takes the name of the field to load from as an argument, then uses reflection to look at that field's tags to find out whether it should expect an inline key. If it finds any objects that don't have the inline key, it returns an error because it can't determine what type of module to load.

If a guest module must explicitly be set by the user, you should return an error if the Raw field is nil or empty before trying to load it.

This is basically saying if your host module only takes a single guest module (not a slice of them like HTTP handlers, but instead like a single DNS provider module for solving ACME DNS challenges) then you can't have a nil object, your code should check for this.FooRaw == nil before calling LoadModule.

Hopefully that helps, I'm on vacation on my phone right now so this was slow to type, not sure if that covers everything or not

Z3NTL3 commented 2 months ago

That makes sense. I got an additional question if you don't mind.

Given this:

// Gizmo is an example; put your own type here.
type Gizmo struct {
    HandlersRaw caddy.ModuleMap `json:"handlers,omitempty" caddy:"namespace=http.authentication.providers"`
    Handlers map[string]caddyauth.Authenticator `json:"-"`
}

// CaddyModule returns the Caddy module information.
func (Gizmo) CaddyModule() caddy.ModuleInfo {
    return caddy.ModuleInfo{
        ID:  "http.handlers.gizmo",
        New: func() caddy.Module { return new(Gizmo) },
    }
}

func (g *Gizmo) Provision(ctx caddy.Context) error {
    // g.HandlersRaw = caddy.ModuleMap{
    //  "http_basic": make(json.RawMessage, 0),
    // }
    mods, err := ctx.LoadModule(g, "HandlersRaw")
    if err != nil {
        fmt.Printf("err while loading %s\n", err)
        return err
    }

    v := mods.(map[string]any)

    fmt.Printf("mods: %+v %+v\n", mods, v["http_basic"])
    return nil
}

I get:

empty map 

If I uncomment the commented parts I get:

mods: map[http_basic:0xc00059de80] &{HashRaw:[] AccountList:[] Realm: HashCache:<nil> Accounts:map[] Hash:0x3274640 fakePassword:[36 50 97 36 49 52 36 88 51 117 108 113 102 47 105 71 120 110 102 49 107 54 111 77 90 46 82 90 101 74 85 111 113 73 57 80 88 50 80 77 52 114 83 53 108 107 73 75 74 88 100 117 76 71 88 71 80 114 116 54]}

This means the Raw field needs to be filled explicitly. I expected Caddy to reason: hey, the struct tag gives me a namespace and the Raw field was not prefilled, so I will fill all modules in this namespace into the Raw field and return them as a map[string]any, to be type-asserted later to whatever interface the namespace uses.

But in the example above it didn't do that, and I had to tell Caddy manually. If I don't want Caddy to load all modules, I could explicitly say which modules I want, as above; I would find that more comfortable.

I think this way because I read somewhere in the docs that Caddy handles modules implicitly, the way Go does with interfaces. (I know Caddy pairs interfaces with namespaces, and Caddy knows which plugged-in module belongs to which namespace, so why wasn't LoadModule designed to accommodate this style?)

However, I don't know if I understood it all correctly. I won't write a host module, only guest modules. I am a student and it was fun digging into this a bit.

I would appreciate any elaboration on my final question whenever you have time for it, and I am grateful for your answer to my first question despite being on vacation 😂, thanks.

Btw have great fun and a nice vacation!

francislavoie commented 2 months ago

Hmm well it depends what you passed as your JSON config for that module. What's the config you used? Can't really look at how a module runs in isolation without the config fed into it. With your commented bit, you're kinda mocking the config.

I'm not exactly sure what your goal is with that. What exactly are you trying to achieve?

Remember that ModuleMap only makes sense in a few rare scenarios, your default should be to use json.RawMessage

Z3NTL3 commented 2 months ago

Let's forget about my previous message, it was a confusion. Now I want to ask my final question, and I think this one is more concrete and clear about why I am getting confused.

About http.authentication.providers:

Having this:

https://github.com/caddyserver/caddy/blob/master/modules/caddyhttp/caddyauth/caddyfile.go#L92C1-L96C8

https://github.com/caddyserver/caddy/blob/master/modules/caddyhttp/caddyauth/caddyauth.go#L40C1-L48C2

And then a guest plugin like this:

https://github.com/golgeek/caddy-auth-jwt/blob/main/caddyfile.go#L356C1-L360C8

Host module of http.authentication.providers created by the Caddy team creates an Authenticator instance while it serializes Caddyfile tokens into normal JSON (talking about: parseCaddyfile).

And a guest plugin creates another Authenticator in its Caddyfile parsing function. To clarify, both contain different providers, so there will be separate instances of Authenticator (host) with different providers (guests)... How will Caddy manage to handle both correctly? Will it eventually concatenate them?

One more question. If I make a host module and use Caddyfile config, how do I provide the inline_key? As you said, it shouldn't be in the struct, and I see no logical way to provide an opaque JSON key, because Caddy has final control over marshaling the Caddyfile tokens into JSON via UnmarshalCaddyfile, which operates on my struct. So how should I provide that key without including it in my struct when I use Caddyfile config?

func (g *Gizmo) UnmarshalCaddyfile(d *caddyfile.Dispenser) error

mholt commented 2 months ago

The difference looks like this:

Inline key:

{
    "my_inline_key": "module_name",
    ...
}

Useful for keeping the name of the module together with the rest of its config.

In other cases, the module name might be an object key instead of an object value:

"module_name": {
    ...
}

There's no "inline key" here because the module name appears as part of a parent object (usually a map) so what follows as the value is strictly the module's configuration.

Z3NTL3 commented 2 months ago

Thanks for clarifying. Can you also have a look at my latest question?

francislavoie commented 2 months ago

Authenticator is the interface, but HTTPBasicAuth is the module. It has the module name http_basic (full ID http.authentication.providers.http_basic), as declared here:

https://github.com/caddyserver/caddy/blob/master/modules%2Fcaddyhttp%2Fcaddyauth%2Fbasicauth.go#L73

Then in the ModuleMap the key http_basic is used, knowing the namespace is http.authentication.providers so when the module is loaded, a lookup is made for http.authentication.providers.http_basic which will be found because the module was registered (in init() which always runs first-thing when the program starts before anything else).
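The lookup described above can be sketched with a plain map. This is a simplified stand-in, not Caddy's registry: register and lookup are hypothetical functions that only show how a namespace plus a ModuleMap key resolves to a module registered during init().

```go
package main

import "fmt"

// registry maps full module IDs to constructors (a simplified stand-in
// for the module registry).
var registry = map[string]func() any{}

// register is a hypothetical equivalent of registering a module.
func register(id string, ctor func() any) {
	registry[id] = ctor
}

// basicAuth stands in for the HTTPBasicAuth provider module.
type basicAuth struct{}

// init runs before main, so the module is known before any config is loaded.
func init() {
	register("http.authentication.providers.http_basic", func() any { return new(basicAuth) })
}

// lookup joins the host's namespace with the ModuleMap key to form the full ID.
func lookup(namespace, name string) (any, error) {
	id := namespace + "." + name
	ctor, ok := registry[id]
	if !ok {
		return nil, fmt.Errorf("unknown module: %s", id)
	}
	return ctor(), nil
}

func main() {
	mod, err := lookup("http.authentication.providers", "http_basic")
	fmt.Printf("%T %v\n", mod, err)

	_, err = lookup("http.authentication.providers", "not_registered")
	fmt.Println(err)
}
```

Note that only the module named by the key is instantiated; nothing in the namespace is loaded unless the config asks for it.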

In the Caddyfile, it's not 1:1 to JSON. The basic_auth directive will produce an instance of authentication containing http_basic as a provider. The Caddyfile doesn't give you a way to have one authenticator with two providers right now, but it's possible if you write your JSON config by hand. Use caddy adapt -p to transform a Caddyfile to its JSON representation.

Z3NTL3 commented 2 months ago

Thanks for the clarification. When I inspected that, I wondered: how does Caddy manage to combine different instances of Authentication when each defines different providers, and how would they be merged back into a single instance?

I also saw developers writing plugins for the Authenticator namespace/interface. In their parseCaddyfile function (Caddyfile support via RegisterHandlerDirective, which has to return a MiddlewareHandler), they conform to that by returning an Authentication with their guest module in its ProvidersRaw.

return Authentication{
    ProvidersRaw: caddy.ModuleMap{
        "their_provider": caddyconfig.JSON(their_guestModuleStruct, nil),
    },
}, nil

Anyway, is there a way to add more providers to the first Authenticator instance programmatically, without creating more instances? I've inspected caddyconfig and caddyfile and could not find anything useful, only .Adapt, but I think it just adapts the given body to JSON and does not directly register it in Caddy's JSON config.

mholt commented 2 months ago

Anyway, is there a way to add more providers to the first Authenticator instance programmatically, without creating more instances?

Every time a module is configured, a new instance is created. If you need to share state/data, you can use global vars or a UsagePool.

francislavoie commented 2 months ago

Right now we don't have Caddyfile support for a single authenticator with multiple providers, like I said, but you can do it via JSON. Nobody has requested support for that yet, so we didn't need to implement it. But we could if necessary. I don't think there's really that many auth provider plugins though, we'd need an actual usecase where it makes sense before we implement it.

Z3NTL3 commented 2 months ago

I was a bit confused, but seem to realize now. I want you to follow my steps to see the reason for my confusion.

Summary

So the difference was this: while writing a guest module for http.authentication.providers, I had to set up an instance of its host (which is itself a guest of http.handlers) and fill its ProvidersRaw, all within my own guest module. So my guest module theoretically acts as both host and guest...

With http.handlers, by contrast, it was enough to satisfy the namespace; I didn't need to create an instance of the host module in my own guest module and put my guest module in its HandlersRaw myself. So I wrote two guest modules, but the developer experience differed between them, which is odd and confused me. The experience of writing a guest for http.handlers was amazingly good. For http.authentication.providers I expected the same steps, since I am writing a guest for it... I expect writing guest modules to work the same way everywhere, without differences that depend on the namespace (I don't mean differences in the interface to satisfy).

francislavoie commented 2 months ago

Yeah I agree auth modules are weird. Matt tried to design it to be pluggable by trying to make a generic Authentication wrapper which could take providers. Doing it that way isn't really necessary, you could write your own auth handler without Authentication if you want. The only thing the Authentication wrapper actually does as you can see here https://github.com/caddyserver/caddy/blob/b1986781740b2b7f546c6ce56496c5cc7145674d/modules/caddyhttp/caddyauth/caddyauth.go is loop over the list of configured providers then fill in some http.auth.user.* variables with the result. That's it.

But obviously, see https://github.com/caddyserver/caddy/blob/b1986781740b2b7f546c6ce56496c5cc7145674d/modules/caddyhttp/caddyauth/basicauth.go as an example of a complete and functional auth provider module.

mholt commented 2 months ago

Yeah, my initial thought was that it could be useful to have a central place for auth to go in a handler chain, with the added benefit of being able to allow multiple auth sources if needed. Then you don't have to deal with much of the HTTP stuff, making development of the actual auth component less risky and error-prone.

Z3NTL3 commented 2 months ago

Everything is fine, but when the time is right, I would recommend providing more examples in the docs to give a clearer view. That would be helpful to most people.

Now that everything is clear, I would like to say again how I feel about Caddy. This is an amazing project, far beyond anything else, and its plugin architecture, so unique and elegant, is really dope. I had never had fun writing plugins before. Keep up the great work, and I am grateful for all your efforts in explaining everything!

My sincere greetings

mholt commented 2 months ago

Sounds good. Thanks for the feedback!