Open Freddybob4244 opened 5 months ago
Thanks for the detailed report!
In a Slack convo Anton suggested that #12573 may be suspect.
Specifically, the only SSO change in v3.5.5 was #12318, which is unrelated in this case (different provider, different claim). So my suspicion would then be that a deps change somehow impacted this, and #12573 upgraded expr
which is used for rbac-rule
evaluation here and is the part with the error:
[...] failed to evaluate rule: unknown name groups (1:43)\n | 'ac1ef805-ac6a-4ce9-854a-1cb406aa7121' in groups\n | ..........................................^"
@Freddybob4244 could you provide a few more details that I had asked for in my last message:
rbac-rule
annotation you were using hereac1ef805-ac6a-4ce9-854a-1cb406aa7121
actually one of your groups?Admin users can navigate normally.
Also can you provide the admin SA with its rbac-rule
as well? I'm curious how it differs since it applies correctly.
Happy to provide any more information on request or connect to experiment/work through the issue
From the same message, could you try some of these options and check what the logs say:
groups
in
I'm not sure if we have a robust test environment for SSO issues; we do have an optional Dex but I'm not sure if it's configured and used for SSO tests (I don't think it is IIRC). That would be great to have for reproducibility and to add test cases for the pieces of SSO that are not provider specific.
From DMs:
So I am back from my long weekend - turns out that v3.4.17 has the same problem. I checked the release and it seems to have the expr change. I pushed v3.4.16 (which doesn't have that change) and it works.
That seems to confirm that #12573 is the source of the issue here, now we need to figure out why it's causing this exactly and how to fix that (and if it requires another upstream fix in expr
).
Also that expr
upgrade seems to have been erroneously backported to Argo v3.4.17 per https://github.com/argoproj/argo-workflows/pull/13043#discussion_r1610560201
Confirmed with a different user in another Slack thread that this is a regression in 3.5.5 and that reverting to 3.5.4 fixed it
@isubasinghe any chance you could take a look at this? Related to the expr
upgrade and changes you made in #12573. It's a P1 given that it breaks SSO RBAC
Let's fix the error. From what I can see the code is:
'ac1ef805-ac6a-4ce9-854a-1cb406aa7121' in groups
Which seems legit.
Can you show an example expression with an error?
@antonmedv error="failed to evaluate rule: unknown name groups (1:43)\n | 'ac1ef805-ac6a-4ce9-854a-1cb406aa7121' in groups\n | ..........................................^"
https://github.com/argoproj/argo-workflows/pull/12573/files#diff-ace0e536d9a8878e3c778026bd37f97d15bd3d47f889ad972704a2a2ba11878a was changed at same time as the version upgrade. it is called from https://github.com/argoproj/argo-workflows/blob/06da23e8660f7841592070b1a372ace0750c2364/server/auth/gatekeeper.go#L264
The error is
unknown name groups
Which means there is no groups
in env.
i tried below in go playground (https://go.dev/play/) but could not reproduce the error:
package main
import (
"encoding/json"
"fmt"
"github.com/antonmedv/expr"
"gopkg.in/square/go-jose.v2/jwt"
)
type CustomClaims struct {
Issuer string `json:"iss,omitempty"`
Subject string `json:"sub,omitempty"`
Audience jwt.Audience `json:"aud,omitempty"`
Expiry *jwt.NumericDate `json:"exp,omitempty"`
NotBefore *jwt.NumericDate `json:"nbf,omitempty"`
IssuedAt *jwt.NumericDate `json:"iat,omitempty"`
ID string `json:"jti,omitempty"`
Groups []string `json:"groups,omitempty"`
}
func Jsonify(v interface{}) (map[string]interface{}, error) {
data, err := json.Marshal(v)
if err != nil {
return nil, err
}
x := make(map[string]interface{})
return x, json.Unmarshal(data, &x)
}
func main() {
//claims
//data, err := json.Marshal(claims)
// if err != nil {
// panic(err)
// }
// v := make(map[string]interface{}) //env
// err = json.Unmarshal(data, &v)
// if err != nil {
// panic(err)
// }
//env := map[string]interface{}{
// "groups": []string{"abc", "ac1ef805-ac6a-4ce9-854a-1cb406aa7121"},
//}
//type Claims struct {
// jwt.Claims
// Groups []string `json:"groups,omitempty"`
// Email string `json:"email,omitempty"`
// EmailVerified bool `json:"-"`
// Name string `json:"name,omitempty"`
/// ServiceAccountName string `json:"service_account_name,omitempty"`
// ServiceAccountNamespace string `json:"service_account_namespace,omitempty"`
// PreferredUsername string `json:"preferred_username,omitempty"`
// RawClaim map[string]interface{} `json:"-"`
// }
//claims := Claims{
// Claims: jwt.Claims{
// // Initialize fields of jwt.Claims as needed
// },
// Groups: []string{"abc", "ac1ef805-ac6a-4ce9-854a-1cb406aa7121"},
//}
type Claims struct {
CustomClaims
Email string `json:"email,omitempty"`
EmailVerified bool `json:"-"`
Name string `json:"name,omitempty"`
ServiceAccountName string `json:"service_account_name,omitempty"`
ServiceAccountNamespace string `json:"service_account_namespace,omitempty"`
PreferredUsername string `json:"preferred_username,omitempty"`
RawClaim map[string]interface{} `json:"-"`
}
claims := Claims{
CustomClaims: CustomClaims{
Groups: []string{"abc", "ac1ef805-ac6a-4ce9-854a-1cb406aa7121"},
},
}
//claims := Claims{Claims: jwt.Claims{"Groups": []string{"abc", "ac1ef805-ac6a-4ce9-854a-1cb406aa7121"}}}
v, err := Jsonify(claims)
if err != nil {
panic(err)
}
fmt.Println(v)
input := `'ac1ef805-ac6a-4ce9-854a-1cb406aa7121' in groups
` //rule
result, err := expr.Eval(input, v)
if err != nil {
panic(err)
}
fmt.Println(result)
}
-- go.mod --
module play.ground
require github.com/antonmedv/expr v1.15.5
package main
import (
"encoding/json"
"fmt"
"github.com/expr-lang/expr"
"gopkg.in/square/go-jose.v2/jwt"
)
type CustomClaims struct {
Issuer string `json:"iss,omitempty"`
Subject string `json:"sub,omitempty"`
Audience jwt.Audience `json:"aud,omitempty"`
Expiry *jwt.NumericDate `json:"exp,omitempty"`
NotBefore *jwt.NumericDate `json:"nbf,omitempty"`
IssuedAt *jwt.NumericDate `json:"iat,omitempty"`
ID string `json:"jti,omitempty"`
Groups []string `json:"groups,omitempty"`
}
func Jsonify(v interface{}) (map[string]interface{}, error) {
data, err := json.Marshal(v)
if err != nil {
return nil, err
}
x := make(map[string]interface{})
return x, json.Unmarshal(data, &x)
}
func main() {
//type Claims struct {
// jwt.Claims
// Groups []string `json:"groups,omitempty"`
// Email string `json:"email,omitempty"`
// EmailVerified bool `json:"-"`
// Name string `json:"name,omitempty"`
/// ServiceAccountName string `json:"service_account_name,omitempty"`
// ServiceAccountNamespace string `json:"service_account_namespace,omitempty"`
// PreferredUsername string `json:"preferred_username,omitempty"`
// RawClaim map[string]interface{} `json:"-"`
// }
//claims := Claims{
// Claims: jwt.Claims{
// // Initialize fields of jwt.Claims as needed
// },
// Groups: []string{"abc", "ac1ef805-ac6a-4ce9-854a-1cb406aa7121"},
//}
type Claims struct {
CustomClaims
Email string `json:"email,omitempty"`
EmailVerified bool `json:"-"`
Name string `json:"name,omitempty"`
ServiceAccountName string `json:"service_account_name,omitempty"`
ServiceAccountNamespace string `json:"service_account_namespace,omitempty"`
PreferredUsername string `json:"preferred_username,omitempty"`
RawClaim map[string]interface{} `json:"-"`
}
claims := Claims{
CustomClaims: CustomClaims{
Groups: []string{"abc", "ac1ef805-ac6a-4ce9-854a-1cb406aa7121"},
},
}
//claims := Claims{Claims: jwt.Claims{Groups: []string{"abc", "ac1ef805-ac6a-4ce9-854a-1cb406aa7121"}}}
v, err := Jsonify(claims)
if err != nil {
panic(err)
}
input := `'ac1ef805-ac6a-4ce9-854a-1cb406aa7121' in groups
` //rule
program, err := expr.Compile(input, expr.Env(v))
if err != nil {
panic(err)
}
result, err := expr.Run(program, v)
if err != nil {
panic(err)
}
fmt.Println(result)
}
-- go.mod --
module play.ground
require github.com/expr-lang/expr v1.16.0
i tried below in go playground (https://go.dev/play/) but could not reproduce the error:
🤔 There were two separate users who encountered it though, and both resolved by downgrading. That would suggest that something in the diff between your code and Argo's is the buggy piece
Also @tooptoop4 there's a "Share" button on the playground that's very helpful 😅
My intuition is that the Eval
-> Compile
+ Run
change is responsible somehow, as I think you suspect too
Is it when groups
is empty maybe? The Jsonify
(that naming is confusing as it's still very much a struct) is perhaps excluding it from the JSON and resulting map?
That would add up since so far the reports have only been from users who were just starting to add SSO RBAC, and not from existing users of that feature (and would explain why more people haven't upvoted this as well)
Can confirm in the playground, changing it to an empty array in the Compile
+ Run
variant causes the error:
Groups: []string{},
panic: unknown name groups (1:43)
| 'ac1ef805-ac6a-4ce9-854a-1cb406aa7121' in groups
| ..........................................^
goroutine 1 [running]:
main.main()
/tmp/sandbox3921589064/prog.go:80 +0x18e
There is no error, just prints false
in the Eval
variant:
map[]
false
The Eval
variant works the same in expr
1.15.5 and 1.16.0, so it's definitely the Compile
+ Run
change that break this for empty arrays. Also tried the Compile
+ Run
variant in 1.15.5 and that has the same error there.
Thanks @tooptoop4 for making a small repro in the playground for this to make it easier to investigate!
~Seems to still happen even when I remove the Jsonify
and use claims
directly. So either the Compile
statement or the expr.Env
is causing the empty string to be excluded~ EDIT: Actually, I think that fails because of capitalization and/or lack of expr
struct tag (which would have the lowercase variant similar to the json
struct tag)
Removing the omitempty
in the json
struct tag works. That may have other side-effects though 🤔 There might be a better way around that 🤔
Yea omitempty
should technically be removed for all of the fields in that case, which would indeed have side-effects. This may very well affect other places that use templating as well.
The better option would be to use expr
struct tags and potentially pass in empty structs.
@antonmedv would it make sense for expr
to interpret json
struct tags? I.e. in the case a json
struct tag is present and no expr
struct tag is present, use the json
struct tag for naming.
I think json tag can be used as expr tag via config like this:
expr.FieldTag("json")
What do you think?
I tried debugging this and it seems the groups isnt part of the claims at all: you get the equivalent of:
map[email:name@example.com exp:1.720242915e+09 iss:argo-server name:Facy Name preferred_username:name@example.com sub:34343434]
I already root caused it above, and yes it will happen when you have no groups.
In this case I believed I passed in groups from the IDP
Sorry my bad, in my case i was missing the preceding '/' in the userInfoPath resulting in a 404 which was not being raised as an error, returning nil. If I am reading this correctly: https://github.com/argoproj/argo-workflows/blob/main/server/auth/types/claims.go#L102
Thanks for the patience though
userInfoPath resulting in a 404 which was not being raised as an error, returning nil. If I am reading this correctly: https://github.com/argoproj/argo-workflows/blob/main/server/auth/types/claims.go#L102
That's correct for that function, but it's used in HandleCallback
which is supposed to return a 401 in that case 🤔 I wonder if it used your old cookie or something?
The Server logs are really helpful when debugging SSO since there's a lot of redirects, login -> provider -> callback -> login -> home (or other page). But it sounds like you got it figured out
I was able to resolve this on my side by explicitly declaring a group and ensuring that group was present in the claim sent back through our sso provider.
I think either it should be updated to where when no groups match default role is applied, as the documentation suggests would happen - or the documentation updated to call out that the claim must match at least one group to the user even for the default role to apply.
I was able to resolve this on my side by explicitly declaring a group and ensuring that group was present in the claim sent back through our sso provider.
I think either it should be updated to where when no groups match default role is applied, as the documentation suggests would happen - or the documentation updated to call out that the claim must match at least one group to the user even for the default role to apply.
Totally agree. It will more convenient to fall back to default role... This might be optional, like fallback or forbid log in with the corresponding message.
Pre-requisites
:latest
image tag (i.e.quay.io/argoproj/workflow-controller:latest
) and can confirm the issue still exists on:latest
. If not, I have explained why, in detail, in my description below.What happened/what you expected to happen?
Summary
All non-admin users assume a read-only role. After v3.5.5 is applied, non-admin users can no longer use argo-workflows and are prompted with a generic error in the UI:
Admin users can navigate normally.
Argo-server pod logs provide a bit more useful insight:
There aren't any additional logs that are related to this error that I can find.
Testing Performed
To confirm 3.5.5 introduced the issue I tested a few different versions with a read-only test user in a sandbox installation.
Configurations
Argo-Server
workflow-controller-configmap
Server Service Account
Server ClusterRoleBinding
Server ClusterRole
Additional Info
In a Slack convo Anton suggested that https://github.com/argoproj/argo-workflows/pull/12573 may be suspect.
Link to slack message thread
People engaged already:
Happy to provide any more information on request or connect to experiment/work through the issue
Version
v3.5.5
Paste a small workflow that reproduces the issue. We must be able to run the workflow; don't enter a workflows that uses private images.
Logs from the workflow controller
Logs from in your workflow's wait container