apache / gravitino

World's most powerful open data catalog for building a high-performance, geo-distributed and federated metadata lake.
https://gravitino.apache.org
Apache License 2.0

[#5336] feat(auth-ranger): Remove MANAGED_BY_GRAVITINO limit and compatible for existing ranger policy #5629

Open theoryxu opened 4 days ago

theoryxu commented 4 days ago

What changes were proposed in this pull request?

Many clients and users have been running Ranger for a while and already have their own policies. Gravitino should be compatible with these existing setups.

There are some principles Gravitino needs to follow when it pushes down policies:

  1. Gravitino must not rename existing policies, because users may have their own naming rules.
  2. Gravitino and users should be able to share the same policy for a resource without interfering with each other.

To achieve this, this PR includes the following changes:

  1. wildcardSearchPolies no longer filters by the MANAGED_BY_GRAVITINO label.
  2. Gravitino-managed role names are given the prefix GRAVITINO_.
  3. Gravitino-managed roles are used to identify and operate on policy items.
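
The role-based identification in changes 2 and 3 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in this PR: `PolicyItem`, `toManagedRoleName`, `isGravitinoManaged`, and `gravitinoManagedItems` are hypothetical names and do not reflect the Ranger or Gravitino API. Only the `GRAVITINO_` prefix is taken from the description above.

```java
import java.util.List;
import java.util.stream.Collectors;

public class ManagedRoleSketch {
  // Prefix named in this PR; everything else in this class is a hypothetical illustration.
  public static final String GRAVITINO_ROLE_PREFIX = "GRAVITINO_";

  // Hypothetical stand-in for a policy item: the roles it applies to and the granted access types.
  public static final class PolicyItem {
    public final List<String> roles;
    public final List<String> accessTypes;

    public PolicyItem(List<String> roles, List<String> accessTypes) {
      this.roles = roles;
      this.accessTypes = accessTypes;
    }
  }

  // Build the role name that would be pushed down to Ranger for a Gravitino role.
  public static String toManagedRoleName(String gravitinoRole) {
    return GRAVITINO_ROLE_PREFIX + gravitinoRole;
  }

  // Treat an item as Gravitino-managed if any of its roles carries the prefix.
  public static boolean isGravitinoManaged(PolicyItem item) {
    return item.roles.stream().anyMatch(r -> r.startsWith(GRAVITINO_ROLE_PREFIX));
  }

  // Select only the items Gravitino would operate on; user-owned items are left untouched.
  public static List<PolicyItem> gravitinoManagedItems(List<PolicyItem> items) {
    return items.stream()
        .filter(ManagedRoleSketch::isGravitinoManaged)
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    List<PolicyItem> items = List.of(
        new PolicyItem(List.of(toManagedRoleName("analyst")), List.of("select")),
        new PolicyItem(List.of("user_defined_role"), List.of("select", "update")));
    System.out.println("Gravitino-managed items: " + gravitinoManagedItems(items).size());
  }
}
```

With this kind of check, a shared policy can hold both user-defined items and Gravitino-managed items, and Gravitino only reads or rewrites the latter.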

Even with these changes, users should be cautious about managing Ranger policies directly. Some restrictions apply:

  1. Don't directly rename Gravitino-managed policies.
  2. Don't directly modify the resources of a policy that contains Gravitino-managed roles.
  3. Don't directly modify policy items that contain Gravitino-managed roles.

Why are the changes needed?

Fix: #5336

Does this PR introduce any user-facing change?

N/A

How was this patch tested?

Added ITs

xunliu commented 4 days ago

Hi @theoryxu, thank you for your attention to this problem.

The problem now is that Gravitino only maintains Ranger policies that carry the MANAGED_BY_GRAVITINO label, which can lead to conflicts if a user already has a Ranger service.

  1. Gravitino's policies follow their own set of management rules, which may conflict with policies that users have set up arbitrarily.
  2. Therefore, only Ranger policies with the MANAGED_BY_GRAVITINO label are maintained.

But that's a pretty big limitation.

  1. There is only one Ranger policy for each resource (for example, db1.tab1).
  2. A user's existing Ranger service may already have a policy for db1.tab1 that does not conform to Gravitino's authorization specification, and there may be problems if Gravitino is asked to update this policy directly.
  3. So Gravitino's RangerHelper.WildcardSearchPolies() function currently finds only policies with the MANAGED_BY_GRAVITINO label.
  4. If the existing Ranger service already has this policy but Gravitino cannot operate on it, there will be problems.
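
The situation in the list above can be illustrated with a small sketch. It is only a simplified model under stated assumptions: `Policy` and `findPolicy` are hypothetical names and do not reflect how RangerHelper or the Ranger client is actually implemented. Only the MANAGED_BY_GRAVITINO label and the db1.tab1 resource come from the discussion.

```java
import java.util.List;
import java.util.Optional;

public class LabelFilterSketch {
  // Label named in the discussion; the rest of this class is a hypothetical model.
  public static final String MANAGED_BY_GRAVITINO = "MANAGED_BY_GRAVITINO";

  // Hypothetical stand-in for a policy: the resource it covers and its labels.
  public static final class Policy {
    public final String resource;
    public final List<String> labels;

    public Policy(String resource, List<String> labels) {
      this.resource = resource;
      this.labels = labels;
    }
  }

  // Look up the policy for a resource, optionally requiring the Gravitino label.
  public static Optional<Policy> findPolicy(
      List<Policy> policies, String resource, boolean onlyGravitinoLabeled) {
    return policies.stream()
        .filter(p -> p.resource.equals(resource))
        .filter(p -> !onlyGravitinoLabeled || p.labels.contains(MANAGED_BY_GRAVITINO))
        .findFirst();
  }

  public static void main(String[] args) {
    // A policy created by the user before Gravitino was introduced: no Gravitino label.
    List<Policy> existing = List.of(new Policy("db1.tab1", List.of()));

    // With the label filter, the existing policy is invisible to the lookup.
    System.out.println(findPolicy(existing, "db1.tab1", true).isPresent());
    // Without the filter, the existing policy is found and could be shared.
    System.out.println(findPolicy(existing, "db1.tab1", false).isPresent());
  }
}
```

Because each resource has only one policy, a lookup that misses the user's policy leaves Gravitino unable to operate on db1.tab1 at all.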