apache / polaris

Apache Polaris, the interoperable, open source catalog for Apache Iceberg
https://polaris.apache.org/
Apache License 2.0

[FEATURE REQUEST] Develop Apache Ranger Plugin for Polaris to Enhance Access Control for Apache Iceberg #274

Open dbosco opened 2 months ago

dbosco commented 2 months ago

Is your feature request related to a problem? Please describe.

No response

Describe the solution you'd like

Apache Polaris provides metadata management for Apache Iceberg. From the authorization point of view, key features of Polaris include:

Objective:

To enhance the usability and security of Polaris for Apache Iceberg users, the request is to develop an Apache Ranger plugin that integrates Polaris' access control features with Apache Ranger. This integration will allow for centralized and consistent management of access policies, audit logging, and fine-grained access control across different tools used with Apache Iceberg.

Use Cases:

  1. Centralized Access Policy Management:
    • Implement centralized and consistent management of access policies for data stored using Apache Iceberg across multiple tools and environments.

  2. Access Control for Data Engineering Workloads:
    • Manage and control access to datasets used by Data Engineering workloads (e.g., Apache Spark) with a coarser-grained approach at the table level.

  3. Fine-Grained Access Control for Data Analysts:
    • Provide fine-grained access control for Data Analysts using compute engines like Trino. This control can be enforced by leveraging the native Ranger Plugin in Trino, allowing for more granular control over data access at the table, view, or even column level.

  4. Centralized Access Auditing:
    • Enable centralized collection and analysis of access audit logs across all tools used to access datasets in Iceberg, ensuring comprehensive auditing and compliance.
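
To make the difference between the table-level and column-level use cases concrete, here is a minimal, self-contained sketch of how a central policy store might answer both kinds of request and record every decision for auditing. It is illustrative only: `Policy`, `PolicyStore`, and `is_allowed` are hypothetical names invented for this example and are not part of the Polaris or Apache Ranger APIs, and the principals and table names are made up.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class Policy:
    """Hypothetical policy record (not a Ranger or Polaris type)."""
    principal: str
    table: str
    action: str                            # e.g. "read" or "write"
    columns: Optional[frozenset] = None    # None means a table-level grant


@dataclass
class PolicyStore:
    """Hypothetical central store: evaluates requests and keeps an audit trail."""
    policies: list
    audit_log: list = field(default_factory=list)

    def is_allowed(self, principal, table, action, columns=None):
        requested = frozenset(columns) if columns is not None else None
        allowed = any(
            self._matches(p, principal, table, action, requested)
            for p in self.policies
        )
        # Every decision, allowed or denied, is recorded in one place.
        self.audit_log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "principal": principal,
            "table": table,
            "action": action,
            "columns": sorted(requested) if requested is not None else None,
            "allowed": allowed,
        })
        return allowed

    @staticmethod
    def _matches(policy, principal, table, action, requested):
        if (policy.principal, policy.table, policy.action) != (principal, table, action):
            return False
        if policy.columns is None:
            return True  # table-level grant covers every column
        # Column-level grant: the request must name columns, all inside the grant.
        return requested is not None and requested <= policy.columns


store = PolicyStore([
    # Coarse-grained grant for a data engineering job (whole table).
    Policy("spark_etl", "sales.orders", "read"),
    # Fine-grained grant for an analyst (two columns only).
    Policy("analyst", "sales.orders", "read", frozenset({"order_id", "amount"})),
])

print(store.is_allowed("spark_etl", "sales.orders", "read"))
print(store.is_allowed("analyst", "sales.orders", "read", ["order_id"]))
print(store.is_allowed("analyst", "sales.orders", "read", ["customer_email"]))
```

In this sketch a request that names no columns is treated as a request for the whole table, so a principal holding only a column-level grant is denied. A real plugin would need to decide that behavior explicitly, along with how policies are fetched from and audits shipped to Ranger.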

Expected Deliverables:

Describe alternatives you've considered

No response

Additional context

References

PolarisAuthorizer Class on GitHub: The PolarisAuthorizer class provides the core authorization logic in Polaris, which can be leveraged by the Apache Ranger plugin.

Many Apache and other open source projects, such as Presto (https://prestodb.io/docs/current/connector/hive-security.html#ranger-based-authorization), Trino (https://github.com/trinodb/trino/issues/22674), Apache Hive (https://github.com/apache/ranger/tree/master/hive-agent), and Apache Kafka (https://cwiki.apache.org/confluence/display/RANGER/Kafka+Plugin), have native integration with Apache Ranger. Some of these might also benefit from this integration.

A corresponding tracking JIRA has also been created in the Apache Ranger project: https://issues.apache.org/jira/browse/RANGER-4910

csi-mboero commented 1 month ago

Hello all, I share a strong interest in this issue and I'm looking forward to any updates or insights. Thanks for addressing it!

sankalp-vairat commented 1 month ago

Hello, this feature would be extremely beneficial for implementing fine-grained access control for Apache Iceberg. Looking forward to updates!

jbonofre commented 1 month ago

Clearly a great proposal (and planned, to be honest). We love contributions ;)

dbosco commented 1 month ago

@jbonofre I am happy to start working on some design considerations. Let me know if Polaris has a design template I should follow; otherwise I can start with an initial document and we can iterate on it. Thanks