streamnative / pulsar-archived

Apache Pulsar - distributed pub-sub messaging system
https://pulsar.apache.org
Apache License 2.0

ISSUE-13441: feat(offloader): support kerberos config set for hdfs offloader #3463

Open sijie opened 2 years ago

sijie commented 2 years ago

Original Issue: apache/pulsar#13441


Is your feature request related to a problem? Please describe.

HDFS usually has Kerberos enabled in production environments.

Describe the solution you'd like

Support setting Kerberos configuration for the HDFS offloader.

batmanneverdie commented 2 years ago

Offloading Pulsar data to HDFS fails, but the "pulsar-admin topics info-internal xx" command displays "true"

Environment

pulsar version: 2.9.1
pulsar deploy: pulsar-all + docker
hdfs: using Kerberos auth

Description

  1. I want to test the offloader feature in Pulsar, so I followed this article from StreamNative: Filesystem offloader
  2. The only difference is that my HDFS has Kerberos auth enabled.
  3. I don't know how to set the HDFS Kerberos config for the Pulsar offloader, so the offload fails. But when I run pulsar-admin topics info-internal public/default/fs-test to check the topic info, the isOffloaded field is true.
    root@28fd0b683087:/pulsar/bin# ./pulsar-admin topics info-internal public/default/fs-test
    {
    "version": 3,
    "creationDate": "2022-02-17T07:38:43.117Z",
    "modificationDate": "2022-02-17T07:45:00.693Z",
    "ledgers": [
        {
            "ledgerId": 8,
            "isOffloaded": false
        },
        {
            "ledgerId": 7,
            "entries": 5001,
            "size": 453168,
            "isOffloaded": true
        }
    ],
    "cursors": {}
    }

The Pulsar Manager UI displays an error, and pulsar-admin topics stats-internal public/default/fs-test reports offloaded as false:

root@28fd0b683087:/pulsar/bin# ./pulsar-admin topics stats-internal public/default/fs-test
{
  "entriesAddedCounter" : 9000,
  "numberOfEntries" : 9000,
  "totalSize" : 815544,
  "currentLedgerEntries" : 3999,
  "currentLedgerSize" : 362376,
  "lastLedgerCreatedTimestamp" : "2022-02-17T07:39:53.418Z",
  "waitingCursorsCount" : 0,
  "pendingAddEntriesCount" : 0,
  "lastConfirmedEntry" : "8:3998",
  "state" : "LedgerOpened",
  "ledgers" : [ {
    "ledgerId" : 7,
    "entries" : 5001,
    "size" : 453168,
    "offloaded" : false,
    "underReplicated" : false
  }, {
    "ledgerId" : 8,
    "entries" : 0,
    "size" : 0,
    "offloaded" : false,
    "underReplicated" : false
  } ],
  "cursors" : { },
  "schemaLedgers" : [ ],
  "compactedLedger" : {
    "ledgerId" : -1,
    "entries" : -1,
    "size" : -1,
    "offloaded" : false,
    "underReplicated" : false
  }
}
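
The mismatch between the two outputs above can be checked mechanically. The following is only a minimal sketch, assuming the JSON shapes are exactly as quoted in this comment (trimmed to the relevant fields); the helper names `offload_flags` and `find_mismatches` are hypothetical and are not part of pulsar-admin or any Pulsar API.

```python
import json

# Trimmed copies of the two command outputs quoted in this comment.
INFO_INTERNAL = """
{
  "version": 3,
  "ledgers": [
    {"ledgerId": 8, "isOffloaded": false},
    {"ledgerId": 7, "entries": 5001, "size": 453168, "isOffloaded": true}
  ],
  "cursors": {}
}
"""

STATS_INTERNAL = """
{
  "ledgers": [
    {"ledgerId": 7, "entries": 5001, "size": 453168,
     "offloaded": false, "underReplicated": false},
    {"ledgerId": 8, "entries": 0, "size": 0,
     "offloaded": false, "underReplicated": false}
  ]
}
"""


def offload_flags(doc, flag_key):
    """Map ledgerId -> offload flag; a missing flag counts as False."""
    return {
        ledger["ledgerId"]: bool(ledger.get(flag_key, False))
        for ledger in doc["ledgers"]
    }


def find_mismatches(info_internal, stats_internal):
    """Return sorted ledger IDs whose offload flag differs between the two views."""
    info = offload_flags(info_internal, "isOffloaded")
    stats = offload_flags(stats_internal, "offloaded")
    shared_ids = info.keys() & stats.keys()
    return sorted(lid for lid in shared_ids if info[lid] != stats[lid])


mismatches = find_mismatches(json.loads(INFO_INTERNAL), json.loads(STATS_INTERNAL))
print(mismatches)  # prints [7]
```

For the data in this report, ledger 7 is the only one flagged: info-internal says it is offloaded while stats-internal says it is not. To run the same check against a live broker, the JSON bodies printed by the two commands would need to be captured and passed through `json.loads` in place of the embedded strings.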

BTW

I want to know how to offload Pulsar data to HDFS with Kerberos auth. Please give me some references.