apache / pulsar

Apache Pulsar - distributed pub-sub messaging system
https://pulsar.apache.org/
Apache License 2.0

[fix][misc] Unable to connect an etcd metastore with recent releases due to jetcd-core shading problem #23604

Closed Shawyeok closed 2 days ago

Shawyeok commented 4 days ago

Fixes #23513

With recent Pulsar releases, startup raises NoClassDefFoundError when Pulsar is set up with an etcd metadata store.

docker run -it --rm apachepulsar/pulsar:4.0.0 bin/pulsar standalone --metadata-url etcd:http://a-etcd:2379
docker run -it --rm apachepulsar/pulsar:3.0.6 bin/pulsar standalone --metadata-url etcd:http://a-etcd:2379

Motivation

The jetcd-core-shaded module was introduced in #22892 to address the compatibility issues between jetcd-core’s grpc-java dependency and Netty. You can find more details here and in the grpc-java documentation.

Currently, the unpack-shaded-jar execution unpacks the shaded jar produced by maven-shade-plugin:shade into the jetcd-core-shaded/target/classes directory. However, the classes in this directory conflict with the module's dependencies. If maven-shade-plugin:shade runs again without this directory being cleaned, it relocates the already-relocated classes a second time and can produce an incorrect shaded jar. You can replicate and verify this issue with the following commands:

# Step 1: Clean the build directory
mvn clean

# Step 2: Perform an install and unpack the shaded jar into a directory.
# Verify the import statement for `io.netty.handler.logging.ByteBufFormat` in 
# `org/apache/pulsar/jetcd/shaded/io/vertx/core/net/NetClientOptions.class`. 
# The correct import should be: 
# `import io.grpc.netty.shaded.io.netty.handler.logging.ByteBufFormat;`.
mvn install
unzip $M2_REPO/org/apache/pulsar/jetcd-core-shaded/4.1.0-SNAPSHOT/jetcd-core-shaded-4.1.0-SNAPSHOT-shaded.jar \
  -d jetcd-core-shaded/target/first-classes

# Step 3: Run the install command again without cleaning.
# The unpacked jar from the previous step will persist in `jetcd-core-shaded/target/classes`. 
# Unpack the shaded jar into a different directory (e.g., second-classes) and check the import.
# The incorrect import will be: 
# `import io.grpc.netty.shaded.io.grpc.netty.shaded.io.netty.handler.logging.ByteBufFormat;`.
mvn install
unzip $M2_REPO/org/apache/pulsar/jetcd-core-shaded/4.1.0-SNAPSHOT/jetcd-core-shaded-4.1.0-SNAPSHOT-shaded.jar \
  -d jetcd-core-shaded/target/second-classes

# Step 4: Use IntelliJ IDEA's "Compare Directories" tool to compare the `first-classes` 
# and `second-classes` directories. The differences in imports should become apparent.
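
As a command-line alternative to Step 4, here is a minimal sketch. It assumes a `grep` that supports `-r` and `--include` (GNU and BSD grep both do), and it relies on class files storing referenced class names as slash-separated strings in the constant pool. The function name `find_double_relocated` is hypothetical and not part of the Pulsar build.

```shell
# Hypothetical helper: list the .class files under a directory that reference
# the doubly relocated Netty package prefix.
# Exit status is 0 if at least one such file is found, 1 if none are found.
find_double_relocated() {
  dir="$1"
  # Inside class files, package separators are '/' rather than '.'.
  grep -rlF --include='*.class' -- \
    'io/grpc/netty/shaded/io/grpc/netty/shaded/' "$dir"
}

# Example usage against the directories created in Steps 2 and 3:
#   find_double_relocated jetcd-core-shaded/target/first-classes
#   find_double_relocated jetcd-core-shaded/target/second-classes
```

If the issue reproduces as described, the first call should print nothing and the second should list the affected classes.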

A simpler solution is to remove the configurations related to attach and unpack. IntelliJ IDEA assumes the shaded jar path is target/${artifactId}-${version}.jar. However, in Pulsar’s build system, the finalName is set to just ${artifactId} in the parent pom.xml. While I’m unsure of the reasoning behind this setup, we can override the finalName in jetcd-core-shaded/pom.xml. This is the approach I’ve taken in this patch.

Verifying this change

This issue typically cannot be detected by CI tests, as CI environments always run in a clean workspace. To address this, we could refine our release guidelines to include a step for cleaning the workspace before deploying artifacts. Additionally, incorporating automated checks into the release validation process could help catch such issues early.
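
One way such a release-guideline step could look is sketched below. This is only an illustration: the function name `assert_clean_workspace` is hypothetical, and the assumption that any leftover `target` directory signals an unclean workspace is mine, not something the Pulsar release process currently defines.

```shell
# Hypothetical pre-release guard: fail if any directory named "target" is
# still present under the given root, i.e. build output from an earlier run.
assert_clean_workspace() {
  root="${1:-.}"
  # -prune stops find from descending into each target directory it reports.
  leftovers=$(find "$root" -type d -name target -prune -print)
  if [ -n "$leftovers" ]; then
    echo "Stale build output found; run 'mvn clean' before deploying:" >&2
    echo "$leftovers" >&2
    return 1
  fi
  return 0
}

# Example usage from the repository root before deploying artifacts:
#   assert_clean_workspace . && mvn deploy
```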

Matching PR in forked repository

PR in forked repository: https://github.com/Shawyeok/pulsar/pull/18

lhotari commented 3 days ago

> A simpler solution is to remove the configurations related to attach and unpack. IntelliJ IDEA assumes the shaded jar path is target/${artifactId}-${version}.jar. However, in Pulsar’s build system, the finalName is set to just ${artifactId} in the parent pom.xml. While I’m unsure of the reasoning behind this setup, we can override the finalName in jetcd-core-shaded/pom.xml. This is the approach I’ve taken in this patch.

Thank you, @Shawyeok! Great work. How did you find out that IntelliJ expects this format? Could we remove overriding of finalName in pom.xml so that we'd use the default everywhere? It seems that the finalName was overridden already in the initial commit of Pulsar.

lhotari commented 3 days ago

@Shawyeok Please submit a similar change to BookKeeper since there's a similar solution to shade jetcd-core in metadata-drivers/jetcd-core-shaded/pom.xml.

Shawyeok commented 3 days ago

@lhotari

> How did you find out that IntelliJ expects this format?

There is a silent error in the pulsar-metadata module dependencies, which I discovered by coincidence.

> Could we remove overriding of finalName in pom.xml so that we'd use the default everywhere? It seems that the finalName was overridden already in the initial commit of Pulsar.

I tried once, but other configurations depend on the current finalName setting, such as: https://github.com/apache/pulsar/blob/81385c5f971913f841d5637c7b8e103138fd76cc/distribution/server/src/assemble/bin.xml#L89-L89

I wasn’t sure what was beneath the surface :-), so I opted for a conservative approach.

Shawyeok commented 3 days ago

> @Shawyeok Please submit a similar change to BookKeeper since there's a similar solution to shade jetcd-core in metadata-drivers/jetcd-core-shaded/pom.xml.

Sure, will do it tomorrow.

codecov-commenter commented 2 days ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 74.35%. Comparing base (bbc6224) to head (8cb145a). Report is 731 commits behind head on master.

Additional details and impacted files

```diff
@@             Coverage Diff              @@
##             master   #23604      +/-   ##
============================================
+ Coverage     73.57%   74.35%   +0.78%
- Complexity    32624    34443    +1819
============================================
  Files          1877     1944      +67
  Lines        139502   147127    +7625
  Branches      15299    16225     +926
============================================
+ Hits         102638   109398    +6760
- Misses        28908    29293     +385
- Partials       7956     8436     +480
```

| [Flag](https://app.codecov.io/gh/apache/pulsar/pull/23604/flags?src=pr&el=flags&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | Coverage Δ | |
|---|---|---|
| [inttests](https://app.codecov.io/gh/apache/pulsar/pull/23604/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `27.57% <ø> (+2.98%)` | :arrow_up: |
| [systests](https://app.codecov.io/gh/apache/pulsar/pull/23604/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `24.40% <ø> (+0.08%)` | :arrow_up: |
| [unittests](https://app.codecov.io/gh/apache/pulsar/pull/23604/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `73.72% <ø> (+0.87%)` | :arrow_up: |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache#carryforward-flags-in-the-pull-request-comment) to find out more.

[see 656 files with indirect coverage changes](https://app.codecov.io/gh/apache/pulsar/pull/23604/indirect-changes?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache)