Open gviedma opened 6 months ago
@Jackie-Jiang / @xiangfu0 / @snleee
Lately we have seen several dependency-related issues that fail the build of our internal applications when the build job attempts to link them with the OSS jars.
We have mostly ended up working around these by excluding dependencies and/or pinning versions in the build pipeline that brings the OSS jars into our artifactory.
Through this issue and the proposal, we want to discuss whether we can reduce such problems in the future.
Not sure if others have seen similar issues recently. cc @ankitsultana
Agree on this approach. @gviedma Can you share a link to Maven's best practices for Dependency Management? It will help everyone onboard to this standard. We have more or less been following this approach recently when addressing dependency vulnerabilities, and strong +1 on enforcing some rules for future dependency management.
@Jackie-Jiang You can find the best practices under the Maven dependency management guide. The following sections are of particular relevance:
`dependencyManagement` section to control the versions of artifacts used in transitive dependencies.
Perhaps this is something that can be enforced during the PR review process or even automated via a linting mechanism.
This looks good to me. It would be great if we can put these guidelines in the Apache Pinot Wiki as well.
@gviedma : are you also planning to restructure/fix all of the existing dependencies? I am wondering if there's a way to ensure that after the restructuring there won't be any breaking changes. (e.g. if we switch to a bom dependency for grpc, it could potentially change some existing dependency which can cause a regression)
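For context on the BOM approach mentioned above: a BOM is typically imported in the root POM's `dependencyManagement` section with `type` set to `pom` and `scope` set to `import`. A minimal sketch, assuming the `io.grpc:grpc-bom` artifact and a hypothetical `grpc.version` property (the version value is a placeholder, not Pinot's actual setting):

```xml
<properties>
  <!-- Placeholder value for illustration; not taken from the Pinot build -->
  <grpc.version>1.60.0</grpc.version>
</properties>

<dependencyManagement>
  <dependencies>
    <!-- Importing the BOM pulls its managed versions into this POM -->
    <dependency>
      <groupId>io.grpc</groupId>
      <artifactId>grpc-bom</artifactId>
      <version>${grpc.version}</version>
      <type>pom</type>
      <scope>import</scope>
    </dependency>
  </dependencies>
</dependencyManagement>
```

One way to spot version shifts that could cause a regression is to compare the output of `mvn dependency:tree` before and after such a change.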
Here is my +100 to this approach. Curious on the best practice for managing the plugins/connectors to handle multiple versions of same library, e.g. Spark2 and Spark3 support, etc.?
I'm fine with this approach. Maybe I would suggest to:
`dependencyManagement` section to import that bom. This should be effectively the same as we have right now, but:
`scope` dependency
3.2.1 should reduce the total package size, the build time, and possible errors related to shading. 3.2.2 should reduce errors related to shading.
i'm very interested in helping here. this has been something i've been working on in various forms on the side. i just submitted a PR to move versions to properties in the main pom as a first step in getting better organized there. see #12736
@timveil - thanks for expressing interest. Just to confirm - are you open to working on action items coming out of this ?
@gviedma - Looks like there is overall consensus. I think it would be great if we could create concrete issues, or at least a bullet list of things to work on, to get to the desired state.
It will help @timveil or others to pick up the work. I can also help distribute / find people.
> It would be great if we can put these guidelines in the Apache Pinot Wiki as well.
@ankitsultana - Sure we can do that sometime soon.
> I am wondering if there's a way to ensure that after the restructuring there won't be any breaking changes. (e.g. if we switch to a bom dependency for grpc, it could potentially change some existing dependency which can cause a regression)
@ankitsultana - I would suggest doing this incrementally, so that at each step the result is hopefully no worse than what we already have, until we get to the final state.
> @timveil - thanks for expressing interest. Just to confirm - are you open to working on action items coming out of this ?
yes. i have a strong interest in improving dependency management for pinot
Here's what I would like to suggest for the next steps:
I can volunteer to take on item #1 to formalize the current proposal and incorporate @gortiz's suggestions. The remaining work items 2 and 3 can be distributed among various folks. I can also help with 2 and 3 as needed and happy to review as well cc @siddharthteotia
Aligned on the next steps.
QQ - At the end, don't we need to change anything in the PR review steps / github actions to ensure the guidelines are followed ?
@gviedma - Let's try to publish (1) soon please. I see multiple dependency upgrade PRs opened and would like them to follow the process.
I have published a set of guidelines along with examples in the following doc. Please review and leave comments when you get a chance. Once we have enough consensus we can reference this doc from the Contributing Guidelines.
FYI @gortiz I attempted to incorporate your BOM subproject idea and import it from the Pinot POM, but that is explicitly disallowed by Maven's dependency management:
> Do not attempt to import a POM that is defined in a submodule of the current POM. Attempting to do that will result in the build failing since it won't be able to locate the POM.
I will follow up with you separately on this and to ensure I captured your plugin dependency suggestions correctly in my doc. Thanks!
cc @siddharthteotia
> FYI @gortiz I attempted to incorporate your BOM subproject idea and import it from the Pinot POM, but that is explicitly disallowed by Maven's dependency management:
I see. I didn't know about that limitation, but it makes sense. We can find a way to create that POM. That would be especially useful for people who want to create their own plugins outside the Pinot repo, but I guess that pattern is not very common and they can still just declare the Pinot POM as parent, so I would vote +1 to the solution proposed by @gviedma.
BTW I've added some comments on the linked document, but I think it is also fine and it would be great to reference that doc from the contribution guidelines.
I've created a PR to incorporate the dependency management guidelines into the pinot-docs repository. Once that is merged, I will cut a second PR against the main pinot repo to update https://github.com/apache/pinot/blob/master/CONTRIBUTING.md and point at the new dependency management guidelines cc @siddharthteotia
Problem Statement
Pinot’s approach to dependency management can be ad-hoc and inconsistent, with dependency versions being resolved in multiple different ways:
The lack of consistency results in the following problems:
Proposal
Ideally, Pinot should leverage Maven’s best practices for Dependency Management. At a high level, this consists of standardizing along the following areas:
We should also consider enforcing the above recommendation via a linting mechanism to ensure consistency.
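One possible option for such a mechanism, offered as an assumption rather than something the proposal prescribes, is the Maven Enforcer Plugin. A minimal sketch using two of its built-in rules (the execution id is a hypothetical name):

```xml
<build>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-enforcer-plugin</artifactId>
      <executions>
        <execution>
          <!-- Hypothetical execution id, chosen for illustration -->
          <id>enforce-dependency-rules</id>
          <goals>
            <goal>enforce</goal>
          </goals>
          <configuration>
            <rules>
              <!-- Fails the build when dependencies resolve to conflicting versions -->
              <dependencyConvergence/>
              <!-- Fails the build when the same dependency is declared more than once in a POM -->
              <banDuplicatePomDependencyVersions/>
            </rules>
          </configuration>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>
```

Neither rule directly forbids a `version` element in a submodule, so a convention like "versions only in the root pom" would likely still need a custom rule or a check during PR review.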
Example of How to Specify a Dependency Version
For example, do not do this in a submodule:
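A minimal sketch of the discouraged pattern, assuming a hypothetical artifact `com.example:example-lib` and version `1.2.3` (neither is taken from the Pinot build):

```xml
<!-- Submodule pom.xml: the version is hard-coded locally (discouraged) -->
<dependencies>
  <dependency>
    <groupId>com.example</groupId>
    <artifactId>example-lib</artifactId>
    <version>1.2.3</version>
  </dependency>
</dependencies>
```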
Instead, follow this pattern in the root pom:
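A minimal sketch of the recommended pattern, assuming a hypothetical artifact `com.example:example-lib` and a hypothetical `example-lib.version` property (neither is taken from the Pinot build):

```xml
<!-- Root pom.xml: the version is defined once as a property and managed centrally -->
<properties>
  <example-lib.version>1.2.3</example-lib.version>
</properties>

<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>com.example</groupId>
      <artifactId>example-lib</artifactId>
      <version>${example-lib.version}</version>
    </dependency>
  </dependencies>
</dependencyManagement>

<!-- Submodule pom.xml: no version element; the managed version above is inherited -->
<dependencies>
  <dependency>
    <groupId>com.example</groupId>
    <artifactId>example-lib</artifactId>
  </dependency>
</dependencies>
```

With this layout, upgrading the library means changing a single property in the root pom rather than touching each submodule.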