envoyproxy / java-control-plane

Java implementation of an Envoy gRPC control plane
Apache License 2.0
293 stars 136 forks source link

Xds v3 #139

Closed sschepens closed 4 years ago

sschepens commented 4 years ago

Following the same implementation pattern than go-control-plane, to support XDS V3 I created a separate cache-v3 and server-v3 packages which are simple copies of the v2 packages but using v3 protos and envoy configured to use v3

slonka commented 4 years ago

Github diff is useless. I will review based on diffing directiries:

git diff --no-index cache cache-v3

cache.diff.txt

-class GroupCacheStatusInfo<T> implements StatusInfo<T> {
+public class GroupCacheStatusInfo<T> implements StatusInfo<T> {

Why did you add public?


-              // TODO: Filter#getConfig() is deprecated, migrate to use Filter#getTypedConfig().
-              structAsMessage(filter.getConfig(), config);
+              HttpConnectionManager config = filter.getTypedConfig().unpack(HttpConnectionManager.class);

Maybe you could change it in v2 as well?


-        .setType(DiscoveryType.STRICT_DNS)
-        .addHosts(Address.newBuilder()
-            .setSocketAddress(SocketAddress.newBuilder()
-                .setAddress(address)
-                .setPortValue(port)
-                .setProtocolValue(Protocol.TCP_VALUE)))
+        .setType(Cluster.DiscoveryType.STRICT_DNS)
+        .setLoadAssignment(createEndpoint(clusterName, address, port))

This was duplicated code right?


Now server:

git diff --no-index server server-v3

server.diff.txt

Seems fine.


Overall good, you could simplify qualifiers because there are a lot of lines like:

-        .setEdsClusterConfig(EdsClusterConfig.newBuilder()
+        .setEdsClusterConfig(Cluster.EdsClusterConfig.newBuilder()

A couple of questions to be answered.

Side note: I know how to produce diffs like that but not everyone, please do not dump 5KLOC PR just like that, try to make it easier for other reviewers.

sschepens commented 4 years ago

@slonka thanks for the review!

I know it's hard to read this kind of PRs, but i think it's kind of inevitable, go-control-plane had the same huge PRs for implementing v3 https://github.com/envoyproxy/go-control-plane/pull/280 for example, has 3k LOC.

Why did you add public?

Don't really remember, i will roll this back

Maybe you could change it in v2 as well?

We could, but this actually doesn't hurt in v2, but has been deprecated in v3, could update it in v2 in a separate PR

This was duplicated code right?

It's not duplicated code, but in v3 the hosts field in Cluster has been removed in favor of in place cluster_load_assignement

Will simplify qualifiers

codecov-commenter commented 4 years ago

Codecov Report

Merging #139 into master will increase coverage by 0.03%. The diff coverage is 88.61%.

Impacted file tree graph

@@             Coverage Diff              @@
##             master     #139      +/-   ##
============================================
+ Coverage     88.55%   88.58%   +0.03%     
- Complexity      207      412     +205     
============================================
  Files            24       48      +24     
  Lines           629     1244     +615     
  Branches         53      106      +53     
============================================
+ Hits            557     1102     +545     
- Misses           54      106      +52     
- Partials         18       36      +18     
Impacted Files Coverage Δ Complexity Δ
...ntrolplane/v3/server/DiscoveryServerCallbacks.java 20.00% <20.00%> (ø) 1.00 <1.00> (?)
...v3/server/serializer/ProtoResourcesSerializer.java 33.33% <33.33%> (ø) 1.00 <1.00> (?)
...xy/controlplane/v3/cache/GroupCacheStatusInfo.java 66.66% <66.66%> (ø) 2.00 <2.00> (?)
...io/envoyproxy/controlplane/v3/cache/Resources.java 69.56% <69.56%> (ø) 15.00 <15.00> (?)
...ver/serializer/CachedProtoResourcesSerializer.java 71.42% <71.42%> (ø) 3.00 <3.00> (?)
.../envoyproxy/controlplane/v3/cache/SimpleCache.java 81.06% <81.06%> (ø) 37.00 <37.00> (?)
...lane/v3/server/DiscoveryRequestStreamObserver.java 85.18% <85.18%> (ø) 20.00 <20.00> (?)
.../io/envoyproxy/controlplane/v3/cache/Snapshot.java 89.47% <89.47%> (ø) 24.00 <24.00> (?)
...proxy/controlplane/v3/cache/SnapshotResources.java 92.30% <92.30%> (ø) 8.00 <8.00> (?)
...v3/server/callback/SnapshotCollectingCallback.java 95.34% <95.34%> (ø) 14.00 <14.00> (?)
... and 38 more

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update df5fac6...f98f7c8. Read the comment docs.

chemicL commented 4 years ago

I appreciate all the hard work you put into this.

My concern though is that the copy-paste approach will make it very difficult to maintain. I'd rather see a common facade which would implement the important caching logic whenever possible. Separating the core of this project from the particular XDS version should be possible unless it enforces a complete redesign of this project. From a quick skim it seems that it's not completely out of reach. We might need to add some custom improvements to the core logic of the control plane and we'd either duplicate the work (and possibly not be able test v3 in real life before we migrate), or we'd just improve one part (which one?).

I'm not completely against the approach you've taken, I see the pragmatism in it, however if this is what needs to be done, I'd love to see a documented strategy/contract for adding improvements with regards to how both versions are treated.

sschepens commented 4 years ago

@chemicL I know that's not the best option, but implementing a version-agnostic option would probably involve a lot more work, I think that generics would not solve the problems we need, we would probably need to wrap some classes into interfaces.

I actually didn't intend this to get merged, we just went ahead and did it since we were needing it and it seemed go-control-plane was headed in the same direction.

mpuncel commented 4 years ago

@chemicL's comment is spot on regarding the pragmatism but also concern of duplicating critical logic.

It does seem like a lot of the new files don't actually import any proto messages, and in some cases they only import them for a doc comment. That would make it seem like we could make a "cache-core" module with shared functionality around stream management, watch status, etc.

@sschepens Are you planning any more work on this PR, or should others build on this to try to get to a more minimal and merge-able change (or reach the same conclusion you have that the best path forward is duplication)

mpuncel commented 4 years ago

FYI I'll probably take a whack at a PR at the other extreme of reusing as much code as possible at the expense of a bunch of generics which might make the code less understandable. I'm not sure if that will be better or not but then we can compare approaches

chemicL commented 4 years ago

@mpuncel that sounds like a good idea and may pay off in the future api versions too. Are you still planning to submit a PR? If so, when do you suppose to find the time for it? We will need to do it some time later in this quarter or beginning of next, but if you are already working on it, there's no use duplicating the work.

mpuncel commented 4 years ago

@chemicL I put up my alternative PR, https://github.com/envoyproxy/java-control-plane/pull/140