googleapis / gapic-generator-python

Generate Python API client libraries from Protocol Buffers.
Apache License 2.0
116 stars 66 forks source link

Resource path helper functions collide when resources in different packages share a name #1060

Open parthea opened 2 years ago

parthea commented 2 years ago

During review of https://github.com/googleapis/python-aiplatform/pull/777, @nicain found that function test_dataset_path is defined 3 times in file test_dataset_path.py here, here and here.

busunkim96 commented 2 years ago

I remember talking about this with @dizcology before - I believe the cause is multiple resources namedDataset referenced in that service.

option (google.api.resource_definition) = {
  type: "automl.googleapis.com/Dataset"
  pattern: "projects/{project}/locations/{location}/datasets/{dataset}"
};
option (google.api.resource_definition) = {
  type: "datalabeling.googleapis.com/Dataset"
  pattern: "projects/{project}/datasets/{dataset}"
};
message Dataset {
  option (google.api.resource) = {
    type: "aiplatform.googleapis.com/Dataset"
    pattern: "projects/{project}/locations/{location}/datasets/{dataset}"
  };

I believe this is the only API where this occurs.

@software-dov Is there a good way to prevent these resources from clobbering each other?

tseaver commented 2 years ago

Not only is the testcase triplicated, so are the dataset_path and parse_dataset_path methods (see https://github.com/googleapis/python-aiplatform/issues/897): the generated is capable of reordering them, which is a breaking change.

software-dov commented 2 years ago

@software-dov Is there a good way to prevent these resources from clobbering each other?

I don't know that there's a perfect-yet-backwards-compatible solution. We can definitely sort the resources by name to maintain stable order and easily detect collisions. The two main options I can think of are: 1) Pick one resource if there's a name collision 2) Disambiguate names if there's a collision

I'm guessing most people would prefer 2.

busunkim96 commented 2 years ago

@software-dov Yep I vote for 2.

Can the resources from other APIs be disambiguated with the service name?

Luckily it looks like the aiplatform version of dataset (projects/{project}/locations/{location}/datasets/{dataset}) is the last to be defined in the current published library, so the change would not be breaking.

software-dov commented 2 years ago

@busunkim96 Seems like a good solution. More generally, we can make a map from resource name to a list of resources sorted on proximity (ours vs theirs) then by name.

atulep commented 2 years ago

@parthea any updates

parthea commented 2 years ago

This work is in progress and planned for the current iteration with a target date of May 13th