apache / accumulo-access

Apache Accumulo Access Control Library
https://accumulo.apache.org
Apache License 2.0
4 stars 11 forks source link

Duplicate authorization will throw an Illegal Argument Exception #66

Closed phrocker closed 6 months ago

phrocker commented 6 months ago

https://github.com/apache/accumulo-access/blob/2da31d3643916c020b14c9e61fa632543dbc56b7/src/main/java/org/apache/accumulo/access/Authorizations.java#L67

Set.of will throw an IllegalArgumentException as expected, but as a consumer to the API we should avoid the exception and prune duplicates .

Would love thoughts. I think it's reasonable to suggest that the client of the Authorizations API should do this work.

I'll submit a PR for this trivial change if there is any concurrence from @keith-turner or @dlmarion that we can de-dupe in the of helper method in Authorization

dlmarion commented 6 months ago

@phrocker - do you know if Set.copyOf does the same?

Update: I looked, it looked like it does not.

dlmarion commented 6 months ago

The other API objects have good javadoc coverage, I wonder if that's what is missing here.

phrocker commented 6 months ago

The other API objects have good javadoc coverage, I wonder if that's what is missing here.

I think that is fair. If I knew this beforehand I would have guarded against it.

Apache Accumulo leverages a hashset which guards against this with the linked code:

private static Set<ByteSequence> createInternalSet(int size) {
    if (size < 1) {
      return EMPTY_AUTH_SET;
    } else {
      return new HashSet<>(size);
    }
  }

  private static List<byte[]> createInternalList(int size) {
    if (size < 1) {
      return EMPTY_AUTH_LIST;
    } else {
      return new ArrayList<>(size);
    }

  }

  /**
   * Constructs an authorization object from a collection of string authorizations that have each
   * already been encoded as UTF-8 bytes. Warning: This method does not verify that each encoded
   * string is valid UTF-8.
   *
   * @param authorizations collection of authorizations, as strings encoded in UTF-8
   * @throws IllegalArgumentException if authorizations is null
   * @see #Authorizations(String...)
   */
  public Authorizations(Collection<byte[]> authorizations) {
    checkArgument(authorizations != null, "authorizations is null");
    this.auths = createInternalSet(authorizations.size());
    this.authsList = createInternalList(authorizations.size());
    for (byte[] auth : authorizations) {
      auths.add(new ArrayByteSequence(auth));
    }
    checkAuths();
  }
dlmarion commented 6 months ago

Set.copyOf also creates a HashSet from the collection, then calls Set.of. If we put the responsibility on the user to provide a deduplicated array when calling Authorizations of(String... authorizations), then it's likely that they would use a Set and call Authorizations of(Collection<String> authorizations) instead.

I wonder if we should remove the two of implementations and do the following. This would reduce the extra HashSet object allocation if we called Set.copyOf.

  public static Authorizations of(HashSet<String> authorizations) {
    return new Authorizations(Set.of(authorizations.toArray()));
  }
keith-turner commented 6 months ago

I did not realize Set.of would throw an exception when there are dupes. We could do the following in the impl

  public static Authorizations of(String... authorizations) {
    return new Authorizations(Set.copyOf(Arrays.asList(authorizations));
  }

Arrays.asList() wraps the array and does not copy it.