NationalSecurityAgency / datawave

DataWave is an ingest/query framework that leverages Apache Accumulo to provide fast, secure data access.
https://code.nsa.gov/datawave
Apache License 2.0

True Unique capability #2635

Open ivakegg opened 1 week ago

ivakegg commented 1 week ago

The request is to be able to modify the unique capability to return documents that are truly unique within the context of the results. This means that only those unique results that actually existed only once in the non-uniquified result set are returned. I am thinking that this can be extended to specify a max-count on the unique results, such that only those unique results that occurred at most the specified max-count times are actually returned.

ivakegg commented 1 week ago

Perhaps #MAX_UNIQUE_COUNT(1) && #UNIQUE(FIELD1,FIELD2)
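
To make the intended semantics concrete, here is a minimal sketch of what a max-count filter on top of unique could compute. It is not DataWave code: the class name `TrueUniqueSketch`, the method `trueUnique`, and the representation of a document as a `Map<String, String>` are all assumptions for illustration only, and the sketch works on an in-memory list rather than on a query iterator.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; none of these names exist in DataWave.
final class TrueUniqueSketch {

    // Returns one representative document for each combination of values of the
    // unique fields that occurs at most maxCount times in the full result set.
    static List<Map<String, String>> trueUnique(List<Map<String, String>> docs,
                                                List<String> fields,
                                                int maxCount) {
        // Group documents by their values for the unique fields, in first-seen order.
        Map<List<String>, List<Map<String, String>>> groups = new LinkedHashMap<>();
        for (Map<String, String> doc : docs) {
            List<String> key = new ArrayList<>();
            for (String field : fields) {
                key.add(doc.get(field)); // null when the document lacks the field
            }
            groups.computeIfAbsent(key, k -> new ArrayList<>()).add(doc);
        }

        // Drop every group that occurred more than maxCount times.
        List<Map<String, String>> result = new ArrayList<>();
        for (List<Map<String, String>> group : groups.values()) {
            if (group.size() <= maxCount) {
                result.add(group.get(0));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Map<String, String>> docs = List.of(
                Map.of("FIELD1", "a", "FIELD2", "x"),
                Map.of("FIELD1", "a", "FIELD2", "x"),
                Map.of("FIELD1", "b", "FIELD2", "y"));
        List<String> fields = List.of("FIELD1", "FIELD2");

        // With a max-count of 1, only the (b, y) combination survives.
        System.out.println(trueUnique(docs, fields, 1).size());
        // With a max-count of 2, both combinations survive.
        System.out.println(trueUnique(docs, fields, 2).size());
    }
}
```

Under these assumptions, a max-count of 1 corresponds to the "truly unique" case, and a sufficiently large max-count degenerates to the existing unique behavior. Note that the sketch has to see the whole result set before it can emit anything, since a combination can only be confirmed as under the limit once all results have been counted.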