mkodekar / guava-libraries

Automatically exported from code.google.com/p/guava-libraries
Apache License 2.0
0 stars 0 forks source link

Canonicalize a whole RangeSet at once #1832

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
I recently noticed/realized that RangeSets over discrete numbers aren't fully 
reduced. I fully understand why this happens (to handle continuous ranges, such 
as Floats), however there are cases when it would be really useful to fully 
reduce a discrete RangeSet. My current use case I'd like to summarize/log a 
bunch of Long IDs that am storing in a database. Adding `Range#singleton`s will 
not suffice since the ranges aren't reduced.

To workaround this, I wrote a piece of code that does exactly this:

    private void reduceRangeSet(RangeSet<Long> rangeSet) {
        restart: while (true) {
            Set<Range<Long>> ranges = rangeSet.asRanges();
            if (ranges.size() <= 1)
                break;
            Range<Long> prevRange = null;
            for (Range<Long> range : ranges) {
                if (prevRange != null && range.contains(range.lowerEndpoint()) // Takes care of checking open/close range end
                        && prevRange.contains(range.lowerEndpoint() - 1)) {
                    rangeSet.add(Range.closedOpen(range.lowerEndpoint() - 1, range.lowerEndpoint()));

                    // Must restart outer loop since rangeSet.asRanges() might not update on changes.
                    continue restart;
                }
                prevRange = range;
            }

            // If we came here we found nothing to reduce.
            break;
        }
    }

The code does it's job for me, but I suggest it's rewritten to not use nested 
loops if possible.

A proposal would be to add something similar to maybe 
"RangeSets.reduce(RangeSet<Long> ..." and possibly also add 
"RangeSets.reduce(RangeSet<Integer> ..." and maybe even 
"RangeSets.reduce(RangeSet<Byte> ...".

Original issue reported on code.google.com by jens.ran...@tink.se on 12 Aug 2014 at 1:59

GoogleCodeExporter commented 9 years ago
It's not clear to me why you'd use this approach, or why you'd special-case for 
all these types.  The idea is intended to be that you call 
Range.canonical(DiscreteDomain), passing in the appropriate DiscreteDomain for 
your type, before adding or removing it from the RangeSet.  

If you canonicalize all Ranges you use to modify your RangeSet, your RangeSet 
should only contain canonical ranges no matter how ranges get merged, 
subtracted, etc.

Is there a reason that solution doesn't work for you?

Original comment by lowas...@google.com on 12 Aug 2014 at 6:32

GoogleCodeExporter commented 9 years ago
Ignorant question: would it make any sense to add a 
canonicalize(DiscreteDomain) method to RangeSet, which would basically replace 
every Range in the set with its canonical version? Maybe that would result in a 
different set of ranges than you'd get if you only added canonical ranges to 
the set in the first place?

Original comment by cgdecker@google.com on 12 Aug 2014 at 7:10

GoogleCodeExporter commented 9 years ago
lowas: Did not know about canonical. It solved my issue. In my case calling 
canonical for every add is a better solution as I'm using RangeSet to reduce 
memory pressure.

cgdecker: A canonicalize method is, after my fix, not necessary. So far, I have 
not had a use case for that now that I know of canonical form of a Range.

Original comment by jens.ran...@tink.se on 15 Aug 2014 at 5:42

GoogleCodeExporter commented 9 years ago
I love the happy resolution to your problem.

Let's keep this open to consider the value of a RangeSet canonicalization 
function.

Original comment by kevinb@google.com on 15 Aug 2014 at 5:52

GoogleCodeExporter commented 9 years ago
This issue has been migrated to GitHub.

It can be found at https://github.com/google/guava/issues/<issue id>

Original comment by cgdecker@google.com on 1 Nov 2014 at 4:08

GoogleCodeExporter commented 9 years ago

Original comment by cgdecker@google.com on 1 Nov 2014 at 4:17

GoogleCodeExporter commented 9 years ago

Original comment by cgdecker@google.com on 3 Nov 2014 at 9:07