cujojs / most

Ultra-high performance reactive programming
MIT License

distinct is ambiguous #94

Closed: aaronshaf closed this issue 9 years ago

aaronshaf commented 9 years ago

With:

stream:            -1-2-2-1-4-4-5->

Some might expect:

stream.distinct(): -1-2-----4---5->

But distinct only concerns adjacent duplicates. Perhaps you could use distinct for removing lifetime duplicates, and distinctUntilChanged (cf. RxJS) for removing adjacent duplicates?
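To make the two readings concrete, here is a minimal sketch that uses plain arrays in place of streams. The function names are made up for illustration and are not part of most.js.

```javascript
// Drop a value only when it equals the value immediately before it.
function skipAdjacentDuplicates(items) {
  return items.filter((x, i) => i === 0 || x !== items[i - 1]);
}

// Drop a value if it has appeared anywhere earlier in the sequence.
// This relies on having the whole array at hand, which a stream does not.
function skipLifetimeDuplicates(items) {
  return items.filter((x, i) => items.indexOf(x) === i);
}

const events = [1, 2, 2, 1, 4, 4, 5];
console.log(skipAdjacentDuplicates(events)); // [1, 2, 1, 4, 5]
console.log(skipLifetimeDuplicates(events)); // [1, 2, 4, 5]
```

The second result matches the diagram above; the first is what an adjacent-only filter produces for the same input.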

aaronshaf commented 9 years ago

Or perhaps keep the current distinct functionality, and add a unique method.

briancavalier commented 9 years ago

Hey @aaronshaf, sorry for any confusion. I like the name distinct because it's short, but I do agree with you that some folks might interpret it to mean "globally unique". I'm not crazy about distinctUntilChanged, and I think we can probably come up with a good name that conveys "remove adjacent duplicates". Here's some brainstorming:

Thoughts on those? Other ideas?

briancavalier commented 9 years ago

> and add a unique method.

Do you have use cases in mind for a "globally unique" combinator? One potential issue is that for infinite streams, the bookkeeping required for global uniqueness (a set or map) can grow without bound. I wonder if "recently unique" is useful (with some configurable definition of "recent")?
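One possible reading of "recently unique" is sketched below, assuming "recent" means the last N distinct values that were let through. The helper name `recentlyUnique` and the `windowSize` parameter are hypothetical, and an array stands in for a stream.

```javascript
// Hypothetical helper: keeps a value only if it is not among the last
// `windowSize` distinct values that passed the filter.
function recentlyUnique(windowSize) {
  const recent = new Set(); // a Set iterates in insertion order
  return x => {
    if (recent.has(x)) return false;
    recent.add(x);
    if (recent.size > windowSize) {
      // Evict the oldest remembered value so memory stays bounded.
      recent.delete(recent.values().next().value);
    }
    return true;
  };
}

const events = [1, 2, 1, 3, 4, 1];
console.log(events.filter(recentlyUnique(2))); // [1, 2, 3, 4, 1]
```

The final 1 gets through because it has already been evicted from the window by the time it reappears; the bookkeeping never holds more than `windowSize` values.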

aaronshaf commented 9 years ago

skipDuplicates and skipRepeats sound great. And having overlap with bacon is a plus.

You're probably right about the bookkeeping cost of .unique -- probably not good to encourage its use. One can implement it anyway with .filter and some state.
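A rough sketch of the ".filter and some state" approach, assuming the stream's filter accepts an ordinary predicate function. The factory name `firstOccurrenceOnly` is made up, and an array stands in for a stream here.

```javascript
// Hypothetical factory: returns a predicate that remembers every value it
// has seen so far and rejects anything it has seen before.
function firstOccurrenceOnly() {
  const seen = new Set(); // grows with the number of distinct values
  return x => {
    if (seen.has(x)) return false;
    seen.add(x);
    return true;
  };
}

const result = [1, 2, 2, 1, 4, 4, 5].filter(firstOccurrenceOnly());
console.log(result); // [1, 2, 4, 5]
```

Because the state lives in the closure, each stream would need its own predicate from a fresh `firstOccurrenceOnly()` call; sharing one predicate would mix up the history of two streams.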

unscriptable commented 9 years ago

skipSubsequent() popped into my head as potentially the most semantically correct. However, now my brain says "skip subsequent WHAT?"

To help out with Aaron's filter idea, what if we handled the state for the user? Something like this:

skipAdjacent((a, b) => a === b)

briancavalier commented 9 years ago

@unscriptable At first, I thought you meant the operation currently known as distinctBy, but now I realize you may be saying something different. Do you mean something like loop, but rather than transforming, it would filter?

briancavalier commented 9 years ago

> skipDuplicates and skipRepeats sound great. And having overlap with bacon is a plus.

@aaronshaf Cool, thanks. I feel like skipRepeats conveys the behavior more closely than skipDuplicates. I agree both are better than distinct. While it's not necessarily a goal of most.js to have API compatibility with other impls, I can appreciate the learnability advantages to the community. I'd be ok going with skipDuplicates.

unscriptable commented 9 years ago

Yes. I just reinvented distinctBy. doh.
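For reference, the comparator-driven idea from the `skipAdjacent((a, b) => a === b)` suggestion could look roughly like the sketch below. It is written as a predicate factory over plain arrays, not as the actual distinctBy implementation, and the names are illustrative only.

```javascript
// Hypothetical helper: `equals(previous, current)` decides what counts as
// a repeat of the immediately preceding value.
function skipAdjacent(equals) {
  let hasPrevious = false;
  let previous;
  return x => {
    const isRepeat = hasPrevious && equals(previous, x);
    hasPrevious = true;
    previous = x; // always compare against the most recent value
    return !isRepeat;
  };
}

const sameIgnoringCase = (a, b) => a.toLowerCase() === b.toLowerCase();
const words = ['a', 'A', 'b', 'B', 'a'];
console.log(words.filter(skipAdjacent(sameIgnoringCase))); // ['a', 'b', 'a']
```

Passing `(a, b) => a === b` gives the plain adjacent-duplicate behaviour; a looser comparator widens what is treated as a repeat.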

briancavalier commented 9 years ago

@unscriptable the "loop-like stateful filter" idea is actually kinda cool, imho. You can do it via capturing right now, but if it's something people seem to do often, we could consider providing it.

briancavalier commented 9 years ago

I had another thought recently: changes(). It sort of comes at the problem from a different angle: instead of saying what to skip, it describes what the resulting stream is. On one hand, I like that. On the other, there are a couple of issues: 1) it's a noun, and nouns don't make great function names, 2) some other libs use the method name changes to turn a Property/Behavior (i.e. a continuous signal) into an event stream.

Is there a variant on the word "change" that might work here?

unscriptable commented 9 years ago

I agree, "changes" is a really great noun.

Hmmm... I'm a bit worried about API bloat. Does it make sense to do something like this?

var changes = most.filters.changes;
var myChanges = myStream.filter(changes);

briancavalier commented 9 years ago

@unscriptable That is interesting! If we had a stateful filtering function (like loop, but for filtering rather than transforming), it could be implemented quite efficiently (i.e. without capturing). I need to think on that a bit more.
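As a thought experiment, a "loop-like stateful filter" might have the shape sketched below: a seed plus a step function that returns both a keep/skip decision and the next state. Nothing here is an existing most.js API; `filterWithState`, the `{ keep, state }` result shape, and the array stand-in for a stream are all assumptions.

```javascript
// Hypothetical combinator: `step(state, value)` returns { keep, state }.
// The state is threaded through explicitly instead of being captured
// in a closure.
function filterWithState(step, initialState, items) {
  const kept = [];
  let state = initialState;
  for (const x of items) {
    const result = step(state, x);
    if (result.keep) kept.push(x);
    state = result.state;
  }
  return kept;
}

// "changes" as a pure step function. The previous value is wrapped in an
// object so that "no previous value yet" (null) cannot be confused with
// a real value.
const changes = (prev, x) => ({
  keep: prev === null || prev.value !== x,
  state: { value: x }
});

console.log(filterWithState(changes, null, [1, 2, 2, 1, 4, 4, 5])); // [1, 2, 1, 4, 5]
```

Since `changes` holds no state of its own, the same function could be reused across any number of streams, which is the main difference from the capturing approach.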

In the meantime, a couple more name ideas: takeChanges, compress

briancavalier commented 9 years ago

Another data point: @gozala's transducer lib calls it dropRepeats, which would translate into skipRepeats in most.js terminology. That's still one of my favorite names so far.

briancavalier commented 9 years ago

I went with skipRepeats in #119. If you feel strongly that there's a better name, please post there asap!

unscriptable commented 9 years ago

I just noticed that Haskell uses the word "nub" (which means "essence"). It's more like the .unique() function described by @aaronshaf, though.