cujojs / most

Ultra-high performance reactive programming
MIT License

Cyclical subscriptions and chunking #501

Closed (myndzi closed 6 years ago)

myndzi commented 6 years ago

I'm investigating most.js, but having trouble with two patterns and can't seem to figure out from the docs or searching whether they're unsupported or I just can't tell how to perform them. I'm hoping you can clarify either what should be used or that most.js is not the correct library.

Background: I want to process links on a website. I'm making a request to load a page, parsing that out into a stream of html element objects, and transforming the result stream to collect information.

1) Cyclical subscriptions: For a given site, I want to find links on the page; for each link, I want to conditionally perform the same task: if it's an on-site link, I want to (at some point later) process this page if I haven't seen it before. If it's an off-site link, I want to check if it's valid.

I've already kind of had to hack things by creating an event emitter to act as a proxy to get a stream of elements from the request -> parser pipeline, but I can't work out how to feed the on-site links back into the process.
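For reference, the feedback loop described above can be sketched in plain JavaScript, without most.js or any stream library. Everything here is hypothetical: the `fakeSite` map stands in for the request -> parser pipeline, `isOnSite` assumes on-site links start with "/", and the function is synchronous only to keep the sketch short. It is not a suggestion of how most.js itself would express this.

```javascript
// Hypothetical in-memory "site": page URL -> links found on that page.
const fakeSite = {
  '/': ['/about', '/blog', 'https://example.org/x'],
  '/about': ['/', '/blog'],
  '/blog': ['/about', 'https://example.org/y'],
};

// Assumption for this sketch: on-site links are the ones starting with "/".
const isOnSite = link => link.startsWith('/');

function crawl(startUrl, getLinks) {
  const seen = new Set([startUrl]); // pages already queued or processed
  const pending = [startUrl];       // the feedback channel: new pages go back in here
  const visited = [];
  const offSite = [];

  while (pending.length > 0) {
    const url = pending.shift();
    visited.push(url);
    for (const link of getLinks(url)) {
      if (!isOnSite(link)) {
        offSite.push(link);         // a real crawler would check validity here
      } else if (!seen.has(link)) {
        seen.add(link);
        pending.push(link);         // feed the on-site link back into the process
      }
    }
  }
  return { visited, offSite };
}

const result = crawl('/', url => fakeSite[url] || []);
console.log(result.visited); // pages in the order they were processed
console.log(result.offSite); // off-site links in the order they were found
```

The `pending` queue plays the role that an explicitly pushed-to stream would play in a reactive version: the output of processing one page becomes input for later iterations.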

2) Chunking: It seems that I want something close to chain, but without the merge functionality. For each stream of objects, I want to apply selection criteria, outputting a higher order stream for which each item is a span of items from the original stream. So, if the input stream is equivalent to [a, b, c, d, e, f, g, h...], I might want to output [[c, d], [g, h], ...]

It seems I could use the event emitter hack again, but I feel like there's got to be a better way. I could apply .skipWhile and .takeWhile to get a chunk, but it's unclear how to compose each such chunk into a higher order stream -- let alone how to switch back and forth based on the stream context. .filter could, with some state, be made to emit only the elements I care about, but they'd be flattened into the output at that point. .scan produces 1+N elements in the output and .reduce produces 1.
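To make the desired transformation concrete, here is a minimal sketch over a plain iterable instead of a stream. The `spans` generator and its `isStart` / `isEnd` predicates are made-up names for illustration, not part of most.js. A span opens at the first item matching `isStart` and closes at the next item matching `isEnd`, inclusive.

```javascript
// Yields arrays of consecutive items, each running from an item that
// satisfies isStart through the next item that satisfies isEnd (inclusive).
function* spans(items, isStart, isEnd) {
  let current = null; // null means "not inside a span"
  for (const item of items) {
    if (current === null) {
      if (isStart(item)) current = [item];
    } else {
      current.push(item);
    }
    if (current !== null && isEnd(item)) {
      yield current;
      current = null;
    }
  }
  // A span still open when the input ends is dropped in this sketch.
}

const input = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h'];
const chunks = [...spans(
  input,
  x => x === 'c' || x === 'g', // hypothetical start criterion
  x => x === 'd' || x === 'h'  // hypothetical end criterion
)];
console.log(chunks); // [ [ 'c', 'd' ], [ 'g', 'h' ] ]
```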

I'm sorry if this is a bit of a helpdesk request, but I figure the ask here is clearer documentation or potentially code that better supports such use cases :)

myndzi commented 6 years ago

I... just found .loop and .unfold. I think these are both close to what I am looking for ¯\_(ツ)_/¯

I'm not sure these help that much though, since what I really want is not to return a single 'next' value, but to append to a stream of values :\

TylorS commented 6 years ago

@motorcycle/run might be up your alley if you need cyclic streams.

myndzi commented 6 years ago

So I'm pretty much barking up the wrong tree?

TylorS commented 6 years ago

Sorry, to be perfectly honest, my phone did not render the 'shrug' emoji properly in the last message (I'm on my laptop now), so I misinterpreted you as having already answered the second question. I was then attempting to provide a possible solution to the first question.

In addition to Motorcycle, which is a most.js toolkit and hopefully a possible solution to your first question, I'd also like to note that most-subject has a function, attach, which can be used in conjunction with the other APIs to make circular dependencies.

Another factor in my confusion is that your assumption about loop is right on the money for at least one solid approach to the chunking in your second question. We have a recipe for what seems to be the goal, based on the transformation between the two arrays.

https://github.com/cujojs/most/wiki/Pairwise

I hope this helps a bit more.

myndzi commented 6 years ago

Ah, sorry if I misunderstood. Motorcycle looked unrelated from its readme :) I'll dig into it tomorrow and see what I can see. most-subject might be the shape of what I am looking for for that part; I'm not the most familiar with FRP paradigms though, so I may have been stuck thinking that there's some "proper" way I should be doing this that doesn't involve explicitly pushing to a stream.

The pairwise example is a bit flawed for this case, since it seems to emit 1) static values and 2) overlapping values. That is, for a stream [a, b, c, d, e] it will emit arrays [a, b], [b, c], and so on. I think this could be modified such that I essentially maintain an accumulator that stores up the bits I want and... emits empty values? and then flatten them? but it seemed like there would be a way to define essentially a slice of the parent stream and emit those values.
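The "accumulator that emits empty values, then flatten/filter" idea can be sketched as follows. This is plain JavaScript over an array: `loopArray` is a hypothetical stand-in that only mimics the stepper shape used by loop (the stepper takes the current seed and an item, and returns the next seed plus a value to emit). It is an illustration of the idea, not the wiki recipe.

```javascript
// Plain-array stand-in for a loop-style operator: the stepper receives the
// current seed and an item, and returns { seed, value }.
function loopArray(stepper, seed, items) {
  const out = [];
  let state = seed;
  for (const item of items) {
    const step = stepper(state, item);
    state = step.seed;
    out.push(step.value);
  }
  return out;
}

// Buffer items; emit a full pair once two are buffered, otherwise emit [].
const pairStepper = (buffer, item) => {
  const next = buffer.concat([item]);
  return next.length === 2
    ? { seed: [], value: next }
    : { seed: next, value: [] };
};

const emitted = loopArray(pairStepper, [], ['a', 'b', 'c', 'd', 'e']);
// Drop the empty placeholders, leaving only the non-overlapping pairs.
const pairs = emitted.filter(chunk => chunk.length > 0);
console.log(pairs); // [ [ 'a', 'b' ], [ 'c', 'd' ] ]
```

Note that the trailing `'e'` never leaves the buffer here, because a stepper of this shape has no way to know the input has ended.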

For a simple example, I might do something like:

tagStream.skipWhile(tag => tag.name !== 'a')
    .skipAfter(tag => tag.type === 'close' && tag.name === 'a')

(this would actually need to account for depth, too)
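A depth-aware version of that slice could look roughly like the sketch below, again over a plain iterable. The tag shape is an assumption: only `type === 'close'` and `name` appear above, so `type: 'open'` and `type: 'text'` are hypothetical, as is the `sliceElements` name.

```javascript
// Yields one array per outermost <name>...</name> element, tracking nesting
// depth so that an inner close tag does not end the slice early.
function* sliceElements(tags, name) {
  let depth = 0;
  let current = [];
  for (const tag of tags) {
    const matches = tag.name === name;
    if (matches && tag.type === 'open') depth += 1;
    if (depth > 0) current.push(tag); // inside an element: collect everything
    if (matches && tag.type === 'close' && depth > 0) {
      depth -= 1;
      if (depth === 0) {              // outermost element just closed
        yield current;
        current = [];
      }
    }
  }
}

// Hypothetical tag stream: <div><a>text</a></div><a></a>
const tagList = [
  { type: 'open', name: 'div' },
  { type: 'open', name: 'a' },
  { type: 'text' },
  { type: 'close', name: 'a' },
  { type: 'close', name: 'div' },
  { type: 'open', name: 'a' },
  { type: 'close', name: 'a' },
];

const anchorSlices = [...sliceElements(tagList, 'a')];
console.log(anchorSlices.map(slice => slice.length)); // [ 3, 2 ]
```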

That seems like one way I could "slice" out a chunk of tags from the stream, and the resultant higher order stream may contain 0 or more such slices. Most of the methods seem to come close but not all the way for this kind of behavior, so I'm just trying to determine if it's me being bad at this style of programming, if it's not really usefully accomplished with this library, or if it's something that could be improved by examples in the docs or simple additions to the API.

I appreciate you taking the time to help me, too :) This library seems pretty awesome, so I'd love to make use of it and get a better handle on these concepts!

briancavalier commented 6 years ago

Hi @myndzi. For non-overlapping chunks, have a look at most-chunksOf. There are tricky bits around the end of the stream that it will handle as well.
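As an illustration of the end-of-stream issue, here is a plain JavaScript sketch of fixed-size, non-overlapping chunking over an iterable. It is not the most-chunksOf implementation, and flushing the short final chunk is just one plausible policy assumed for this sketch.

```javascript
// Groups items into arrays of `size` elements. When the input ends with a
// partially filled chunk, this sketch emits it rather than dropping it.
function* chunksOf(size, items) {
  // Because this is a generator, the check runs when iteration starts.
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError('size must be a positive integer');
  }
  let chunk = [];
  for (const item of items) {
    chunk.push(item);
    if (chunk.length === size) {
      yield chunk;
      chunk = [];
    }
  }
  if (chunk.length > 0) yield chunk; // flush the leftover items at the end
}

const grouped = [...chunksOf(2, ['a', 'b', 'c', 'd', 'e'])];
console.log(grouped); // [ [ 'a', 'b' ], [ 'c', 'd' ], [ 'e' ] ]
```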

On first read, your web link spidering ("cyclical subscriptions") description sounds more like a recursive function + higher order streams, but I'm not sure I understand the goal. Can you help me understand a bit more about it? Or maybe you could put together a simplified representative example that we can use for discussion and hacking? Feel free to ping us in gitter as well for more real-time discussion.

myndzi commented 6 years ago

After searching and digging, it seems like this kind of library is probably not the right pick for this task, at least not at my current level of understanding. Thanks for your responses, I'm sure I'll be back to this as a reference down the road!