fsprojects / FSharp.Data.Adaptive

On-demand adaptive/incremental data for F# https://fsprojects.github.io/FSharp.Data.Adaptive/
MIT License
250 stars 24 forks source link

Discussion: nested adaptive values #93

Closed iamim closed 2 years ago

iamim commented 3 years ago

I previously asked open questions like this one in the Discord channel (thank you all for helping me out :) ) but I feel like placing it here might help other people too.

I'm interested in distributing the results of an incremental computation from the server to the client. To give a concrete example, suppose that the output of a computation is a table with a dynamic number of columns represented by the data structure HashMap<rowkey, HashMap<colkey, double>>. Now to broadcast it to the client I could diff the rows efficiently since the outer map will be an amap and I get efficiently calculated deltas that are a part of computation for free. However, diffing the inner map is more tricky. It feels somewhat wasteful to diff 2 HashMap to get a HashMapDelta. Ideally, these deltas were already computed and I could simply extract them.

How to structure such a computation is also not entirely clear to me. I could not find a solution for how to represent an amap<'a, amap<'b, 'c>>. If I understand it correctly, it's not possible since the only ElementOperations available to us are Set and Remove.

How do you approach such a structure? Going back to the example above, it's clear that column and row indices could be merged into a tuple and everything would be fine. But in general, it may not always be achievable or natural for the problem.

krauthaufen commented 3 years ago

What's the problem with the nested amap approach? The outer map will change whenever a row is addes and the inner one only when cols are added? I don't see the problem why this is bad? Btw you don't have to diff anything when properly feeding your changes into a cmap

krauthaufen commented 3 years ago

Btw these the Set|Remove operations together with old and new state can be used to derive virtually all sorts of changes like Rename|Insert|etc.

I used that in one of our projects so if you want I can try to dig up the code

iamim commented 3 years ago

What's the problem with the nested amap approach? The outer map will change whenever a row is addes and the inner one only when cols are added? I don't see the problem why this is bad? Btw you don't have to diff anything when properly feeding your changes into a cmap

I see what you mean. I think I had this question somewhere in the beginning of my experience with the library. I then noticed that cmap of cmap didn't drigger update if the inner is changed but I guess that's the expected behviour and will work fine as long as the inner adaptive values are accessed with an A variant of functions, such as mapA. I'll have a look again in at my code. Thank you :)

Btw these the Set|Remove operations together with old and new state can be used to derive virtually all sorts of changes like Rename|Insert|etc.

I used that in one of our projects so if you want I can try to dig up the code

I thought about it a little bit too. It would be nice to send one Insert instead of updates to all of the items in the list below the inserted index. Let me try to look at it more when I get time before you spend yours digging up the code. Still not 100% sure how to work with Index.

iamim commented 3 years ago

Hi again,

I did a little bit more thinking about the question. I think it boils down to the serialization of nested adaptive types.

To make it a bit more concrete, I'm currently trying to implement a virtual table similar to that demoed by Jane Street in incr_dom but the virtualization part performed on the backend. In short, instead of transmitting the full table, a slice of it currently observed is transmitted. I modeled such a slice as amap<rowkey, aval<position_idx> * amap<colkey, string>>. I did not go for an alist to easily convey that a row changed its position with just one int being transmitted to the frontend; only changed columns are transmitted.

Unfortunately, it's not easy to leave the world of Adaptive with such a structure. I could (and I did for the time being) squash the nestedness to amap<rowkey, position_idx * HashMap<colkey, string>> but that either requires resend of all values if any has changed for a givenrowkey or manually diffing to find the changes. It's not the end of the world of course, but it feels like a huge waste knowing how wonderful this library is for composing and propagating deltas.

I was wondering if you have experience of doing something similar to that. I remember seeing your demo where a 3D scene state (a sphere with some dots on it) was broadcasted over the wire. Could you point me to something here?

Ultimately, I have a feeling that it could be such a wonderful way of remoting to efficiently get some streaming state. Then, given a decoder, the client could receive updates in the most efficient way. Different strategies (push all vs. push-pull) could also be used based on the bandwidth vs. latency tradeoff with backpressure out of the box. Further, the client could feed the output into its own adaptive computation graph, effectively staging a given adaptive computation over several nodes.

It may be that something akin to FsPickler could be useful here if it could understand Adaptive and construct an instance of AdaptiveReader depending on the provided (nested) AdaptiveObject instance. (I'm not sure I'm correct here, please feel free to correct; also, I'm familiar with FsPickler and pickling/pickling combinators on the very surface).