Allow exporting of the DAG datastructure to some graphing program

kneasle commented 3 years ago

This would allow for better visualisation and debugging of the inner workings of the editable_tree::DAG data structure.

It would also be good to help new developers understand how Sapling stores nodes.

stokhos commented 3 years ago

I'm thinking of working on this one. Do you know any good rust graph visualization lib?

kneasle commented 3 years ago

I was planning on exporting to the dot file format (I used it for the tree visualisation in the README - see the input file and the output image).

It's basically a file where you specify relations between tree nodes to form a DAG, and it renders it for you. Like:

digraph DAG {
    root1 -> child1, child2, child3;
    root2 -> child1, child4, child3;
}

would make a DAG where child1 and child3 are shared between the two roots.

You have to give each node a unique name (which is a pain), but you should be able to specify non-unique display names. Then you can just call all the nodes node1, node2, node3, etc. and specify their display names to be true, array/[], etc.

stokhos commented 3 years ago

OK, I will start working on this one after you big refactoration.

kneasle commented 3 years ago

OK, I will start working on this one after you big refactoration.

I think it'll be fine to do this whilst I'm refactoring - we might get a few merge conflicts but they'll be fairly minimal and easily resolved (just don't dig too deep into the ast code, cos it's about to change :wink:).

stokhos commented 3 years ago

Initial tree

[
true,
false,
["str", "str", {"key1":"value1", "key2":"value2"}, true]
]

After edit

[
true,
false,
["str", "str", {"key1":"value1", "key2":"true"}, true]
]

You think this is ok?

I feel after some edits on the original tree, the dag graph will super complicated.

kneasle commented 3 years ago

You think this is ok?

Heck yes! This looks awesome! It would be even cooler if you could also label the edges with their sibling index (since indices matter and dot chooses which the order of the nodes arbitrarily). See this question on stackoverflow.

I guess this is generated by hand? If it isn't, then I think fields should be their own nodes, since they could have quite big trees as their children. If it is made by hand, then this should just happen when the code is written.

I feel after some edits on the original tree, the dag graph will super complicated.

Yeah, I also think this will be the case. But any way to see visually how the code is working is infinitely better than nothing. Regardless, it will be good for wowing people with :grin:.

stokhos commented 3 years ago

I guess this is generated by hand? If it isn't, then I think fields should be their own nodes, since they could have quite big trees as their children. If it is made by hand, then this should just happen when the code is written.

Yes this was by hand. to help me understand how sapling updates the tree. I have to say this is very helpful.

stokhos commented 3 years ago

Is there away to access items in arena by index? I can draw each tree in root_history separately, but I can't combine them into a DAG.

For two Json::True from different two Dag history, I can' tell if they ashould be one node shared by two trees in dot graph or they are two nodes just of same type.

digraph G {
root1 -> True, Null,
root2 -> True, True,}

The first True root1 and root2 is one node that is shared by the both trees. The second True in root2 is a different node, but in current implementation, given a node from a tree, i feel it's hard to define a 1 to 1 relationship with node in dot graph. This is why can't connect ttrees in to a dag graph.

I tried to add an id in Item

#[derive(Debug, Clone)]
pub struct Item<T> {
    node: T,
    id: usize,
}

But this approach would result the change in arena.alloc:

    pub fn alloc(&self, node: T) -> &Item<T>{
        let id = self.base_arena.len();
        &self.base_arena.alloc(Item::new(node, id))
    }

I'd like to discuss this with you before I invest too much time on this.

kneasle commented 3 years ago

Two nodes are 'the same' if they have the same address in memory, and since we only ever reference nodes through &'arena Node we can use std::ptr::eq to test whether two nodes are 'the same'.

Also, you can tell dot to give a given node a specific (non-unique) label, like:

digraph g {
    root1 [label = "array"];
    root2 [label = "array"];
    true1 [label = "true"];
    true2 [label = "true"];
    null [label = "null"];

    root1 -> true1, null;
    root2 -> true1, true2;
}

What I'd probably do in this situation is to name all the dot nodes by the memory address of their Ast equivalent (you can convert a pointer to a usize representing the address with node as *const Node as usize) - so a node at address 142356 would be called node142356. Then the file would start with a load of label definitions to give names to the memory addresses (like node142356 [label = "array"];), and then a big list of the nodes and their children (like node142356 -> node314662, node456123;). I don't know if this strategy is the best one, but I think it'd probably work...

What do you think?

stokhos commented 3 years ago

I think this is what I was looking for.

kneasle commented 3 years ago

Cool! It's a bit of a convoluted way of doing it, but it should still work even if we change how nodes are stored in memory.

stokhos commented 3 years ago

How can I read .json file to sapling? Tried with a json file, but I got an error saying: not implemented.

kneasle commented 3 years ago

Hmmm did your JSON file have numbers in it? Cos Sapling currently can't load numbers and so it panics when parsing them.

stokhos commented 3 years ago

Yes. There is a number in it. I have changed to str

stokhos commented 3 years ago

How are we going to use this function? After each edit, function will be called once, or it will be only called before quit?
I have made it a method in Dag, and currently the digraph is sent to log file.

kneasle commented 3 years ago

I was thinking that once command mode is implemented (which you're welcome to do by the way), we could have a command like :dag-export <file>, which would dump the dot code to a specified file.

I don't think we should try to compile it to a png, because that would require adding dot as a dependency for a feature that is only likely to be used for debugging.

stokhos commented 3 years ago

I'd like to implement the command mode.

kneasle commented 3 years ago

Cool! I'll make an issue this evening and assign you to it.

stokhos commented 3 years ago

Can you close this issue, it has been fixed in PR #87.

kneasle commented 3 years ago

Oh snap! I hadn't realised that this was still open. I'll close now.

Also sorry if I'm slow replying to things - I've been having to backburner Sapling a little bit because I need to work on other projects. I'll still review stuff and I haven't forgotten about you or Sapling :grin:. Just thought I'd give you a heads-up.

stokhos commented 3 years ago

Oh snap! I hadn't realised that this was still open. I'll close now.

Also sorry if I'm slow replying to things - I've been having to backburner Sapling a little bit because I need to work on other projects. I'll still review stuff and I haven't forgotten about you or Sapling grin. Just thought I'd give you a heads-up.

No worries! I have been super busy in the past weeks. I'm also need to working on my dissertation. But I will contribute when I have time.

kneasle commented 3 years ago

No worries! I have been super busy in the past weeks. I'm also need to working on my dissertation. But I will contribute when I have time.

Aha sounds like the timings have worked out pretty well! What's your dissertation on?

stokhos commented 3 years ago

My field of interest is monetary economics, cryptocurrency, and high frequency trading.

kneasle commented 3 years ago

Oh wow that's pretty cool... so if all goes well then you'll soon be Dr Stokhos then :P.

stokhos commented 3 years ago

Yes. I will get my degree really soon, I forgot to mention that my dissertation topic is arbitrage in cryptocurrency market. The other one is portfolio selection and deep learning. What about you?

kneasle commented 3 years ago

Whoa that sounds super complicated :sweat_smile:! I'm a mere undergrad, so I haven't written any papers at all (yet..?). I'm not sure about whether or not to do a PhD - thing is I feel covid cheated me out of a year of my student experience, so I'd quite like to get it back :laughing:.

stokhos commented 3 years ago

My suggestion would be go to graduate school. Work under David Silver or Yarin Gal!!!!

kneasle / sapling

Allow exporting of the DAG datastructure to some graphing program #13