mgedmin / objgraph

Visually explore Python object graphs
http://mg.pov.lt/objgraph/
MIT License

API redesign #5

Open mgedmin opened 10 years ago

mgedmin commented 10 years ago

This is a long-term wishlist idea.

The API of objgraph grew organically during manual debugging sessions. As a result, there are plenty of ad-hoc function arguments that need to be passed from function to function. Also, common tasks like "find me a chain of references from a module to this particular object, then display it" need to be spelled in cumbersome ways.

It'd be nice to come up with a better API.

mgedmin commented 10 years ago

Another use case: debugging an ncurses-based application (which means one can't use print), see #8.

mgedmin commented 9 years ago

My vague dream was to have a Graph object.

>>> g = objgraph.backref_graph(source_node, max_depth=3)
>>> g.show()  # spawn xdot if in $PATH or generate png and spawn a picture viewer
>>> g.save('filename.dot')  # write dot
>>> g.save('filename.png')  # write temporary dot, run graphviz converter to get png
>>> g.as_dot()              # return the .dot as a string

I also wanted to make an interactive graph viewer (based on xdot) that could be manipulated at runtime, e.g. to dynamically hide bits you're not interested in or expand bits you want to explore more. I thought about forking/extending xdot so I could do this by right-clicking graph nodes and having it do a nice animation from the old graph to the new graph (animation since new graph's layout might be radically different from the old layout due to new nodes and edges showing up).

API-wise it might look something like

>>> g.expand('o12345', max_depth=2)  # find node with id() 12345, expand it
>>> g.collapse('o3245') # find node with id() 3245, hide all nodes reachable from it unless they're reachable through alternative paths

This whole idea might be unimplementable, because I'd have to duplicate the existing object graph, and also introduce a bunch of new references.

mgedmin commented 9 years ago

Another idea would be to have a Graph object that doesn't store a graph, but instead computes it on the fly.

The API might be the same: it would cache the generated dot source and drop the cache if you do something like expand/collapse etc. that would change the graph.

mgedmin commented 9 years ago

> This whole idea might be unimplementable, because I'd have to duplicate the existing object graph, and also introduce a bunch of new references

Hm, maybe there would be no problems if I don't store references to objects, just object IDs...

(BTW I'm sometimes worried about the reuse of IDs, in case other threads kick in and start freeing/creating them.)
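A minimal sketch of the "store IDs, not references" idea, assuming only the standard `gc` module. The helper name `resolve_ids` is hypothetical and not part of objgraph. It looks objects up again by scanning `gc.get_objects()`, so it only finds objects tracked by the garbage collector, and it cannot tell whether an ID has been reused by a different object in the meantime.

```python
import gc


def resolve_ids(wanted_ids):
    """Map each recorded id to the live object that currently has it.

    Only gc-tracked objects can be found this way, and an id may have been
    reused since it was recorded, so treat the result as a best guess.
    """
    wanted = set(wanted_ids)
    found = {}
    for obj in gc.get_objects():
        if id(obj) in wanted:
            found[id(obj)] = obj
    return found


class Node:
    """Stand-in for an application object we want to keep track of."""


node = Node()
recorded = {id(node)}         # the graph would keep only this integer
live = resolve_ids(recorded)  # look the object up again when needed
```

The scan is linear in the number of tracked objects, so a real implementation would probably want to batch lookups rather than resolve one ID at a time.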

mgedmin commented 9 years ago

Oh and to make this whole exercise harder, I'd like to keep backwards-compatibility.

def show_backrefs(objs, ...):
    g = backref_graph(objs, max_depth, extra_ignore, filter, too_many, highlight, extra_info, refcounts, shortnames)
    if filename:
        g.save(filename)
    elif output:
        output.write(g.as_dot())
    else:
        g.show()

mgedmin commented 9 years ago

I think a Graph should be a generic thing (ohdear, am I overengineering), with a bunch of properties you can set.

>>> g = Graph()
>>> g.edge_func = gc.get_referents  # we're making a forward-ref graph
>>> g.edge_func = gc.get_referrers  # no, we're making a backref graph
>>> g.swap_source_target = True
>>> g.cull_func = is_proper_module
>>> g.max_depth = 3
>>> g.roots = [obj1, obj2, obj3]
>>> g.shortnames = True
>>> g.refcounts = True
>>> g.extra_ignore.add(id(obj4))

I'm tempted to add _func to all parameters that take functions:

>>> g.highlight_func = lambda x: isinstance(x, MySpecialClass)
>>> g.filter_func = lambda x: id(x) in show_only_these
>>> g.extra_info_func = lambda x: hex(id(x))

The constructor would obviously allow setting any of the above.

>>> g = Graph(edge_func=gc.get_referents, max_depth=5)

Normally you wouldn't instantiate a Graph directly but use one of the helpers

>>> g = backref_graph(obj1, max_depth=5)
>>> g = ref_graph(obj2, too_many=20)
>>> g = backref_chain(obj3)  # stops at is_proper_module by default

The __repr__ would say something like

>>> backref_graph(obj1)
<Graph: 243 nodes>
>>> _.show()

mgedmin commented 9 years ago

Caching:

>>> g = backref_graph(obj1)
>>> g.show()
>>> g.save('filename.png')
>>> g.as_dot()

should only do the (expensive) graph traversal once, then reuse it.

>>> g = backref_graph(obj1)
>>> g.show()
>>> g.max_depth += 1
>>> g.show()

should discard the cached graph when it notices I'm changing max_depth (or any of the other parameters).

Perhaps there should be an explicit recompute():

>>> g = backref_graph(obj1)
>>> g.show()
>>> g.recompute()
>>> g.show()

in case we want to see changes made by other threads.
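The caching behaviour described above could be sketched roughly as follows. This is an illustration under stated assumptions, not objgraph code: the class layout, the `_PARAMS` set, the `edges()` method and the `traversals` counter are all hypothetical, and for simplicity the sketch keeps strong references to the roots while storing only object IDs in the cached edge list.

```python
import gc


class Graph:
    """Hypothetical sketch: lazily computed graph with cache invalidation."""

    # Attributes whose change makes a previously computed graph stale.
    _PARAMS = frozenset({"roots", "edge_func", "max_depth"})

    def __init__(self, roots, edge_func=gc.get_referents, max_depth=3):
        self._edges = None   # cached list of (source_id, target_id) pairs
        self.traversals = 0  # counts expensive traversals, for illustration
        self.roots = list(roots)
        self.edge_func = edge_func
        self.max_depth = max_depth

    def __setattr__(self, name, value):
        if name in self._PARAMS:
            # Changing any traversal parameter drops the cached graph.
            object.__setattr__(self, "_edges", None)
        object.__setattr__(self, name, value)

    def recompute(self):
        """Explicitly discard the cache, e.g. after other threads ran."""
        self._edges = None

    def edges(self):
        if self._edges is None:
            self._edges = self._traverse()
        return self._edges

    def _traverse(self):
        # Breadth-first walk from the roots, at most max_depth levels deep.
        self.traversals += 1
        seen = {id(obj) for obj in self.roots}
        edges = []
        queue = [(obj, 0) for obj in self.roots]
        while queue:
            obj, depth = queue.pop(0)
            if depth >= self.max_depth:
                continue
            for target in self.edge_func(obj):
                edges.append((id(obj), id(target)))
                if id(target) not in seen:
                    seen.add(id(target))
                    queue.append((target, depth + 1))
        return edges

    def as_dot(self):
        lines = ["digraph ObjectGraph {"]
        lines += ["  o%d -> o%d;" % edge for edge in self.edges()]
        lines.append("}")
        return "\n".join(lines)


inner = [1, 2]
outer = [inner]
g = Graph([outer], max_depth=1)
g.as_dot()
g.as_dot()        # served from the cache, no second traversal
g.max_depth += 1  # assigning a parameter invalidates the cache
g.as_dot()        # traverses again
```

Hooking `__setattr__` keeps the property-style API from the earlier comment. It does not notice in-place mutation such as `g.roots.append(obj)`, which is one more reason an explicit `recompute()` seems useful.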

pcostell commented 9 years ago

I like the idea of a graph object, but I think it might help simplify the API if displaying the graph were distinct from creating it.

In particular: highlight, filename, extra_info, refcounts, shortnames, swap_source_target, and output are all irrelevant to actually creating the graph.

Additionally, I think it could be further simplified by combining filter and extra_ignore (extra_ignore is just a filter that ignores those objects). There could be convenience methods for creating a filter from extra_ignore, but as a separate parameter it adds extra complexity to the API itself.
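To illustrate the split between building and displaying, here is a rough sketch of a rendering step that only sees already-collected graph data. Everything in it is an assumption made for illustration: the name `render_dot`, the `(source_id, target_id)` edge format, and the choice to pass node IDs (rather than objects) to `highlight` and `extra_info`. Labels are not escaped, so it would need more care with real object reprs.

```python
def render_dot(edges, labels, highlight=None, extra_info=None):
    """Render already-collected graph data as dot source.

    edges is an iterable of (source_id, target_id) pairs and labels maps
    each node id to its display text. highlight and extra_info are optional
    callables taking a node id; they affect presentation only.
    """
    lines = ["digraph ObjectGraph {"]
    for node_id, label in sorted(labels.items()):
        if extra_info is not None:
            # "\\n" emits a literal backslash-n, which dot reads as a line break.
            label = "%s\\n%s" % (label, extra_info(node_id))
        attrs = 'label="%s"' % label
        if highlight is not None and highlight(node_id):
            attrs += ", style=filled"
        lines.append("  o%d [%s];" % (node_id, attrs))
    for source, target in edges:
        lines.append("  o%d -> o%d;" % (source, target))
    lines.append("}")
    return "\n".join(lines)


edges = [(1, 2)]
labels = {1: "list", 2: "dict"}
plain = render_dot(edges, labels)
marked = render_dot(edges, labels, highlight=lambda node_id: node_id == 2)
```

With this shape, changing a display option never requires another traversal, since the same `edges` and `labels` can be rendered any number of times.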

mgedmin commented 9 years ago

> I think it could be further simplified by combining filter and extra_ignore

Yes: extra_ignore is just premature optimization. I wanted to avoid the cost of a function call. Besides, I was already using an internal ignore set to avoid the graph traversal machinery polluting the resulting graph, and adding a few extra object IDs to it fell out as a free feature.
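A convenience helper that turns an ignore list into a filter might look like the sketch below. The names `ignore_ids` and `all_of` are hypothetical, and the sketch assumes a filter is a one-argument predicate that returns True for objects that should be kept. Only IDs are stored, so the helper adds no new references to the ignored objects, at the cost of the ID-reuse caveat mentioned earlier in the thread.

```python
def ignore_ids(*objs):
    """Build a filter that rejects the given objects by identity."""
    ignored = frozenset(id(obj) for obj in objs)

    def keep(obj):
        return id(obj) not in ignored

    return keep


def all_of(*filters):
    """Combine filters: an object is kept only if every filter keeps it."""

    def keep(obj):
        return all(f(obj) for f in filters)

    return keep


noise = []
wanted = {}
keep = all_of(ignore_ids(noise), lambda obj: not isinstance(obj, str))
```

Here `keep(wanted)` is true, while `keep(noise)` and `keep("text")` are false, so a single filter parameter covers both use cases.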

atttx123 commented 9 years ago

How about ASCII art, like this:

--------     --------
| list | --> | dict |
--------     --------

mgedmin commented 9 years ago

@atttx123: not on my roadmap.

Aside: I once tried to do ASCII-art based graph layout for adventure games (when I got stuck in Myst 4). It's hard.