What should Joe work on?

jdhenke commented 11 years ago

After some thought, here are my semi-filtered ideas.

DISCLAIMER: I recognize much of this stuff as not great, but perhaps it will inspire me or you guys now or later, so I put it all down.

Static Stuff

Graph Visualization

Here are the options as I see them.

Display via text
Display via GJS's recommended propagator code
Display via Pavel's recommended dot library
Display via D3 (this uses MIT license I believe)

Further Aggregation of Code

If we could group the code base manually or automatically by file or some other heuristic, could we find the dependencies between groups?
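
A minimal sketch of the grouping idea, in Python purely for illustration. It assumes we already have function-level call edges and a mapping from each function to its group (here, a file). The `group_dependencies` helper, the function names, and the file names are all hypothetical.

```python
from collections import defaultdict


def group_dependencies(edges, group_of):
    """Collapse function-level (caller, callee) edges into group-level edges."""
    deps = defaultdict(set)
    for caller, callee in edges:
        src, dst = group_of[caller], group_of[callee]
        if src != dst:  # calls that stay inside one group are not dependencies
            deps[src].add(dst)
    return {group: sorted(targets) for group, targets in deps.items()}


# Hypothetical call graph and file assignment, made up for this example.
edges = [("main", "parse"), ("main", "eval"), ("eval", "apply")]
group_of = {
    "main": "main.scm",
    "parse": "reader.scm",
    "eval": "interp.scm",
    "apply": "interp.scm",
}
file_deps = group_dependencies(edges, group_of)
```

Here `file_deps` maps each group to the groups it depends on; groups that only call into themselves do not appear as keys.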

Could we detect conflicts between packages?

Heuristics

Moving further into the analysis, could we identify poor practices via some divined heuristics? (Very open-ended, and hard to say "OK, I'm done.")

Simplify Graph

Much of the concern with displaying a graph is the complexity. Perhaps I could work on simplifying the graph display. For instance, I could work on displaying one function and its relevant neighbors. One hop? Two hops?
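
The one-hop/two-hop idea could be sketched as a bounded breadth-first search, in Python purely for illustration. This assumes "neighbors" means both callers and callees, which is a guess at the intent; the `neighborhood` helper and the example edges are hypothetical.

```python
from collections import defaultdict, deque


def neighborhood(edges, start, hops):
    """Return every node within `hops` edges of `start`, ignoring direction."""
    adjacent = defaultdict(set)
    for caller, callee in edges:
        adjacent[caller].add(callee)
        adjacent[callee].add(caller)

    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distance[node] == hops:
            continue  # do not expand past the hop limit
        for nxt in adjacent[node]:
            if nxt not in distance:
                distance[nxt] = distance[node] + 1
                queue.append(nxt)
    return set(distance)


# Hypothetical call graph, made up for this example.
edges = [("main", "parse"), ("main", "eval"), ("eval", "apply"), ("apply", "lookup")]
one_hop = neighborhood(edges, "eval", 1)
two_hop = neighborhood(edges, "eval", 2)
```

Only the nodes in the returned set (and the edges among them) would need to be drawn, which keeps the display small even when the full graph is large.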

Type Checking

Could build on Pavel's code, so wouldn't be starting from scratch.

Provenance Tracking

This would be a very large undertaking but could be very cool. It is not well defined at this point, in that many design decisions would still need to be made, but think of it as tracking the flow of information from variables. For instance, annotate some external primitives beforehand, run our code, and see which data was touched by information flowing from which source. Limiting the sources in this graph would be good, I think.

Will require many changes at all levels most likely, but relying on well defined interfaces, I think I could work on it now.
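
To make the idea concrete, here is one possible shape for it, sketched in Python purely for illustration: values carry a set of source labels, and every operation unions the labels of its operands. The `Tracked` class and the source names are hypothetical, and a real design would have to cover far more than two arithmetic operators.

```python
import operator


class Tracked:
    """A value paired with the set of sources that influenced it."""

    def __init__(self, value, sources):
        self.value = value
        self.sources = frozenset(sources)

    def _combine(self, other, op):
        if isinstance(other, Tracked):
            # The result is influenced by the sources of both operands.
            return Tracked(op(self.value, other.value), self.sources | other.sources)
        # Plain constants contribute no provenance of their own.
        return Tracked(op(self.value, other), self.sources)

    def __add__(self, other):
        return self._combine(other, operator.add)

    def __mul__(self, other):
        return self._combine(other, operator.mul)


# Hypothetical annotated primitives, made up for this example.
user_input = Tracked(3, {"read-line"})
config = Tracked(4, {"config-file"})
total = user_input * 2 + config
```

After this runs, `total.sources` names every annotated source that fed into `total`, which is the kind of question the tracking is meant to answer.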

Dynamic Stuff

Jumping into runtime, I had some ideas.

Log Inputs and Outputs

accrue input and output calls
construct actual traversal in graph
log all input and outputs
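
A minimal sketch of the logging step, in Python purely for illustration. It wraps functions so that each call's arguments and result are appended to a log. The `logged` decorator, `call_log`, and the example functions are hypothetical.

```python
import functools

call_log = []  # one (name, args, kwargs, result) tuple per completed call


def logged(fn):
    """Record the inputs and output of every call to `fn`."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        result = fn(*args, **kwargs)
        call_log.append((fn.__name__, args, kwargs, result))
        return result
    return wrapper


@logged
def square(x):
    return x * x


@logged
def sum_of_squares(a, b):
    return square(a) + square(b)


answer = sum_of_squares(2, 3)
```

Entries are appended when a call returns, so inner calls appear in the log before the call that made them.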

Create interface to interact with all the above?

Documentation

Seems lame, but will have to be done at some point. Do you think we'll need to document each subsystem separately? Sussman might like that. I could develop a quick API doc for my graph stuff. I could also develop one for @oderby's cfg API as well. Outline for our entire system?

Logistical note: I'd be very happy to use Github's Wiki functionality for documentation purposes. If you hadn't noticed, GFM isn't my least favorite thing in the world.

oderby commented 11 years ago

My immediate/gut response is that the dynamic aspect seems the next logical step if we're happy with how our static cfg analysis tool is shaping up and we're happy to just hack visuals. It could be as intimate as passing the statically constructed cfg along with the code to an otherwise-normal eval/apply interpreter, with edges added to the cfg corresponding to actual calls. And that could maybe transition further into profiling/detecting unreachable code?
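
A rough sketch of that idea, in Python purely for illustration rather than inside our interpreter: a recorder holds the static edges, notes each caller/callee pair it actually sees, and reports static edges that were never taken. The `CallRecorder` class, the `<toplevel>` label, and the example functions are hypothetical.

```python
import functools


class CallRecorder:
    """Compare statically predicted call edges with edges seen at runtime."""

    def __init__(self, static_edges):
        self.static_edges = set(static_edges)
        self.observed_edges = set()
        self._stack = ["<toplevel>"]  # names of the calls currently in progress

    def track(self, fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            self.observed_edges.add((self._stack[-1], fn.__name__))
            self._stack.append(fn.__name__)
            try:
                return fn(*args, **kwargs)
            finally:
                self._stack.pop()
        return wrapper

    def never_taken(self):
        return self.static_edges - self.observed_edges


recorder = CallRecorder(
    {("<toplevel>", "run"), ("run", "fast_path"), ("run", "slow_path")}
)


@recorder.track
def fast_path(x):
    return x + 1


@recorder.track
def slow_path(x):
    return x - 1


@recorder.track
def run(x):
    return fast_path(x) if x >= 0 else slow_path(x)


result = run(5)
untaken = recorder.never_taken()
```

An edge that is never taken in one run is only a candidate for unreachable code; a different input could still exercise it.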

Otherwise, I don't think I understand how/what "further aggregation of code" or "heuristics" would be useful/add anything substantial - they seem a little tangential? simplifying the graph seems like a good part of visualization, if we were to do that. type checking could be cool, but might require as many changes as provenance tracking? With regards to inferring types? Or were you assuming the code declared types?

I think we need to do documentation regardless, and might as well wait until our code/project is a bit more stable.

I vote for dynamic extension, type checking, or visualization, in that order ;)

jdhenke commented 11 years ago

The dynamic extension sounds good to me.

I see two different aspects of the problem.

Modifying the apply part of our eval/apply loop to capture this information
- This seems very close to what @dsherry is doing.
Creating the infrastructure to manage the data presuming it is able to be captured
- This seems closer to what @oderby and I have been working on, and is what I would prefer to do.

Does this distinction make sense to you guys? How do you feel about me tackling the second part?

oderby commented 11 years ago

That sounds fine to me - I'd say take a stab at defining the second part and running with it, then we can see where you end up.

jdhenke commented 11 years ago

Cool. On it.

jdhenke / introspect