[Feature request] Visualization of dependency graph

MilesCranmer commented 2 years ago

The main advantage of using FromFile.jl is that you have an explicit dependency structure.

I think it would be awesome if there was a way you could visualize this - and show which files import from which other files. I think the normal Julia import structure makes this sort of thing really hard/impossible, so this would be a good tool to stress the advantage of FromFile.jl!

This tool looks to be useful for such a purpose: https://github.com/caseykneale/Sherlock.jl - maybe it would already work, since each file appears as a module?

patrick-kidger commented 2 years ago

Yep, agreed that this sounds cool. Not 100% sure if it's in-scope here given the original intention that this serve as a demonstration of a hopefully-eventually future syntax for Julia itself. (Admittedly this now looks relatively unlikely; the rest of the community seems to prefer messing around with raw includes...)

This isn't something I'm likely to implement myself, but if someone implements it then we could either merge it in, or release it as a standalone package. I quite like the latter idea actually, if only to show the community that there is interest in this kind of thing!

MilesCranmer commented 2 years ago

Here is a bash command to generate this with mermaid-js - which can be rendered in GitHub markdown. This simple snippet assumes a flat file tree, and you run it from src/:

echo 'stateDiagram-v2'
for f in *.jl; do
    # Each line of the form `@from "Dep.jl" import ...` becomes an edge: Dep --> <this file>.
    grep -E '^@from ' "$f" \
        | sed -E 's/^@from "//; s/\.jl" import.*//' \
        | while IFS= read -r dep; do
            echo "$dep --> $(basename "$f" .jl)"
        done
done | sort

Here is the dependency tree for SymbolicRegression.jl:

stateDiagram-v2
CheckConstraints --> Mutate
CheckConstraints --> SimplifyEquation
CheckConstraints --> SymbolicRegression
ConstantOptimization --> SingleIteration
Core --> CheckConstraints
Core --> ConstantOptimization
Core --> EquationUtils
Core --> EvaluateEquation
Core --> EvaluateEquationDerivative
Core --> HallOfFame
Core --> InterfaceSymbolicUtils
Core --> LossFunctions
Core --> Mutate
Core --> MutationFunctions
Core --> PopMember
Core --> Population
Core --> Recorder
Core --> RegularizedEvolution
Core --> SimplifyEquation
Core --> SingleIteration
Core --> SymbolicRegression
Core --> Utils
Dataset --> Core
Equation --> Core
Equation --> Options
EquationUtils --> CheckConstraints
EquationUtils --> ConstantOptimization
EquationUtils --> EvaluateEquation
EquationUtils --> EvaluateEquationDerivative
EquationUtils --> HallOfFame
EquationUtils --> LossFunctions
EquationUtils --> Mutate
EquationUtils --> MutationFunctions
EquationUtils --> Population
EquationUtils --> SingleIteration
EquationUtils --> SymbolicRegression
EvaluateEquation --> EvaluateEquationDerivative
EvaluateEquation --> LossFunctions
EvaluateEquation --> SymbolicRegression
EvaluateEquationDerivative --> SymbolicRegression
HallOfFame --> SingleIteration
HallOfFame --> SymbolicRegression
InterfaceSymbolicUtils --> SymbolicRegression
LossFunctions --> ConstantOptimization
LossFunctions --> HallOfFame
LossFunctions --> Mutate
LossFunctions --> PopMember
LossFunctions --> Population
LossFunctions --> SymbolicRegression
Mutate --> RegularizedEvolution
MutationFunctions --> Mutate
MutationFunctions --> Population
MutationFunctions --> SymbolicRegression
Operators --> Core
Operators --> Options
Options --> Core
OptionsStruct --> Core
OptionsStruct --> Equation
OptionsStruct --> Options
PopMember --> ConstantOptimization
PopMember --> HallOfFame
PopMember --> Mutate
PopMember --> Population
PopMember --> RegularizedEvolution
PopMember --> SingleIteration
PopMember --> SymbolicRegression
Population --> RegularizedEvolution
Population --> SingleIteration
Population --> SymbolicRegression
ProgramConstants --> Core
ProgramConstants --> Dataset
ProgramConstants --> Equation
ProgressBars --> SymbolicRegression
Recorder --> Mutate
Recorder --> RegularizedEvolution
Recorder --> SingleIteration
Recorder --> SymbolicRegression
RegularizedEvolution --> SingleIteration
SimplifyEquation --> Mutate
SimplifyEquation --> SingleIteration
SimplifyEquation --> SymbolicRegression
SingleIteration --> SymbolicRegression
Utils --> ConstantOptimization
Utils --> EvaluateEquation
Utils --> EvaluateEquationDerivative
Utils --> InterfaceSymbolicUtils
Utils --> PopMember
Utils --> SimplifyEquation
Utils --> SingleIteration
Utils --> SymbolicRegression

which you hopefully see rendered as a directed graph! (This would be extremely difficult to parse from the regular include() hell)

patrick-kidger commented 2 years ago

Oh that's pretty cool! I like that.

(For completeness, this doesn't render in the GitHub app though 😀 - I had to switch to the web browser.)

danielsoutar commented 2 years ago

> (This would be extremely difficult to parse from the regular include() hell)

I was doing this manually myself the other day for my not-flat codebase and thought the same thing - FromFile makes this possible and include makes it an incredible pain. I think it would help a lot, particularly with codebases like mine where the developers were still figuring out how to do 'Julia best practices' when they first wrote it. I'd also like to double-check that I didn't make a mistake with my manual sweep 😅.

I'd be happy to give this a try as a separate package! Maybe named FromGraph?

danielsoutar commented 2 years ago

So before I embark on this, I'll write a problem statement so we're on the same page. Sorry for the dump! Just don't want to write something nobody is interested in/something incorrect.

FromGraph would not install Graphviz by default - this would be installed separately by the user. GraphViz.jl exists as a package, but it looks pretty dead and unreliable, and I don't actually need it in order to generate a graph representation. Also, you might have your own preference on how you display the graphs, maybe in an editor or in a browser. Built-in rendering would be a nice-to-have, but is not required for a V1 package.

The dependencies that FromGraph works with would be modules, not files. As a starting point I'd make the following assumptions (feel free to correct me if I missed something):

- The top-level file, or the entry-point into the Julia project, does not have to be a module.
- The project does not have to be a package.
- A file can declare at most one module. It must fully enclose the file's contents (whitespace/comments outside are permitted). Otherwise the file is either included (see next assumption) or is a top-level script.
- A module must start and end in the same file, but is allowed to include sub-files that are substituted into the module (effectively it is just one big module after substitutions). These included files would not be tracked as independent modules - any usage of @from inside sub-files is aggregated into their enclosing module. I would not track sub-modules.
  - As a corollary, this means a project only using includes would collapse into a single vertex. That is a feature, not a bug!
- A file need not be named the same as its module (if any).
- FromGraph would be given the root directory of a project (to verify that it is working with a Julia project), as well as the file to start from. Maybe I could infer one from the other, but I'll avoid assuming.
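
Since a file need not share a name with its module, the first step would be mapping each file to the module it declares (if any). A minimal sketch of that lookup, where `module_of` is a hypothetical helper name and the pattern assumes ASCII module names and a `module`/`baremodule` keyword at the start of a line:

```shell
# Hypothetical helper: print the name of the first module declared in a file.
# Prints nothing for files that declare no module (scripts, included sub-files).
# Assumption: module names are plain ASCII identifiers; real Julia allows Unicode.
module_of() {
    sed -n -E 's/^(bare)?module[[:space:]]+([A-Za-z_][A-Za-z0-9_!]*).*/\2/p' "$1" \
        | head -n 1
}
```

This does not check that the module fully encloses the file, so it only covers the naming part of the assumptions above.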

FromGraph would then yield a graph of the connections between modules, which can be converted either to a string in the DOT language or to a table of inward and outward connections for each module. Ideally you'd also have views, like a view of the dependency graph for just a particular module, or limiting the order of neighbours from a given module.
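
To make the table and the views concrete, here is a rough sketch that assumes the edges are already available as `A --> B` lines on stdin (the same shape as the mermaid listing earlier in this thread). The names `connection_table` and `neighbourhood` are made up for illustration and are not part of any existing package; the neighbourhood view treats edges as undirected when counting hops, which is one possible reading of "order of neighbours".

```shell
# Print one row per module: name, modules with an edge into it, modules it has
# an edge out to (tab-separated, "-" when there are none).
connection_table() {
    awk -F ' --> ' '
        NF == 2 {
            if ($1 in out) out[$1] = out[$1] "," $2; else out[$1] = $2
            if ($2 in inc) inc[$2] = inc[$2] "," $1; else inc[$2] = $1
            seen[$1] = 1; seen[$2] = 1
        }
        END {
            for (m in seen)
                printf "%s\t%s\t%s\n", m, ((m in inc) ? inc[m] : "-"), ((m in out) ? out[m] : "-")
        }
    ' | sort
}

# Print only the edges whose two endpoints are both within $2 hops of module $1.
neighbourhood() {
    awk -F ' --> ' -v root="$1" -v hops="$2" '
        NF == 2 { src[NR] = $1; dst[NR] = $2; n = NR }
        END {
            dist[root] = 0
            # Breadth-first expansion, one hop per pass over the edge list.
            for (d = 1; d <= hops; d++)
                for (i = 1; i <= n; i++) {
                    if (!(i in src)) continue
                    a = src[i]; b = dst[i]
                    if ((a in dist) && dist[a] == d - 1 && !(b in dist)) dist[b] = d
                    if ((b in dist) && dist[b] == d - 1 && !(a in dist)) dist[a] = d
                }
            for (i = 1; i <= n; i++)
                if ((i in src) && (src[i] in dist) && (dst[i] in dist))
                    print src[i] " --> " dst[i]
        }
    '
}
```

For example, `neighbourhood SingleIteration 1 < edges.txt` (with `edges.txt` being a hypothetical file holding the edge list) would keep only the edges among SingleIteration and its direct neighbours.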

Think this covers everything - I am allowing restricted usage of includes since my codebase at work uses them in that way and I'd like to benefit from this tool as well.

MilesCranmer commented 2 years ago

Sounds good to me. For export I would be in favor of outputting it in a user-specified format that could be visualized by either graphviz or mermaid.js. They are very nearly the same, though it looks like mermaid.js might use --> whereas graphviz uses ->: https://mermaid-js.github.io/mermaid/#/flowchart - so maybe a keyword could control that.
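
As a rough illustration of the keyword idea, assuming the edges arrive as `A --> B` lines on stdin: the function name `edges_to_graph` and the format names `mermaid` and `dot` are placeholders I made up, not an existing interface.

```shell
# Hypothetical helper: read "A --> B" edge lines on stdin and print them in the
# requested format ("mermaid" or "dot").
edges_to_graph() {
    case "$1" in
        mermaid)
            echo 'stateDiagram-v2'
            cat
            ;;
        dot)
            echo 'digraph dependencies {'
            # DOT uses -> for directed edges; node names are quoted to be safe.
            sed -E 's/^(.*) --> (.*)$/    "\1" -> "\2";/'
            echo '}'
            ;;
        *)
            echo "unknown format: $1" >&2
            return 1
            ;;
    esac
}
```

The output of the bash snippet above (minus its `stateDiagram-v2` header line) could be piped into `edges_to_graph dot` to get something Graphviz can read.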

MilesCranmer commented 2 years ago

Also, Documenter.jl seems to be open to having mermaid.js support, so you could potentially have dependency graphs be auto-generated inside documentation.

Edit: oh, looks like they already have it working: https://github.com/JuliaDocs/Documenter.jl/pull/1648#issuecomment-1264522384

Roger-luo / FromFile.jl

[Feature request] Visualization of dependency graph #37