Open alexbeattie42 opened 1 month ago
Hi @alexbeattie42, thanks for posting this issue. I agree that the dependency tree is quite complicated and no doubt OpenSim has incurred lots of technical debt over the years. I would like to eventually explore better solutions, but this would likely be a large development undertaking (as you suggest).
Happy to start a discussion here or somewhere else if that's more convenient.
I think that there are many paths that could be taken to help move things in a positive direction. The approaches I've found most successful in the past have involved selecting specific sub-components which have a very broad reach (like "SimTKCommon") and creating small actionable tasks for the sub-system.
For example, a goal could be to make the dependencies strictly hierarchical (which has many benefits). This can then be further broken down into analyzing what links currently exist (as shown by the tree) and determining the amount of effort required to sever the non-hierarchical links. Sometimes it is trivial and sometimes it is very difficult. Then the refactoring code can be written, tested, and merged.
I've also done a similar analysis and dependency graph for the code base itself (shown below), although it is very difficult to derive useful information from it in its current state. When I have some more time, I can look at some different strategies for reducing the amount of information (maybe only looking at sub-components). The similarly colored components are strongly connected (meaning they are in a circular dependency loop). Breaking these loops would yield many benefits. Circular dependencies can significantly increase compile and rebuild time so one immediate benefit would be that builds would be much faster and more predictable.
I'm happy to help work on some "chunks" if we can identify areas or sub-modules that would be suitable candidates to begin refactoring.
@alexbeattie42 Nice job putting our code base through the tools and publishing the graph. If possible it would be great to remove the executables including the examples out of the graph as these are more of use cases rather than part of the API/library. Hopefully with that reduced graph we can better see the dependencies and at least list these circular dependencies to start.
This is what the dependencies look like with -DBUILD_EXAMPLES=off
. I am also building at the moment with Intel MKL and all the BLAS and OpenMP libs have been swapped out to their Intel versions
I have created a repository with my analysis code and results.
I have begun analyzing the OpenSim/Common
folder of the project. The following is what the dependencies (#include
statements) looks like. All components in green are strongly connected (cyclic).
This is the transitive reduction of the graph to isolate the shortest loops:
Exception -> Object -> Exception
IO -> Logger -> IO
RegisterTypes_osimCommon -> osimCommonDLL -> RegisterTypes_osimCommon
Storage -> GCVSplineSet -> Storage
Storage -> TableUtilities -> Storage
SimmSpline -> XYFunctionInterface -> SimmSpline
GCVSpline -> XYFunctionInterface -> GCVSpline
PiecewiseLinearFunction -> XYFunctionInterface -> PiecewiseLinearFunction
XYFunctionInterface -> PiecewiseConstantFunction -> XYFunctionInterface
Adapters -> DataAdapter -> Adapters
Component -> ComponentList -> Component
Component -> ComponentOutput -> Component
Component -> ComponentSocket -> Component
Property -> Object -> Property
XMLDocument -> Object -> XMLDocument
STOFileAdapter -> DelimFileAdapter -> FileAdapter -> STOFileAdapter
Property_Deprecated -> AbstractProperty -> Object -> Property_Deprecated
I have also calculated a full list of all cycles but it is quite lengthy.
I manually verified the IO -> Logger -> IO
loop by checking that IO.cpp
includes Logger.h
and Logger.cpp
includes IO.h
.
I also manually verified the Storage -> GCVSplineSet -> Storage
loop by checking that Storage.cpp
includes GCVSplineSet.h
and GCVSplineSet.cpp
includes Storage.h
.
This seems like a much more manageable place to start than the full project graph. I can also create the same types of visualizations for other sub-folders and have thought of looking at Tools
or Simulation
next.
The current full dependency graph (with Custom Targets) for the OpenSim Core CMake Project (using CMakeGraphViz) looks like this: Dependency Graph (without Custom Targets):
These complex interconnections make it very difficult for someone who is not intimately familiar with the project to make meaningful contributions. It also introduces difficulty in fully understanding what is happening during a given simulation which is critical in creating simulations that accurately and precisely model the target system.
I understand the difficulties of managing complex codebases and know that change isn't possible overnight. I hope this can facilitate a discussion on what changes could be enacted to the project structure that would improve and streamline the developer experience.
[EDIT:10/9/24]: Updated to include a more comprehensive graph