glato / emerge

Emerge is a browser-based interactive codebase and dependency visualization tool for many different programming languages. It supports some basic code quality and graph metrics and provides a simple and intuitive way to explore and analyze a codebase by using graph structures.
MIT License

Stuck in: "analysis I 👉 starting file result creation in cpp check", bombs after 5 hours when the output directory is unavailable #38

Open bruce-optibrium opened 1 year ago

bruce-optibrium commented 1 year ago

Describe the bug

Analysis is taking a long time with no indication of progress.

emerge -a CPP

Edited the paths in cpp-template.yml.

emerge -c cpp-template.yml
...
2023-02-21 17:12:41 analysis I ✅ created the filesystem graph
2023-02-21 17:12:41 analysis I 👉 starting file result creation in cpp check
...

I came back to check 4 hours later and no progress had been reported. A Python process is still running, using 1 GB of RAM. Overall CPU usage is at 12%, memory usage at 68% (27 GB of 40 GB).

Describe your environment

emerge-viz installed using pip

The code base is on the medium to large side (14055 files, of which 7674 are headers, 65440 lines). I should probably have tried emerge out on something smaller first and read the manual to see if there are any slow analysis steps worth switching off on a first pass.

bruce-optibrium commented 1 year ago

Actually I was impatient:

2023-02-21 21:27:40 analysis I ✅ scanning complete
2023-02-21 21:27:40 analysis I 👉 starting code metric calculation for analysis cpp check
2023-02-21 21:27:40 analysis I ⏩ calculating code metric results for: number of methods metric
2023-02-21 21:27:56 analysis I ⏩ calculating code metric results for: source lines of code metric
2023-02-21 21:28:01 analysis I ⏩ calculating code metric results for: tfidf metric
2023-02-21 21:29:34 analysis I ✅ done calculating code metric results
2023-02-21 21:29:35 analysis I 👉 starting graph metric calculation for analysis cpp check
2023-02-21 21:29:35 analysis I ⏩ calculating graph metric results for: louvain modularity metric
2023-02-21 21:30:29 analysis I ⏩ calculating graph metric results for: fan in out metric
2023-02-21 21:30:29 analysis I ✅ done calculating graph metric results
2023-02-21 21:30:29 analysis E ❗ export directory not found/ accessible

but at the end it fell over. I assumed it would create the output directory itself, rather than running for nearly 5 hours and then bombing out. It would be better to check for existence of the directory at the start of the analysis instead!
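
The fail-fast check suggested here could look roughly like the sketch below. It is only an illustration: `check_export_directory` is a hypothetical helper, not part of emerge, and how emerge actually reads its export path is not shown in this thread.

```python
import os
from pathlib import Path


def check_export_directory(export_dir: str) -> None:
    """Fail fast if the export directory is missing or not writable.

    Hypothetical helper for illustration; not part of emerge's API.
    """
    path = Path(export_dir)
    if not path.is_dir():
        raise FileNotFoundError(f"export directory not found: {path}")
    if not os.access(path, os.W_OK):
        raise PermissionError(f"export directory not writable: {path}")
```

Calling a check like this before the scan starts would surface the "export directory not found/ accessible" condition within seconds instead of after the full analysis.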

bruce-optibrium commented 1 year ago

It seems my analysis was slow because I attempted to analyse a build rather than just source code. The program is slow doing something with the build artifacts. I had wrongly assumed it would just process source files. It would be useful to know which artifacts are problematic and how to avoid this.

2023-02-21 22:44:49 analysis I ✅ created the filesystem graph
2023-02-21 22:44:49 analysis I 👉 starting file result creation in cpp check
2023-02-21 23:28:37 analysis I ✅ scanning complete
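
One rough way to see what a directory tree contains before pointing an analysis at it is to count files by extension. The sketch below is a generic helper (`count_extensions` is a name made up for this example, not an emerge feature), and it only shows which file types are present, not which ones emerge is slow on.

```python
from collections import Counter
from pathlib import Path


def count_extensions(root: str) -> Counter:
    """Count files under root by lowercase extension ('' for no extension)."""
    counts: Counter = Counter()
    for path in Path(root).rglob("*"):
        if path.is_file():
            counts[path.suffix.lower()] += 1
    return counts
```

For example, `count_extensions("path/to/build").most_common(10)` lists the ten most frequent extensions, which makes object files, libraries and other build outputs easy to spot next to the actual sources.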

bruce-optibrium commented 1 year ago

I now face another issue:

2023-02-21 23:28:37 analysis I 👉 starting code metric calculation for analysis cpp check
2023-02-21 23:28:37 analysis I ⏩ calculating code metric results for: number of methods metric
2023-02-21 23:28:47 analysis I ⏩ calculating code metric results for: source lines of code metric
2023-02-21 23:28:48 analysis I ⏩ calculating code metric results for: tfidf metric
2023-02-21 23:29:13 analysis I ✅ done calculating code metric results
2023-02-21 23:29:13 analysis I 👉 starting graph metric calculation for analysis cpp check
2023-02-21 23:29:13 analysis I ⏩ calculating graph metric results for: louvain modularity metric
2023-02-21 23:29:37 analysis I ⏩ calculating graph metric results for: fan in out metric
2023-02-21 23:29:37 analysis I ✅ done calculating graph metric results
Traceback (most recent call last):
  File "C:\Users\BruceAdams\AppData\Roaming\Python\Python39\site-packages\networkx\drawing\nx_agraph.py", line 133, in to_agraph
    import pygraphviz
  File "C:\Users\BruceAdams\AppData\Roaming\Python\Python39\site-packages\pygraphviz\__init__.py", line 24, in <module>
    from .agraph import AGraph, Node, Edge, Attribute, ItemAttribute, DotError
  File "C:\Users\BruceAdams\AppData\Roaming\Python\Python39\site-packages\pygraphviz\agraph.py", line 15, in <module>
    from . import graphviz as gv
  File "C:\Users\BruceAdams\AppData\Roaming\Python\Python39\site-packages\pygraphviz\graphviz.py", line 13, in <module>
    from . import _graphviz

ImportError: DLL load failed while importing _graphviz: The specified module could not be found.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:\Program Files\Python39\lib\runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Program Files\Python39\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\Users\BruceAdams\AppData\Roaming\Python\Python39\Scripts\emerge.exe\__main__.py", line 7, in <module>
  File "C:\Users\BruceAdams\AppData\Roaming\Python\Python39\site-packages\emerge\main.py", line 13, in run
    emerge.start()
  File "C:\Users\BruceAdams\AppData\Roaming\Python\Python39\site-packages\emerge\appear.py", line 91, in start
    self.start_analyzing()
  File "C:\Users\BruceAdams\AppData\Roaming\Python\Python39\site-packages\emerge\appear.py", line 112, in start_analyzing
    analyzer.start_analyzing()
  File "C:\Users\BruceAdams\AppData\Roaming\Python\Python39\site-packages\emerge\analyzer.py", line 51, in start_analyzing
    self.start_scanning(analysis)
  File "C:\Users\BruceAdams\AppData\Roaming\Python\Python39\site-packages\emerge\analyzer.py", line 103, in start_scanning
    analysis.export()
  File "C:\Users\BruceAdams\AppData\Roaming\Python\Python39\site-packages\emerge\analysis.py", line 282, in export
    DOTExporter.export_graph_as_dot(representation.digraph,
  File "C:\Users\BruceAdams\AppData\Roaming\Python\Python39\site-packages\emerge\export.py", line 246, in export_graph_as_dot
    write_dot(graph, export_dir + '/' + 'emerge-' + export_name + '.dot')
  File "C:\Users\BruceAdams\AppData\Roaming\Python\Python39\site-packages\networkx\drawing\nx_agraph.py", line 194, in write_dot
    A = to_agraph(G)
  File "C:\Users\BruceAdams\AppData\Roaming\Python\Python39\site-packages\networkx\drawing\nx_agraph.py", line 135, in to_agraph
    raise ImportError(
ImportError: requires pygraphviz http://pygraphviz.github.io/

but pygraphviz is already installed:

>pip install pygraphviz
Defaulting to user installation because normal site-packages is not writeable
Requirement already satisfied: pygraphviz in c:\users\bruceadams\appdata\roaming\python\python39\site-packages (1.10)

glato commented 1 year ago

@bruce-optibrium Thank you for your feedback. Emerge does provide an option to activate debug logging in the config file, which could help to identify possible issues or to track progress better. You could also use the logging command-line arguments mentioned in the README, e.g. -v or --verbose for verbose logging, or -d or --debug for even more detailed debug logging.

Your idea "It would be better to check for existence of the directory at the start of the analysis instead!" is great; I will plan this and include it in an upcoming fix/release 👍.

I'm not quite sure why the import of graphviz fails on Windows. Can you try removing the dot export config option, i.e. this:

export:
  - dot

as a workaround, to check if the analysis will run without generating the graphviz output? Maybe try to run it first on a smaller codebase, or simply on a subdirectory, to reduce the running time for this first check.
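
To check the import on its own, without waiting for a full analysis, a small script along the following lines may help. It is only a diagnostic sketch; `try_import` is a name invented for this example. Note that pip reporting pygraphviz as installed only confirms that the package files are present. The compiled `_graphviz` extension additionally has to find the Graphviz DLLs when it loads, and that is the step that fails in the traceback above.

```python
import importlib


def try_import(module_name: str):
    """Return (True, None) if the module imports, else (False, error text)."""
    try:
        importlib.import_module(module_name)
        return True, None
    except ImportError as exc:
        # "DLL load failed" errors on Windows are raised as ImportError too.
        return False, f"{type(exc).__name__}: {exc}"
```

Running `try_import("pygraphviz")` in the same Python installation that runs emerge should reproduce the DLL error in a few seconds, which makes it quicker to test changes to the Graphviz installation.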

Hope this will help you.