Ericsson / CodeCompass

CodeCompass is a software comprehension tool for large scale software written in C/C++ and Java
https://codecompass.net
GNU General Public License v3.0

Incremental parsing fails with segmentation fault #735

Closed dbukki closed 4 months ago

dbukki commented 5 months ago

As the title suggests, parsing anything in incremental mode (without a full re-parse) crashes CodeCompass with a segmentation fault. The issue was first reported in https://github.com/Ericsson/CodeCompass/pull/714#issuecomment-2078781513 but was found to be present in the current master (https://github.com/Ericsson/CodeCompass/commit/8e84d84e29a0cec6cb0af9f6dcc587ea9ff34480) as well.

dbukki commented 5 months ago

I have identified two major issues that contribute to the failure of incremental parsing:

[1] The segmentation fault comes from CppMetricsParser::CppMetricsParser during plugin creation: https://github.com/Ericsson/CodeCompass/blob/8e84d84e29a0cec6cb0af9f6dcc587ea9ff34480/plugins/cpp_metrics/parser/src/cppmetricsparser.cpp#L47

node->location.file has not been loaded with .load(). A read attempt is made on the ID of a non-loaded File object.

Conclusion: This query should be refactored into a view that uses JOINs to achieve the same result without the need for a query_one for each AST node and .load() for each file.
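
To make the failure mode concrete, here is a minimal sketch of the pattern. It does not use ODB or the CodeCompass model: File, FileTable, LazyFileRef and fileIdOf are hypothetical stand-ins invented for illustration. The toy lazy reference only knows the ID of its target and holds a null pointer until load() is called, so reading a member through it before loading is undefined behaviour.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <unordered_map>

// Hypothetical stand-ins for illustration only (not ODB or CodeCompass types).
struct File {
  unsigned long long id;
  std::string path;
};

// A toy "database" table: file ID -> File.
using FileTable = std::unordered_map<unsigned long long, File>;

// Toy lazy reference: it remembers the target ID, but holds no object
// until load() has been called successfully.
class LazyFileRef {
public:
  explicit LazyFileRef(unsigned long long id) : _id(id) {}

  bool loaded() const { return static_cast<bool>(_obj); }

  // Fetches the object from the table; returns false if the ID is unknown.
  bool load(const FileTable& table) {
    auto it = table.find(_id);
    if (it == table.end())
      return false;
    _obj = std::make_shared<File>(it->second);
    return true;
  }

  // Null until load() succeeds. Dereferencing the result before that
  // is the bug pattern described above.
  const File* get() const { return _obj.get(); }

private:
  unsigned long long _id;
  std::shared_ptr<File> _obj;
};

// Guarded accessor: loads on demand and reports failure through the
// return value instead of dereferencing a null pointer.
bool fileIdOf(LazyFileRef& ref, const FileTable& table,
              unsigned long long& out) {
  if (!ref.loaded() && !ref.load(table))
    return false;
  out = ref.get()->id;
  return true;
}
```

A guard like this only removes the crash. It still costs one lookup per node, which is why a single JOIN-based view remains the better fix.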

[2] Once that's fixed, a second problem arises in SourceManager::removeFile during the global cleanup phase: https://github.com/Ericsson/CodeCompass/blob/8e84d84e29a0cec6cb0af9f6dcc587ea9ff34480/parser/src/sourcemanager.cpp#L243

.size() is being called on the results of an ODB query that has not been cached prior to this via .cache(). (See https://www.codesynthesis.com/products/odb/doc/manual.xhtml#4.4 )

Worse, even if we did call .cache(), .size() would still always throw odb::result_not_cached when we use SQLite for parsing. I actually ran into this limitation very recently in https://github.com/Ericsson/CodeCompass/pull/734/files#diff-24d453f78f735d12a81ddf3aa0350be52c74b9d362c63a3c3532e5ca6ad4e6dfR148 . (See https://www.codesynthesis.com/products/odb/doc/manual.xhtml#18.5.1 )

Conclusion: We should eliminate all .size() and .cache() calls on ODB query results from CodeCompass. For as long as we support SQLite, it's a potential source of exceptions.
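
As a rough sketch of what the replacement could look like: emptiness can be checked by comparing begin() with end(), and a count can be gathered while iterating. The code below does not use ODB. UncachedResult is a hypothetical stand-in whose size() always throws (std::logic_error here, standing in for odb::result_not_cached), and isEmpty and forEachCounted are illustrative helper names, not existing CodeCompass functions.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical stand-in for an uncached query result: it can be iterated,
// but asking for its size always throws.
class UncachedResult {
public:
  explicit UncachedResult(std::vector<int> rows) : _rows(std::move(rows)) {}

  std::size_t size() const {
    throw std::logic_error("result not cached");
  }

  std::vector<int>::const_iterator begin() const { return _rows.begin(); }
  std::vector<int>::const_iterator end() const { return _rows.end(); }

private:
  std::vector<int> _rows;
};

// Emptiness check that never touches size().
template <typename Result>
bool isEmpty(const Result& result) {
  return result.begin() == result.end();
}

// Processes every row and returns how many were seen. A real streaming
// result may be consumed by iteration, so the counting is done in the
// same single pass as the processing.
template <typename Result, typename Fn>
std::size_t forEachCounted(Result& result, Fn fn) {
  std::size_t count = 0;
  for (const auto& row : result) {
    fn(row);
    ++count;
  }
  return count;
}
```

Where only "is there at least one row" matters, as is often the case before a delete, the isEmpty form is enough and avoids walking the whole result.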

mcserep commented 5 months ago

I have verified that the bug is present on the master branch, but it does not occur with the --skip cxxmetricsparser flag. The release/gershwin branch, which predates the introduction of the C++ metrics plugin, is not affected either.

mcserep commented 5 months ago

> I have identified two major issues that contribute to the failure of incremental parsing:
>
> [1] The segmentation fault comes from CppMetricsParser::CppMetricsParser during plugin creation:
>
> https://github.com/Ericsson/CodeCompass/blob/8e84d84e29a0cec6cb0af9f6dcc587ea9ff34480/plugins/cpp_metrics/parser/src/cppmetricsparser.cpp#L47
>
> node->location.file has not been loaded with .load(). A read attempt is made on the ID of a non-loaded File object.
>
> Conclusion: This query should be refactored into a view that uses JOINs to achieve the same result without the need for a query_one for each AST node and .load() for each file.

@dbukki Nice catch :clap:

> [2] Once that's fixed, a second problem arises in SourceManager::removeFile during the global cleanup phase:
>
> https://github.com/Ericsson/CodeCompass/blob/8e84d84e29a0cec6cb0af9f6dcc587ea9ff34480/parser/src/sourcemanager.cpp#L243
>
> .size() is being called on the results of an ODB query that has not been cached prior to this via .cache(). (See https://www.codesynthesis.com/products/odb/doc/manual.xhtml#4.4 )
>
> Worse, even if we did call .cache(), .size() would still always throw odb::result_not_cached when we use SQLite for parsing. I actually ran into this limitation very recently in https://github.com/Ericsson/CodeCompass/pull/734/files#diff-24d453f78f735d12a81ddf3aa0350be52c74b9d362c63a3c3532e5ca6ad4e6dfR148 . (See https://www.codesynthesis.com/products/odb/doc/manual.xhtml#18.5.1 )
>
> Conclusion: We should eliminate all .size() and .cache() calls on ODB query results from CodeCompass. For as long as we support SQLite, it's a potential source of exceptions.

@dbukki We should not optimize for SQLite support, as that is only for development purposes. Dropping SQLite support completely has also been considered several times during the project's lifetime. If incremental parsing conflicts with SQLite, then we can simply not support incremental parsing with SQLite.