Ericsson / CodeCompass

CodeCompass is a software comprehension tool for large scale software written in C/C++ and Java
https://codecompass.net
GNU General Public License v3.0
520 stars 102 forks source link

Non-existent files are resolved as the working directory file #573

Open andocz opened 2 years ago

andocz commented 2 years ago

AST nodes that aren't located in any file are reported in the database as being located inside the working directory file.

As a result, the directory that the parser was called from appears in the file manager in the web UI.

This happens because cc::parser::FileLocUtil::getFilePath(const clang::SourceLocation&) returns an empty string if no file containing the SourceLocation can be found. This "path" is directly passed to cc::parser::SourceManager::getFile(const sdt::string&) in numerous places in the code. This function resolves it against the working directory, returning the same directory.

My planned solution is to make SourceManager::getFile return nullptr if a relative path (which an empty string is) is passed as the argument. This means the location_file field for these nodes will be null in the database. Does that sound like a good format?

I wanted to discuss this before implementing a fix as it requires a (small) change to the database schema.

mcserep commented 2 years ago

@andocz Can you gives us a MWE to test and validate this issue on? (An AST node without location?) Where did you find this unintended behaviour?

andocz commented 2 years ago

TinyXML2 is one example. I parsed it by invoking CodeCompass_parser while cd'd to CodeCompass's bin directory, and these nodes had the bin directory as their location:

But any C++ code should do the trick. The key is to invoke the parser from a directory unrelated to the project being parsed.

mcserep commented 2 years ago

@andocz Thanks for the quick response, we will reproduce this first.

However I don't think that the correct behaviour would be to add null value for the location_file field for these records, only if the related astValue really belongs to no file. On the screenshot you shared this is not the case for many occurrences, e.g. the main(int, const char **)::TestUtil::~TestUtil() should belong to a file, I guess to the one containing the main function.

So the source if this faulty behaviour should be examined first.

ervin7mk commented 1 year ago

I've succesfully reproduced the issue brought up by @andocz . I created a directory unrelated to the project named ParseFromHere and I invoked the parser while inside this directory. As seen in the attached screenshots, there are nodes related to this directory. ParseFromHere_found_in_File ParseFromHere_has_nodes However, there are also CodeCompass related files in the "File" table whose situation I am not sure about. Should they also appear or not? CodeCompass_in_file

ervin7mk commented 1 year ago

Hi. As we previously discussed on our CodeCompass meeting, I followed the approach @andocz presented and I found that this didn't solve the issue. This code snippet is really short but you can find it at the following link: https://github.com/ervin7mk/CodeCompass/commit/203f59b4eae8e8e63f90206949e7bc8554d8f99e Is there an error I'm commiting or maybe the problem is solvable at a larger scale?

mcserep commented 1 year ago

Hi @ervin7mk, It worked for me, are you sure you made a clean parse (-f flag), so all files are reparsed?

However, a segmentation fault occurred during parsing, so most likely the returned nullptr was dereferenced somewhere. Usages of the SourceManager::getFile() method should be checked.

ervin7mk commented 1 year ago

Hi @mcserep , Oh, I wasn't aware of the -f flag. Now I get the segmentation fault message as well. This explains a lot. Thanks! I will proceed with checking all uses of SourceManager::getFile() .

ervin7mk commented 1 year ago

Hi All! I've checked all occurences of SourceManager::getFile() and it looks like the folder (ParseFromHere) from which I parsed the project no longer appears in the file manager in CodeCompass, neither in the database. Please review my code and let me know if anything more needs to be added. Link to commit: https://github.com/Ericsson/CodeCompass/commit/9e4f9b97d2a9ec1468cd67fbee0597ed5b194194 Screenshots: parsefromhere_gone parsefromhere_gone_db