Ericsson / codechecker

CodeChecker is an analyzer tooling, defect database and viewer extension for static and dynamic analyzer tools.
https://codechecker.readthedocs.io
Apache License 2.0

Analysis results containing Windows file paths not handled by `server` #3654

Open SevHub opened 2 years ago

SevHub commented 2 years ago

Hello, I am trying to use CodeChecker store to send reports to a server, but that does not work as expected:

I run the analysis on a Windows 10 PC with clang-tidy. The clang-tidy report is converted into plist files with this project's converter tool (report-converter). The plist files reference the source files by their full path, e.g.:

    <array>
        <string>H:\repos\projects\_BUS\CPU_WinPC\Software\Neutralproject\rc_lib\Ctr_lib\inc\ctr_bas.h</string>
        <string>H:\repos\projects\_BUS\CPU_WinPC\Software\Neutralproject\soft.xxx\CTR\APPL\projectdefines.h</string>
        <string>H:\repos\projects\_BUS\CPU_WinPC\Software\Neutralproject\soft.xxx\CTR\System\ce.c</string>
    </array>

Then I upload the reports to an Ubuntu machine running a CodeChecker server, using the CodeChecker store command. The process looks good on the client side; no errors are reported. On the server a run is created, but it is basically empty, containing no information. In the server log I find many of these errors:

[ERROR][2022-04-20 13:25:04] {server} [25289] <140079552255808> - mass_store_run.py:409 __store_source_files() - File ID for /home/sev/.codechecker/tmptwoeu6ad/root/H:\repos\projects\_BUS\CPU_WinPC\Software\Neutralproject\soft.xxx\CTR\System\vis.c is not found in the DB with content hash 8f1ee6035bffbd31504fff06ae7b62f8b937deddb4fca53f54b26851c1ec4a85. Missing from ZIP?

and later many of these errors appear:

[WARNING][2022-04-20 13:25:05] {server} [25289] <140079552255808> - mass_store_run.py:866 __process_report_file() - Failed to get database id for file path '/home/sev/.codechecker/tmptwoeu6ad/reports/362d7e1b01f2458f4dba9c951e5979f0/H:\repos\projects\_BUS\CPU_WinPC\Software\Neutralproject\soft.xxx\Sim\doktabl.cpp /home/sev/.codechecker/tmptwoeu6ad/reports/362d7e1b01f2458f4dba9c951e5979f0/H:\repos\projects\_BUS\CPU_WinPC\Software\Neutralproject\rc_lib\Ctr_lib\inc\ctr_bas.h'! Skip adding report: /home/sev/.codechecker/tmptwoeu6ad/reports/362d7e1b01f2458f4dba9c951e5979f0/H:\repos\projects\_BUS\CPU_WinPC\Software\Neutralproject\soft.xxx\Sim\doktabl.cpp:587:3 [clang-analyzer-security.insecureAPI.strcpy]

I suspect there is something wrong with the file path prefixes, because the full path of the file from the Windows system appears in the logs of the Linux system. I tried the --trim-path-prefix option of the store command with different variants of the path prefix. However, this seems to change nothing on the server side; the same errors appear every time.

Is the --trim-path-prefix even working? Am I doing something wrong?

Version info:

    CodeChecker version on client: Git tag information | 6.19.1
    CodeChecker version on server: Git tag information | 6.19.1
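The shape of the paths in the server log above can be reproduced in a few lines. This is only an illustrative sketch: it assumes the server joins its temporary directory with the path taken from the plist using POSIX rules, which the log output suggests but the thread does not confirm. The variable names are hypothetical.

```python
import posixpath
from pathlib import PurePosixPath, PureWindowsPath

# Temporary directory as it appears in the server log (illustrative only).
server_root = "/home/sev/.codechecker/tmptwoeu6ad/root"
# A Windows path as stored in the plist file.
win_path = r"H:\repos\projects\_BUS\CPU_WinPC\Software\Neutralproject\soft.xxx\CTR\System\vis.c"

# Under POSIX rules a backslash is an ordinary character, so the whole
# Windows path is treated as one relative file name and appended unchanged.
joined = posixpath.join(server_root, win_path)
print(joined)

# POSIX rules see a single path component; Windows rules see the real structure.
print(len(PurePosixPath(win_path).parts))
print(PureWindowsPath(win_path).parts[:3])
```

The joined result has the same form as the path in the `__store_source_files()` error: a Linux directory followed by an unmodified `H:\...` string.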

tru commented 2 years ago

I have the same problem. The reason seems to be that the encoded file names are not harmonized in any way, both where they are added to the ZIP file and in the JSON data. I am looking into adding a function that harmonizes the file names for all platforms. I think it makes sense for all paths to use UNIX-style / separators.

I will poke at it a bit and try to get it to work.

ericLemanissier commented 1 year ago

I have just been bitten by this issue too, cf. https://github.com/Ericsson/codechecker/issues/3814. If any of you has an idea of something I could try, I'd be happy to!

whisperity commented 1 year ago

It seems spot-rewriting all the paths to be POSIX-y when assembling the ZIP (and also changing the contents of the plists) might be a viable solution. It is bound to get ugly in presentation, however: c:/foo/bar

If we want to get extra pedantic here, we could do what Wine and Lutris are doing and rewrite C:\ as drive_c, and only then normalise the paths replacing \ with /. Extra care needs to be taken for path components that contain spaces or special characters as they need to be escaped on the POSIX side of things.

    ~/Games $ ls -alh ./dosdevices
    c:  ->  ../drive_c
    d:  ->  ../drive_d
    z:  ->  /

The only big problem site I can envision with this is the "diff a server's contents against a local report directory"...
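The rewrite described in this comment could look roughly like the sketch below. `to_posix_style` is a hypothetical helper, not part of CodeChecker; it assumes the input is a Windows-style path and only covers the drive-letter case, leaving UNC paths and any escaping of special characters aside.

```python
from pathlib import PureWindowsPath


def to_posix_style(win_path: str) -> str:
    """Hypothetical helper: map 'C:\\foo\\bar' to 'drive_c/foo/bar'."""
    p = PureWindowsPath(win_path)
    drive = p.drive  # e.g. 'C:'; empty for paths without a drive letter
    if len(drive) == 2 and drive[1] == ":":
        # parts[0] is the anchor (e.g. 'C:\\'); the rest are plain components.
        rest = p.parts[1:]
        return "/".join(["drive_" + drive[0].lower(), *rest])
    # No drive letter: only swap the separators.
    return p.as_posix()


print(to_posix_style(r"H:\repos\projects\_BUS\CPU_WinPC\Software\Neutralproject\soft.xxx\CTR\System\ce.c"))
print(to_posix_style(r"inc\ctr_bas.h"))
```

The same mapping would have to be applied both to the ZIP member names and to the paths inside the plist files so that the two stay consistent.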

ericLemanissier commented 1 year ago

That would be great. Note that you need to replace the c: part of the path with something like drive_c, because otherwise ZipFile just removes the drive prefix.
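A small sketch of why losing the drive prefix matters. It uses `ntpath.splitdrive` from the standard library to apply Windows path rules on any operating system; that this mirrors what `ZipFile` does internally is an assumption here, based on the comment above.

```python
import ntpath

# Splitting off the drive, as a drive-stripping archive writer would do.
drive, tail = ntpath.splitdrive("c:/foo/bar")
print(drive, tail)

# Once the drive is dropped, files on different drives become indistinguishable.
on_c = ntpath.splitdrive(r"C:\src\main.c")[1]
on_d = ntpath.splitdrive(r"D:\src\main.c")[1]
print(on_c == on_d)
```

Encoding the drive as an ordinary path component such as `drive_c` keeps that information in the archive member name.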

Wraiyth commented 4 months ago

We have encountered this issue as our last main blocker to have CodeChecker running on Windows. Is there a suggested fix for this at this stage?