blackducksoftware / ohcount4j

Line counting and language identification tool
Apache License 2.0

Symbolic links in source trees cause ohcount to double-count source files #17

Closed dianedownie closed 6 years ago

dianedownie commented 8 years ago

If a source tree contains symbolic links, ohcount4j computes statistics for every path at which it finds a file, so a file reachable through a symlink is counted more than once and its share of the source code is over-represented.

Example project: https://github.com/schubergphilis/Seccubus_v2.git

The project contains 11 files called table.js:

$ find -name "table.js"
./jmvc/seccubus/notification/table/table.js
./jmvc/seccubus/issue/table/table.js
./jmvc/seccubus/workspace/table/table.js
./jmvc/seccubus/custsql/table/table.js
./jmvc/seccubus/finding/table/table.js
./jmvc/seccubus/history/table/table.js
./jmvc/seccubus/scan/table/table.js
./jmvc/seccubus/issuelink/table/table.js
./jmvc/seccubus/asset/host/table/table.js
./jmvc/seccubus/asset/table/table.js
./jmvc/seccubus/run/table/table.js

ohcount4j finds twice that many, because www/dev is a symbolic link to jmvc:

$ ls -lq www/dev
lrwxrwxrwx 1 ddownie ddownie 8 Mar 28 16:53 www/dev -> ../jmvc/

$ ./ohcount4j -d ../scan/Seccubus_v2 | grep -w "table.js"
JavaScript  ../scan/Seccubus_v2/jmvc/seccubus/asset/host/table/table.js
JavaScript  ../scan/Seccubus_v2/jmvc/seccubus/asset/table/table.js
JavaScript  ../scan/Seccubus_v2/jmvc/seccubus/custsql/table/table.js
JavaScript  ../scan/Seccubus_v2/jmvc/seccubus/finding/table/table.js
JavaScript  ../scan/Seccubus_v2/jmvc/seccubus/history/table/table.js
JavaScript  ../scan/Seccubus_v2/jmvc/seccubus/issue/table/table.js
JavaScript  ../scan/Seccubus_v2/jmvc/seccubus/issuelink/table/table.js
JavaScript  ../scan/Seccubus_v2/jmvc/seccubus/notification/table/table.js
JavaScript  ../scan/Seccubus_v2/jmvc/seccubus/run/table/table.js
JavaScript  ../scan/Seccubus_v2/jmvc/seccubus/scan/table/table.js
JavaScript  ../scan/Seccubus_v2/jmvc/seccubus/workspace/table/table.js
JavaScript  ../scan/Seccubus_v2/www/dev/seccubus/asset/host/table/table.js
JavaScript  ../scan/Seccubus_v2/www/dev/seccubus/asset/table/table.js
JavaScript  ../scan/Seccubus_v2/www/dev/seccubus/custsql/table/table.js
JavaScript  ../scan/Seccubus_v2/www/dev/seccubus/finding/table/table.js
JavaScript  ../scan/Seccubus_v2/www/dev/seccubus/history/table/table.js
JavaScript  ../scan/Seccubus_v2/www/dev/seccubus/issue/table/table.js
JavaScript  ../scan/Seccubus_v2/www/dev/seccubus/issuelink/table/table.js
JavaScript  ../scan/Seccubus_v2/www/dev/seccubus/notification/table/table.js
JavaScript  ../scan/Seccubus_v2/www/dev/seccubus/run/table/table.js
JavaScript  ../scan/Seccubus_v2/www/dev/seccubus/scan/table/table.js
JavaScript  ../scan/Seccubus_v2/www/dev/seccubus/workspace/table/table.js
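
For illustration, one way a directory walk could count each file only once is to resolve every path to its canonical form and skip anything already seen. The sketch below is in Python, not ohcount4j's Java, and it is not how ohcount4j is implemented; the function name `unique_source_files` is hypothetical. It assumes that following symlinked directories is wanted at all (simply not following them would be an alternative).

```python
import os


def unique_source_files(root):
    """Return one path per distinct file under root.

    Hypothetical helper: symlinked directories are followed, but a
    directory or file already reached through another path is skipped.
    """
    seen_dirs = set()   # canonical paths of directories already walked
    seen_files = set()  # canonical paths of files already reported
    unique = []

    for dirpath, dirnames, filenames in os.walk(root, followlinks=True):
        dirnames.sort()  # deterministic traversal order
        real_dir = os.path.realpath(dirpath)
        if real_dir in seen_dirs:
            # Reached before via another path (e.g. www/dev -> ../jmvc):
            # do not descend again. This also stops symlink cycles.
            dirnames[:] = []
            continue
        seen_dirs.add(real_dir)

        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            real = os.path.realpath(path)
            if not os.path.isfile(real):
                continue  # broken link or special file
            if real in seen_files:
                continue  # same file via a file-level symlink
            seen_files.add(real)
            unique.append(path)

    return unique
```

On a tree like the one above, where www/dev points at jmvc, such a walk would list each table.js under whichever path it reaches first and ignore the second route to it.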

As a result, the summary statistics are skewed: find reports 1509 regular files, while ohcount4j reports a total of 2410.

~/scan/Seccubus_v2$ find -type f | wc -l
1509
$ ./ohcount4j ../scan/Seccubus_v2/
                            Ohcount4j Line Count Summary

Language                  Files       Code    Comment  Comment %      Blank      Total
------------------------  -----  ---------  ---------  ---------  ---------  ---------
JavaScript                 1311     188804      62718      24.9%      35634     287156
Unknown                     711     168361          0       0.0%       8355     176716
XML                          30      30576        268       0.9%       5184      36028
CSS                         298      12028        394       3.2%        524      12946
HTML                        511      11824       3474      22.7%       1519      16817
Perl                         39       4693       2194      31.9%       1843       8730
Windows Batch                 8        268         60      18.3%         54        382
Ruby                          1        268         21       7.3%         35        324
Shell                         3         96         46      32.4%         13        155
Python                        1         72         21      22.6%         12        105
------------------------  -----  ---------  ---------  ---------  ---------  ---------
Total                      2410     416990      69196      14.2%      53173     539359

PDegenPortnoy commented 6 years ago

Moving the project to archive status. Therefore, closing all issues.