Closed JonoYang closed 6 years ago
@JonoYang This is highly speculative, but at first blush it looks like the file count is off, perhaps reminiscent of the missing files issue we experienced a few weeks back. As I recall, that had something to do with how the codebases for the pair of scans were defined.
Is that a fair description of that cause? And is there a chance that your eCos
scans bear some similarity to that earlier set of scans?
Yes, something weird is going on here; I will take a look.
Looks like there is a discrepancy between files_count and len(index) for some scans.
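A minimal sketch of how such a discrepancy could be checked, assuming a scan is a dict with a `files_count` field and a `files` list whose entries carry a `path`. The helper names and the sample data are hypothetical and are not taken from the DeltaCode code base:

```python
def index_by_path(files):
    """Build a naive index that keeps only one file per path (last one wins)."""
    return {f["path"]: f for f in files}


def count_discrepancy(scan):
    """Return (files_count, len(index)) for a scan-like dict."""
    index = index_by_path(scan["files"])
    return scan["files_count"], len(index)


# Hypothetical scan data: two entries end up with the same path.
scan = {
    "files_count": 3,
    "files": [
        {"path": "a/b.c"},
        {"path": "a/b.c"},
        {"path": "a/d.c"},
    ],
}

files_count, indexed = count_discrepancy(scan)
print(files_count, indexed)  # 3 2
```

If the two numbers differ, at least two files shared a key and one of them was silently dropped from the index.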
Still hammering out the details, but we at least have some initial tests written that reproduce this behavior.
At first glance, it looks like the indexing process fails to index paths that have been aligned to ''.
This happens after scan alignment.
OK, I have figured out the main cause: we are experiencing hash collisions during our file indexing.
This was expected for things like sha1 indexing, but I underestimated how likely it was to happen for path as well (especially since we run align_scan etc. during the delta).
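One common way to make an index tolerate colliding keys is to map each key to a list of files instead of a single file. The sketch below illustrates that idea only; it is not necessarily the change made for this issue, and `index_files` and the sample records are hypothetical:

```python
from collections import defaultdict


def index_files(files, key="path"):
    """Map each key value to the list of all files sharing it."""
    index = defaultdict(list)
    for f in files:
        index[f[key]].append(f)
    return index


# Hypothetical file records: two share a path, two share a sha1.
files = [
    {"path": "src/main.c", "sha1": "aaa"},
    {"path": "src/main.c", "sha1": "bbb"},
    {"path": "src/util.c", "sha1": "aaa"},
]

by_path = index_files(files)
by_sha1 = index_files(files, key="sha1")

# No file is lost: the entries across all buckets add up to len(files).
total = sum(len(bucket) for bucket in by_path.values())
print(len(by_path), total)  # 2 3
```

With this shape, the number of keys can be smaller than the number of files without any file being overwritten, and the same function works for path, sha1, or any other field.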
So, this fix will require a bit more work than anticipated, but handling it will allow us to tackle other problems more easily (moved files etc.). We would have had to make this change at some point, so we are not in a bad place.
Fixed with #18
I ran ScanCode with the following options (-clipeu) on version 2.0 of eCos and the latest HEAD of the eCos CVS repo. Afterwards, I ran DeltaCode on the report files and got the following issue:
Attached are the input files I used to get this error: ecos-scans.zip