Open chinyeungli opened 6 years ago
The functionality you are looking for is related to another ticket, https://github.com/nexB/deltacode/issues/4
Since you’ve nested the jar file one level lower the second time, deltacode infers this as a removal followed by an addition, when in reality that file has just been moved to another location.
I’ll keep this ticket open for reference and close it when #4 gets taken care of.
@majurg not exactly the same issue.
For instance:
In case (1), I have
/d1/balloontip-1.1.1.jar
/d2/balloontip-1.1.1.jar
DeltaCode then reports that d1 and d2 are the same, which is correct.
In case (2), I have
/d1/balloontip-1.1.1.jar
/d1/test/balloontip-1.1.1.jar
/d2/balloontip-1.1.1.jar
Here I expect DeltaCode to tell me that /d1/balloontip-1.1.1.jar and /d2/balloontip-1.1.1.jar are the same and that /d1/test/balloontip-1.1.1.jar is new.
However, the tool tells me that BOTH
/d1/balloontip-1.1.1.jar
/d1/test/balloontip-1.1.1.jar
are added and that
/d2/balloontip-1.1.1.jar
is removed.
So my question is: why does adding a new directory turn files that are supposed to be the same into "added"/"removed"?
@majurg Here's what we have after scanning and running DeltaCode on these 3 pairs of test codebases. The results don't seem to be entirely consistent. Putting aside the fact that we currently treat files as moved only when there's a single identical added and removed file, I think the inconsistent treatment arises at least in part from the way we remove path segments during the fix_trees()/align_trees() process.
Compare d1 to d2. DeltaCode treats this as unmodified.
{
  "deltacode_version": "0.0.1.beta",
  "deltacode_stats": {
    "added": 0,
    "modified": 0,
    "moved": 0,
    "removed": 0,
    "unmodified": 1
  },
  "deltas": [
    {
      "category": "unmodified",
      "path": "balloontip-1.1.1.jar",
      "name": "balloontip-1.1.1.jar",
      "type": "file",
      "size": 53842
    }
  ]
}
Add a test subdirectory to d1, also containing balloontip-1.1.1.jar. DeltaCode treats this as 2 removed, 1 added.
{
  "deltacode_version": "0.0.1.beta",
  "deltacode_stats": {
    "added": 1,
    "modified": 0,
    "moved": 0,
    "removed": 2,
    "unmodified": 0
  },
  "deltas": [
    {
      "category": "added",
      "path": "balloontip_new_test_subdirectory_new/d2/balloontip-1.1.1.jar",
      "name": "balloontip-1.1.1.jar",
      "type": "file",
      "size": 53842
    },
    {
      "category": "removed",
      "path": "balloontip_new_test_subdirectory_old/d1/balloontip-1.1.1.jar",
      "name": "balloontip-1.1.1.jar",
      "type": "file",
      "size": 53842
    },
    {
      "category": "removed",
      "path": "balloontip_new_test_subdirectory_old/d1/test/balloontip-1.1.1.jar",
      "name": "balloontip-1.1.1.jar",
      "type": "file",
      "size": 53842
    }
  ]
}
Add a root directory above d1. DeltaCode treats this as unmodified.
{
  "deltacode_version": "0.0.1.beta",
  "deltacode_stats": {
    "added": 0,
    "modified": 0,
    "moved": 0,
    "removed": 0,
    "unmodified": 1
  },
  "deltas": [
    {
      "category": "unmodified",
      "path": "balloontip-1.1.1.jar",
      "name": "balloontip-1.1.1.jar",
      "type": "file",
      "size": 53842
    }
  ]
}
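To illustrate the "single identical added and removed file" rule mentioned above, here is a minimal sketch of how such a pairing could work. It is not DeltaCode's actual code: the function name pair_moves, the (path, sha1) tuple shape, and the placeholder sha1 strings are all assumptions made for the example.

```python
from collections import defaultdict


def pair_moves(added, removed):
    """Hypothetical helper: pair added and removed files as moves.

    `added` and `removed` are lists of (path, sha1) tuples. A pair counts as
    a move only when exactly one added and exactly one removed file share a
    sha1; anything else stays added or removed.
    """
    added_by_sha1 = defaultdict(list)
    removed_by_sha1 = defaultdict(list)
    for path, sha1 in added:
        added_by_sha1[sha1].append(path)
    for path, sha1 in removed:
        removed_by_sha1[sha1].append(path)

    moved, still_added, still_removed = [], [], []
    for sha1, new_paths in added_by_sha1.items():
        old_paths = removed_by_sha1.get(sha1, [])
        if len(new_paths) == 1 and len(old_paths) == 1:
            moved.append((old_paths[0], new_paths[0]))
        else:
            still_added.extend(new_paths)
    for sha1, old_paths in removed_by_sha1.items():
        new_paths = added_by_sha1.get(sha1, [])
        if not (len(old_paths) == 1 and len(new_paths) == 1):
            still_removed.extend(old_paths)
    return moved, still_added, still_removed


# Second test pair: two removed files and one added file share a sha1,
# so the one-to-one rule does not apply and nothing is reported as moved.
moved, still_added, still_removed = pair_moves(
    added=[("d2/balloontip-1.1.1.jar", "abc")],  # "abc" is a placeholder sha1
    removed=[
        ("d1/balloontip-1.1.1.jar", "abc"),
        ("d1/test/balloontip-1.1.1.jar", "abc"),
    ],
)
```

Under this one-to-one rule, the second test pair falls through to plain added/removed entries, which is consistent with the 2 removed, 1 added result shown above.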
@chinyeungli @johnmhoran @majurg how about we ignore scan alignment for now, after the integration of VirtualCodebase? Maybe we could follow up on it in a separate branch, apart from the main branch.
Let's consider the following directory structure:
New directory
Old Directory
Now if a1.py has the same sha1 in both, we treat the two files as unchanged and completely ignore that their main directories are different, mainly owing to the alignment step (align scans / fix trees).
But what if, instead, we do not allow the main root directory to change?
What I propose is that we also compare files by their main root directory (not just their subdirectory).
If we do so, we do not need the extra burden of aligning the scans.
The scans will already be aligned, as they are loaded from the Virtual Codebase as resource objects.
We can safely skip all alignment.
For a file to have the status unmodified, it must have the same full path and the same sha1.
For a file to have the status moved, it must exist in some other subdirectory in new_scan and have the same sha1.
And so on for the other statuses.
The codebase would also be a lot cleaner than it is now.
@chinyeungli @johnmhoran @majurg I'd like your views on this.
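The rules proposed above could be sketched roughly as follows. This is only an illustration under stated assumptions: each scan is reduced to a plain dict mapping full path to sha1, and the function name classify and the sample paths are hypothetical, not part of DeltaCode or the Virtual Codebase API.

```python
def classify(old_scan, new_scan):
    """Hypothetical sketch: categorize files by full path and sha1 only.

    `old_scan` and `new_scan` map full path -> sha1. No path alignment is
    performed, so the root directory is part of the comparison.
    """
    deltas = {}

    # Same full path in both scans: unmodified or modified, by sha1.
    for path in old_scan.keys() & new_scan.keys():
        same = old_scan[path] == new_scan[path]
        deltas[path] = "unmodified" if same else "modified"

    only_old = old_scan.keys() - new_scan.keys()
    only_new = new_scan.keys() - old_scan.keys()
    old_sha1s = {old_scan[p] for p in only_old}
    new_sha1s = {new_scan[p] for p in only_new}

    # A path present on one side only is "moved" when its sha1 shows up
    # under a different path on the other side. Both the old and the new
    # location are labeled "moved" in this simplified sketch.
    for path in only_new:
        deltas[path] = "moved" if new_scan[path] in old_sha1s else "added"
    for path in only_old:
        deltas[path] = "moved" if old_scan[path] in new_sha1s else "removed"
    return deltas


# Placeholder sha1 values, for illustration only.
old = {"src/a1.py": "s1", "src/b.py": "s2", "src/gone.py": "s3"}
new = {"src/a1.py": "s1", "lib/b.py": "s2", "src/new.py": "s4"}
result = classify(old, new)
```

One consequence worth noting: with full-path comparison, the same file under two different root directories is no longer reported as unmodified, since its paths differ.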
@Pratikrocks I agree with removing/ignoring alignment for the first implementation of adding virtualcodebase.
Okay
I did some simple tests and here are my findings:
I used "balloontip-1.1.1.jar" as a sample file.
(1) Created 2 directories, d1/ and d2/, put the test file in each, and then compared these 2 directories. The output is unchanged, which is correct.
(2) Same setup as (1), but created a new subdirectory named test/ under d1/ and put balloontip-1.1.1.jar in it. Both d1/balloontip-1.1.1.jar and d1/test/balloontip-1.1.1.jar are returned as added, and d2/balloontip-1.1.1.jar is returned as removed. This is not correct: d1/balloontip-1.1.1.jar and d2/balloontip-1.1.1.jar should be returned as unchanged, while d1/test/balloontip-1.1.1.jar should be considered added.
(3) Created a root/ directory, put d1/ in it, and ran deltacode from root/ to d2/. The output is unchanged, which is correct.
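For anyone who wants to reproduce the three setups, a small script along these lines could build the directory layouts. It is only a sketch: the case1/case2/case3 folder names and the placeholder file contents are assumptions, and a real jar would need to be copied in instead. Scanning the directories and running deltacode on each pair is left out here.

```python
from pathlib import Path
import tempfile

JAR = "balloontip-1.1.1.jar"


def build_cases(base, payload=b"placeholder jar bytes"):
    """Create the three test layouts under `base` and return them.

    The same payload is written everywhere so that every copy of the file
    has an identical checksum, as in the tests described above.
    """
    base = Path(base)
    layouts = {
        "case1": [f"d1/{JAR}", f"d2/{JAR}"],
        "case2": [f"d1/{JAR}", f"d1/test/{JAR}", f"d2/{JAR}"],
        "case3": [f"root/d1/{JAR}", f"d2/{JAR}"],
    }
    for case, rel_paths in layouts.items():
        for rel in rel_paths:
            target = base / case / rel
            target.parent.mkdir(parents=True, exist_ok=True)
            target.write_bytes(payload)
    return layouts


with tempfile.TemporaryDirectory() as tmp:
    layouts = build_cases(tmp)
    for case, rel_paths in layouts.items():
        print(case, rel_paths)
```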