aboutcode-org / deltacode

DeltaCode: compare two codebase scans (from ScanCode) to detect significant changes.
http://www.aboutcode.org/

DeltaCode not handling copyright holders properly #127

Open JonoYang opened 5 years ago

JonoYang commented 5 years ago

I am doing a delta between a scan created by ScanCode version 2.9.7.post1.81f177e and another scan created by ScanCode version 3.0.2.post1270.ac0e1e184. I am encountering the following error:

Traceback (most recent call last):
  File "/home/jono/nexb/src/deltacode/bin/deltacode", line 11, in <module>
    load_entry_point('deltacode', 'console_scripts', 'deltacode')()
  File "/home/jono/nexb/src/deltacode/local/lib/python2.7/site-packages/click/core.py", line 722, in __call__
    return self.main(*args, **kwargs)
  File "/home/jono/nexb/src/deltacode/local/lib/python2.7/site-packages/click/core.py", line 697, in main
    rv = self.invoke(ctx)
  File "/home/jono/nexb/src/deltacode/local/lib/python2.7/site-packages/click/core.py", line 895, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/jono/nexb/src/deltacode/local/lib/python2.7/site-packages/click/core.py", line 535, in invoke
    return callback(*args, **kwargs)
  File "/home/jono/nexb/src/deltacode/src/deltacode/cli.py", line 90, in cli
    deltacode = DeltaCode(new, old, options)
  File "/home/jono/nexb/src/deltacode/src/deltacode/__init__.py", line 61, in __init__
    self.copyright_diff()
  File "/home/jono/nexb/src/deltacode/src/deltacode/__init__.py", line 227, in copyright_diff
    utils.update_from_copyright_info(delta)
  File "/home/jono/nexb/src/deltacode/src/deltacode/utils.py", line 127, in update_from_copyright_info
    update_modified_from_copyright_info(delta)
  File "/home/jono/nexb/src/deltacode/src/deltacode/utils.py", line 160, in update_modified_from_copyright_info
    new_holders = set(holder for copyright in new_copyrights for holder in copyright.holders)
  File "/home/jono/nexb/src/deltacode/src/deltacode/utils.py", line 160, in <genexpr>
    new_holders = set(holder for copyright in new_copyrights for holder in copyright.holders)
TypeError: 'NoneType' object is not iterable

JonoYang commented 5 years ago

Copyright holders are now at the same level as copyrights in ABCD instead of being contained within copyrights.
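To illustrate the layout change, here is a minimal sketch of a helper that tolerates both shapes. It is not DeltaCode's actual code: the function name `collect_holders` is hypothetical, and the dictionary keys (`copyrights`, `holders`, `value`) are assumptions about the two scan formats that should be checked against real scan files.

```python
def collect_holders(file_data):
    """Return the set of holder names for one scanned file.

    Hypothetical helper: the key names below are assumptions about
    the old (nested) and new (sibling) scan layouts.
    """
    holders = set()

    # Assumed older layout: holders nested inside each copyright entry.
    # "or []" guards against a missing key or an explicit None, which is
    # the situation that raises "'NoneType' object is not iterable".
    for entry in file_data.get("copyrights") or []:
        for holder in entry.get("holders") or []:
            holders.add(holder)

    # Assumed newer layout: holders listed beside copyrights,
    # either as plain strings or as mappings with a "value" key.
    for entry in file_data.get("holders") or []:
        value = entry.get("value") if isinstance(entry, dict) else entry
        if value:
            holders.add(value)

    return holders
```

With a helper like this, comparing an old-format file against a new-format file reduces to comparing two sets, regardless of where each scan stores its holders.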

arturrz commented 3 years ago

I'm currently facing this issue. When I test deltacode with: ./deltacode -n samples/samples.json -o samples/samples.json -j output.json I get no errors, but when I test with my two scancode_output_(new/old).json files I get the same error. Is there already a solution for this?

Also, what did you mean in your last comment?

Pratikrocks commented 3 years ago

Hi @arturrz, if possible, can you post the two scancode_output_(new/old).json files?

arturrz commented 3 years ago

Hi @Pratikrocks, sure.

scancode_unmod.txt

scancode_mod.txt

Pratikrocks commented 3 years ago

@arturrz it's failing only for ScanCode scans that have summary set.

Pratikrocks commented 3 years ago

@JonoYang @arturrz thanks for pointing out this bug. I fixed it and will include the fix in one of my PRs. :)

arturrz commented 3 years ago

No problem, thanks for the rapid response! @Pratikrocks

arturrz commented 3 years ago

Hi, I rescanned both source repositories using: ~/tools/scancode-toolkit-21.2.9/scancode -n 4 -lpcieu --json-pp scancode_mod.json <source_dir>/ and the issue persists. :(