aboutcode-org / scancode-toolkit

ScanCode detects licenses, copyrights, dependencies by "scanning code" ... to discover and inventory open source and third-party packages used in your code. Sponsored by NLnet project https://nlnet.nl/project/vulnerabilitydatabase, the Google Summer of Code, Azure credits, nexB and other generous sponsors!
https://aboutcode.org/scancode/

Have "All rights reserved." contained in json output #1542

Open yiting-wang33 opened 5 years ago

yiting-wang33 commented 5 years ago

Description

I'm using ScanCode to detect copyrights in a repo. My copyright statements follow this pattern:

(c) Copyright XXXX Ltd. 20* All rights reserved.

Here * stands for the year. Currently, in the JSON result, the copyright value is:

 "copyrights": [
        {
          "value": "(c) Copyright XXXX Ltd. 20*",  
          "start_line": 2,
          "end_line": 2
        }
      ],

I expected the copyright statement to end with "All rights reserved.", so it should look like:

 "copyrights": [
        {
          "value": "(c) Copyright XXXX Ltd. 20* All rights reserved.",  
          "start_line": 2,
          "end_line": 2
        }
      ],

I noticed #559, but what should I do to have that suffix picked up? Thank you!

pombredanne commented 5 years ago

@yiting-wang33 as I explained in https://github.com/nexB/scancode-toolkit/issues/559#issuecomment-286751659, we explicitly do not capture these because they are considered obsolete per https://en.wikipedia.org/wiki/All_rights_reserved#Obsolescence.

It would be possible to capture them, but that would require changes to the part-of-speech tagger and grammar we use to detect and parse copyright statements in https://github.com/nexB/scancode-toolkit/blob/develop/src/cluecode/copyrights.py

Is this something that you would like to tackle and contribute to the project?
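
To make the tagger-plus-grammar idea above more concrete, here is a toy sketch of how such a pipeline could optionally keep the suffix. It is not ScanCode's code: the tag names, the `LEXICON` table, the `tag` and `detect` helpers, the `include_allrights` flag name and the example year 2019 are all assumptions made for illustration.

```python
import re

# Toy lexicon: (regex, tag) pairs tried in order. The tags are illustrative
# only and are NOT the tag set used by ScanCode.
LEXICON = [
    (re.compile(r"^\(c\)$", re.I), "COPY"),
    (re.compile(r"^copyright$", re.I), "COPY"),
    (re.compile(r"^(19|20)\d\d[,.]?$"), "YEAR"),
    (re.compile(r"^all$", re.I), "ALL"),
    (re.compile(r"^rights?$", re.I), "RIGHT"),
    (re.compile(r"^reserved[.,]?$", re.I), "RESERVED"),
    (re.compile(r"^[A-Z]"), "NAME"),
]


def tag(text):
    """Return a list of (token, tag) pairs; unknown tokens get the tag JUNK."""
    tagged = []
    for token in text.split():
        for pattern, label in LEXICON:
            if pattern.match(token):
                tagged.append((token, label))
                break
        else:
            tagged.append((token, "JUNK"))
    return tagged


def detect(text, include_allrights=False):
    """Apply the toy grammar COPY+ (NAME|YEAR)+ [ALL RIGHT RESERVED].

    Returns the matched statement as a string, or None if nothing matches.
    The bracketed suffix is kept only when include_allrights is True.
    """
    tagged = tag(text)
    n = len(tagged)
    i = 0
    # Skip leading tokens until the first COPY marker.
    while i < n and tagged[i][1] != "COPY":
        i += 1
    start = i
    while i < n and tagged[i][1] == "COPY":
        i += 1
    if i == start:
        return None
    body_start = i
    while i < n and tagged[i][1] in ("NAME", "YEAR"):
        i += 1
    if i == body_start:
        return None
    end = i
    suffix_tags = [label for _, label in tagged[i:i + 3]]
    if include_allrights and suffix_tags == ["ALL", "RIGHT", "RESERVED"]:
        end = i + 3
    return " ".join(token for token, _ in tagged[start:end])


statement = "(c) Copyright XXXX Ltd. 2019 All rights reserved."
print(detect(statement))
print(detect(statement, include_allrights=True))
```

With the flag off, the suffix tokens are still tagged but fall outside the returned span; with it on, the grammar extends the span by three tokens. A real change would have to work within the existing tag set and grammar rules in copyrights.py.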

pombredanne commented 3 years ago

Note that at this stage we do collect "All rights reserved" suffixes, but we do not return them by default. Returning them is controlled by an optional argument of the copyright detection function. This is not surfaced in the CLI options though.
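
Since the option is not available from the command line, one possible workaround is to post-process the JSON output and re-attach the suffix when it directly follows the detected value on the same line. The sketch below is not part of ScanCode: the `append_allrights` helper, the `ALLRIGHTS` pattern and the sample year 2019 are hypothetical, and it assumes entries shaped like the "copyrights" list shown earlier in this thread. Because ScanCode may normalize the detected value, the raw line will not always contain it verbatim; in that case the entry is returned unchanged.

```python
import re

# Matches optional separators followed by the suffix at the start of a string.
ALLRIGHTS = re.compile(r"^[\s,.;:-]*(all\s+rights?\s+reserved\.?)", re.IGNORECASE)


def append_allrights(copyrights, lines):
    """Return copies of copyright entries with the suffix re-attached.

    copyrights: list of dicts with "value", "start_line" and "end_line" keys.
    lines: the scanned file's text as a list of lines (line 1 is lines[0]).
    """
    results = []
    for entry in copyrights:
        updated = dict(entry)
        line = lines[entry["end_line"] - 1]
        pos = line.find(entry["value"])
        if pos != -1:
            # Look only at the text right after the detected statement.
            rest = line[pos + len(entry["value"]):]
            match = ALLRIGHTS.match(rest)
            if match:
                updated["value"] = f'{entry["value"]} {match.group(1)}'
        results.append(updated)
    return results


lines = [
    "/*",
    " * (c) Copyright XXXX Ltd. 2019 All rights reserved.",
    " */",
]
detected = [{"value": "(c) Copyright XXXX Ltd. 2019", "start_line": 2, "end_line": 2}]
print(append_allrights(detected, lines))
```

The input entries are copied rather than modified, so the original scan results stay intact for comparison.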