spdx / tools-python

A Python library to parse, validate and create SPDX documents.
http://spdx.org
Apache License 2.0

Review and possibly update JSON and YAML file generation to match the SPDX 2.2 spec proposals #123

Closed goneall closed 1 year ago

goneall commented 5 years ago

Several changes to the JSON and YAML formats were discussed and generally agreed on for the SPDX 2.2 spec.

There is a PR with changes to the example file: https://github.com/spdx/spdx-spec/pull/149. The PR documents the related issues that were resolved (sorry for the extra clicks needed to find all the documentation).

The Python libraries may need to be updated to match the 2.2 spec.

Yash-Varshney commented 4 years ago

Hi @goneall. I am interested in updating spdx-python-tools to support SPDX v2.2. Can you describe the problem in a bit more detail, and say what still needs to be updated? Thanks for your time.

goneall commented 4 years ago

One suggested approach is to convert the tag/value example file to JSON using the Python libraries and compare the result to the JSON example file. Analyzing the differences should provide a guide as to what needs to be updated.
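The comparison step can be sketched with the standard library alone. This is only an illustration under assumptions: `diff_json` is a hypothetical helper, not part of tools-python or spdx-testbed, and it compares lists strictly by position, which is cruder than what a dedicated comparison tool would do.

```python
from typing import Any, Iterator


def diff_json(actual: Any, expected: Any, path: str = "") -> Iterator[str]:
    """Yield one readable line per difference between two parsed JSON values.

    Hypothetical helper for illustration; lists are compared index by index.
    """
    if isinstance(actual, dict) and isinstance(expected, dict):
        # Walk the union of keys so that missing keys on either side are reported.
        for key in sorted(set(actual) | set(expected)):
            child = f"{path}/{key}"
            if key not in actual:
                yield f"{child}: missing in actual"
            elif key not in expected:
                yield f"{child}: missing in expected"
            else:
                yield from diff_json(actual[key], expected[key], child)
    elif isinstance(actual, list) and isinstance(expected, list):
        for index in range(max(len(actual), len(expected))):
            child = f"{path}/{index}"
            if index >= len(actual):
                yield f"{child}: missing in actual"
            elif index >= len(expected):
                yield f"{child}: missing in expected"
            else:
                yield from diff_json(actual[index], expected[index], child)
    elif actual != expected:
        # Scalars, or values of different types.
        yield f"{path}: actual={actual!r} expected={expected!r}"
```

To use it, load the converted file and the example file from the spec with `json.load` and iterate over `diff_json(converted, reference)`; each yielded line points at one place where the two documents disagree.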

nicoweidner commented 1 year ago

I used the 2.2 tag/value example from spdx-spec and converted it to JSON using the Python tools. The converted file is attached.

Using the JSON-based comparison tooling from spdx-testbed, the following differences were detected:

[ListDifference(actualValue="Organization: ExampleCodeInspect", expectedValue=null, path=/creationInfo/creators/1, pathInReferenceDoc=/creationInfo/creators, comment=No element in expected list with a matching Spdx id or no Spdx id present.),
    ListDifference(actualValue="Person: Jane Doe", expectedValue=null, path=/creationInfo/creators/2, pathInReferenceDoc=/creationInfo/creators, comment=No element in expected list with a matching Spdx id or no Spdx id present.),
    ListDifference(actualValue=null, expectedValue="Organization: ExampleCodeInspect ()", path=/creationInfo/creators, pathInReferenceDoc=/creationInfo/creators/1, comment=No element in actual list with a matching Spdx id or no Spdx id present.),
    ListDifference(actualValue=null, expectedValue="Person: Jane Doe ()", path=/creationInfo/creators, pathInReferenceDoc=/creationInfo/creators/2, comment=No element in actual list with a matching Spdx id or no Spdx id present.),
    ListDifference(actualValue={"licenseId":"LicenseRef-1","extractedText":"/*\n * (c) Copyright 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009 Hewlett-Packard Development Company, LP\n * All rights reserved.\n *\n * Redistribution and use in source and binary forms, with or without\n * modification, are permitted provided that the following conditions\n * are met:\n * 1. Redistributions of source code must retain the above copyright\n *    notice, this list of conditions and the following disclaimer.\n * 2. Redistributions in binary form must reproduce the above copyright\n *    notice, this list of conditions and the following disclaimer in the\n *    documentation and/or other materials provided with the distribution.\n * 3. The name of the author may not be used to endorse or promote products\n *    derived from this software without specific prior written permission.\n *\n * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR\n * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES\n * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.\n * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,\n * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT\n * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,\n * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY\n * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT\n * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF\n * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.\n*/","name":"LicenseRef-1"}, expectedValue=null, path=/hasExtractedLicensingInfos/0, pathInReferenceDoc=/hasExtractedLicensingInfos, comment=No element in expected list with a matching Spdx id or no Spdx id present.),
    ListDifference(actualValue={"licenseId":"LicenseRef-2","extractedText":"This package includes the GRDDL parser developed by Hewlett Packard under the following license:\n� Copyright 2007 Hewlett-Packard Development Company, LP\n\nRedistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: \n\nRedistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. \nRedistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. \nThe name of the author may not be used to endorse or promote products derived from this software without specific prior written permission. \nTHIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.","name":"LicenseRef-2"}, expectedValue=null, path=/hasExtractedLicensingInfos/1, pathInReferenceDoc=/hasExtractedLicensingInfos, comment=No element in expected list with a matching Spdx id or no Spdx id present.),
    ListDifference(actualValue={"licenseId":"LicenseRef-4","extractedText":"/*\n * (c) Copyright 2009 University of Bristol\n * All rights reserved.\n *\n * Redistribution and use in source and binary forms, with or without\n * modification, are permitted provided that the following conditions\n * are met:\n * 1. Redistributions of source code must retain the above copyright\n *    notice, this list of conditions and the following disclaimer.\n * 2. Redistributions in binary form must reproduce the above copyright\n *    notice, this list of conditions and the following disclaimer in the\n *    documentation and/or other materials provided with the distribution.\n * 3. The name of the author may not be used to endorse or promote products\n *    derived from this software without specific prior written permission.\n *\n * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR\n * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES\n * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.\n * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,\n * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT\n * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,\n * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY\n * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT\n * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF\n * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.\n*/","name":"LicenseRef-4"}, expectedValue=null, path=/hasExtractedLicensingInfos/2, pathInReferenceDoc=/hasExtractedLicensingInfos, comment=No element in expected list with a matching Spdx id or no Spdx id present.),
    ListDifference(actualValue={"licenseId":"LicenseRef-Beerware-4.2","comment":"The beerware license has a couple of other standard variants.","extractedText":"\"THE BEER-WARE LICENSE\" (Revision 42):\nphk@FreeBSD.ORG wrote this file. As long as you retain this notice you\ncan do whatever you want with this stuff. If we meet some day, and you think this stuff is worth it, you can buy me a beer in return Poul-Henning Kamp","name":"Beer-Ware License (Version 42)","seeAlss":["http://people.freebsd.org/~phk/"]}, expectedValue=null, path=/hasExtractedLicensingInfos/3, pathInReferenceDoc=/hasExtractedLicensingInfos, comment=No element in expected list with a matching Spdx id or no Spdx id present.),
    ListDifference(actualValue={"licenseId":"LicenseRef-3","comment":"This is tye CyperNeko License","extractedText":"The CyberNeko Software License, Version 1.0\n\n \n(C) Copyright 2002-2005, Andy Clark.  All rights reserved.\n \nRedistribution and use in source and binary forms, with or without\nmodification, are permitted provided that the following conditions\nare met:\n\n1. Redistributions of source code must retain the above copyright\n   notice, this list of conditions and the following disclaimer. \n\n2. Redistributions in binary form must reproduce the above copyright\n   notice, this list of conditions and the following disclaimer in\n   the documentation and/or other materials provided with the\n   distribution.\n\n3. The end-user documentation included with the redistribution,\n   if any, must include the following acknowledgment:  \n     \"This product includes software developed by Andy Clark.\"\n   Alternately, this acknowledgment may appear in the software itself,\n   if and wherever such third-party acknowledgments normally appear.\n\n4. The names \"CyberNeko\" and \"NekoHTML\" must not be used to endorse\n   or promote products derived from this software without prior \n   written permission. For written permission, please contact \n   andyc@cyberneko.net.\n\n5. Products derived from this software may not be called \"CyberNeko\",\n   nor may \"CyberNeko\" appear in their name, without prior written\n   permission of the author.\n\nTHIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED\nWARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES\nOF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE\nDISCLAIMED.  
IN NO EVENT SHALL THE AUTHOR OR OTHER CONTRIBUTORS\nBE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, \nOR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT \nOF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR \nBUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, \nWHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE \nOR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, \nEVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.","name":"CyberNeko License","seeAlss":["http://people.apache.org/~andyc/neko/LICENSE, http://justasample.url.com"]}, expectedValue=null, path=/hasExtractedLicensingInfos/4, pathInReferenceDoc=/hasExtractedLicensingInfos, comment=No element in expected list with a matching Spdx id or no Spdx id present.),
    ListDifference(actualValue=null, expectedValue={"licenseId":"LicenseRef-1","extractedText":"/*\n * (c) Copyright 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009 Hewlett-Packard Development Company, LP\n * All rights reserved.\n *\n * Redistribution and use in source and binary forms, with or without\n * modification, are permitted provided that the following conditions\n * are met:\n * 1. Redistributions of source code must retain the above copyright\n *    notice, this list of conditions and the following disclaimer.\n * 2. Redistributions in binary form must reproduce the above copyright\n *    notice, this list of conditions and the following disclaimer in the\n *    documentation and/or other materials provided with the distribution.\n * 3. The name of the author may not be used to endorse or promote products\n *    derived from this software without specific prior written permission.\n *\n * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR\n * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES\n * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.\n * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,\n * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT\n * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,\n * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY\n * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT\n * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF\n * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.\n*/"}, path=/hasExtractedLicensingInfos, pathInReferenceDoc=/hasExtractedLicensingInfos/0, comment=No element in actual list with a matching Spdx id or no Spdx id present.),
    ListDifference(actualValue=null, expectedValue={"licenseId":"LicenseRef-2","extractedText":"This package includes the GRDDL parser developed by Hewlett Packard under the following license:\n� Copyright 2007 Hewlett-Packard Development Company, LP\n\nRedistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: \n\nRedistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. \nRedistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. \nThe name of the author may not be used to endorse or promote products derived from this software without specific prior written permission. \nTHIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE."}, path=/hasExtractedLicensingInfos, pathInReferenceDoc=/hasExtractedLicensingInfos/1, comment=No element in actual list with a matching Spdx id or no Spdx id present.),
    ListDifference(actualValue=null, expectedValue={"licenseId":"LicenseRef-4","extractedText":"/*\n * (c) Copyright 2009 University of Bristol\n * All rights reserved.\n *\n * Redistribution and use in source and binary forms, with or without\n * modification, are permitted provided that the following conditions\n * are met:\n * 1. Redistributions of source code must retain the above copyright\n *    notice, this list of conditions and the following disclaimer.\n * 2. Redistributions in binary form must reproduce the above copyright\n *    notice, this list of conditions and the following disclaimer in the\n *    documentation and/or other materials provided with the distribution.\n * 3. The name of the author may not be used to endorse or promote products\n *    derived from this software without specific prior written permission.\n *\n * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR\n * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES\n * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.\n * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,\n * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT\n * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,\n * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY\n * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT\n * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF\n * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.\n*/"}, path=/hasExtractedLicensingInfos, pathInReferenceDoc=/hasExtractedLicensingInfos/2, comment=No element in actual list with a matching Spdx id or no Spdx id present.),
    ListDifference(actualValue=null, expectedValue={"licenseId":"LicenseRef-Beerware-4.2","comment":"The beerware license has a couple of other standard variants.","extractedText":"\"THE BEER-WARE LICENSE\" (Revision 42):\nphk@FreeBSD.ORG wrote this file. As long as you retain this notice you\ncan do whatever you want with this stuff. If we meet some day, and you think this stuff is worth it, you can buy me a beer in return Poul-Henning Kamp","name":"Beer-Ware License (Version 42)","seeAlsos":["http://people.freebsd.org/~phk/"]}, path=/hasExtractedLicensingInfos, pathInReferenceDoc=/hasExtractedLicensingInfos/3, comment=No element in actual list with a matching Spdx id or no Spdx id present.),
    ListDifference(actualValue=null, expectedValue={"licenseId":"LicenseRef-3","comment":"This is tye CyperNeko License","extractedText":"The CyberNeko Software License, Version 1.0\n\n \n(C) Copyright 2002-2005, Andy Clark.  All rights reserved.\n \nRedistribution and use in source and binary forms, with or without\nmodification, are permitted provided that the following conditions\nare met:\n\n1. Redistributions of source code must retain the above copyright\n   notice, this list of conditions and the following disclaimer. \n\n2. Redistributions in binary form must reproduce the above copyright\n   notice, this list of conditions and the following disclaimer in\n   the documentation and/or other materials provided with the\n   distribution.\n\n3. The end-user documentation included with the redistribution,\n   if any, must include the following acknowledgment:  \n     \"This product includes software developed by Andy Clark.\"\n   Alternately, this acknowledgment may appear in the software itself,\n   if and wherever such third-party acknowledgments normally appear.\n\n4. The names \"CyberNeko\" and \"NekoHTML\" must not be used to endorse\n   or promote products derived from this software without prior \n   written permission. For written permission, please contact \n   andyc@cyberneko.net.\n\n5. Products derived from this software may not be called \"CyberNeko\",\n   nor may \"CyberNeko\" appear in their name, without prior written\n   permission of the author.\n\nTHIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED\nWARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES\nOF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE\nDISCLAIMED.  
IN NO EVENT SHALL THE AUTHOR OR OTHER CONTRIBUTORS\nBE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, \nOR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT \nOF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR \nBUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, \nWHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE \nOR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, \nEVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.","name":"CyberNeko License","seeAlsos":["http://people.apache.org/~andyc/neko/LICENSE","http://justasample.url.com"]}, path=/hasExtractedLicensingInfos, pathInReferenceDoc=/hasExtractedLicensingInfos/4, comment=No element in actual list with a matching Spdx id or no Spdx id present.),
    ListDifference(actualValue={"annotationDate":"2010-01-29T18:30:22Z","annotationType":"OTHER","annotator":"Person: Jane Doe","comment":"Document level annotation"}, expectedValue=null, path=/annotations/0, pathInReferenceDoc=/annotations, comment=No element in expected list with a matching Spdx id or no Spdx id present.),
    ListDifference(actualValue=null, expectedValue={"annotationDate":"2010-01-29T18:30:22Z","annotationType":"OTHER","annotator":"Person: Jane Doe ()","comment":"Document level annotation"}, path=/annotations, pathInReferenceDoc=/annotations/0, comment=No element in actual list with a matching Spdx id or no Spdx id present.),
    ListDifference(actualValue=null, expectedValue=true, path=/packages/0/filesAnalyzed, pathInReferenceDoc=/packages/0/filesAnalyzed, comment=null)]

Using the comparison tooling from spdx-java-library, I have not yet found a way to list all the detected differences nicely, so I will list them manually:

Since this post is already quite long, I will do an analysis of the detected differences in a separate post.

converted.zip

nicoweidner commented 1 year ago

Analysis of the differences detected by the json comparison:

One comment beforehand: at the moment, the comparison works by trying to find an exact match and, failing that, trying to match by SPDX ID if one is present. If neither method works, the element is not matched with anything, which leads to two differences, each with the respective "other" value set to null.
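A minimal sketch of that matching strategy, to show why one changed element shows up as two entries in the list above. This is not the spdx-testbed implementation: `match_list_elements` is a hypothetical name, and the sketch assumes the ID is stored under the `SPDXID` key used by the SPDX JSON format.

```python
from typing import Any, List, Optional, Tuple

# (actual element, expected element); None marks "no partner found".
Pair = Tuple[Optional[Any], Optional[Any]]


def match_list_elements(actual: List[Any], expected: List[Any]) -> List[Pair]:
    """Pair up list elements by exact match, then by SPDXID (illustrative only)."""
    remaining = list(expected)
    pairs: List[Pair] = []
    for item in actual:
        # 1. Look for an exact match among the unmatched expected elements.
        index = next((i for i, cand in enumerate(remaining) if cand == item), None)
        # 2. Otherwise fall back to matching by SPDXID, if the element has one.
        if index is None and isinstance(item, dict) and "SPDXID" in item:
            index = next(
                (
                    i
                    for i, cand in enumerate(remaining)
                    if isinstance(cand, dict) and cand.get("SPDXID") == item["SPDXID"]
                ),
                None,
            )
        if index is None:
            pairs.append((item, None))  # reported with expectedValue=null
        else:
            pairs.append((item, remaining.pop(index)))
    # Expected elements nobody claimed are reported with actualValue=null.
    pairs.extend((None, leftover) for leftover in remaining)
    return pairs
```

Under this scheme, plain strings such as the creators, and objects without an `SPDXID` key, can only be paired by exact equality, so any change to them produces one entry for the actual side and one for the expected side.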

goneall commented 1 year ago

@nicoweidner Based on your analysis, it looks like all Python library related issues have been resolved, and none of the differences listed above appear to be deviations from the spec.

If you agree, I think we can close this issue.

nicoweidner commented 1 year ago

Sorry, got interrupted :sweat_smile:. The only thing that may remain questionable in the above list is that the Python tools set the license name to the ID if no name is provided and the license is not a known one from the list. I am not sure where that part of the logic comes from, and I can't find anything in the spec relating to it. I will create an issue and link it above.
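To make the behavior concrete, here is a rough sketch of the fallback as described. It is not the actual tools-python code: `resolve_license_name` and `KNOWN_LICENSE_NAMES` are hypothetical stand-ins, and the one-entry table merely represents the SPDX license list.

```python
from typing import Optional

# Hypothetical stand-in for the SPDX license list (ID -> full name).
KNOWN_LICENSE_NAMES = {"Apache-2.0": "Apache License 2.0"}


def resolve_license_name(license_id: str, name: Optional[str] = None) -> str:
    """Return the name that ends up in the output for a license (illustrative)."""
    if name:
        # An explicitly provided name is kept as is.
        return name
    # No name given: use the list name if the ID is known, else fall back to the ID.
    return KNOWN_LICENSE_NAMES.get(license_id, license_id)
```

This matches what the comparison shows for `LicenseRef-1`: the converted file carries `"name":"LicenseRef-1"`, while the reference file has no `name` field for that entry at all.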

I will also post a quick analysis of the differences detected by the Java library, but I think they are pretty much the same (apart from one case where I don't understand what the difference is supposed to be). Then I will close this.

nicoweidner commented 1 year ago

Analysis of the differences detected by the Java library:

In conclusion, I would say all remaining differences are quite minor, and several of them are not even caused by the Python tools behaving incorrectly. As there are already issues to track everything, I'll close this one.