spdx / tools

SPDX Tools
Apache License 2.0

Formatting of plain license text in JSON data is broken #162

Closed sschuberth closed 6 years ago

sschuberth commented 6 years ago

Taking Apache-2.0 as an example: when extracting the licenseText string to a file, I'd expect that file to be formatted exactly like the original plain-text license, including leading spaces and blank lines. However, the JSON string is formatted like this:

Apache License

Version 2.0, January 2004

http://www.apache.org/licenses/ TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION

   1. Definitions.

      "License" shall mean the terms and conditions for use, reproduction, and distribution as defined by Sections 1 through 9 of this document.

      "Licensor" shall mean the copyright owner or entity authorized by the copyright owner that is granting the License.

(Note the missing leading spaces and the added trailing spaces.) This not only fails to match the original text but is also quite ugly.
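For anyone wanting to reproduce this, here is a minimal Python sketch of the extraction step. It assumes only that the license JSON document carries a `licenseText` field, as described above. The helper names `extract_license_text`, `write_license_text`, and `lines_with_trailing_whitespace` are made up for illustration and are not part of the SPDX tools.

```python
import json
from pathlib import Path


def extract_license_text(license_json: str) -> str:
    """Return the licenseText string from a license JSON document."""
    return json.loads(license_json)["licenseText"]


def write_license_text(license_json: str, target: Path) -> None:
    """Write the licenseText string to a file without altering it."""
    # newline="" stops Python from translating "\n" on write,
    # so the file holds exactly what the JSON string contains.
    with target.open("w", encoding="utf-8", newline="") as handle:
        handle.write(extract_license_text(license_json))


def lines_with_trailing_whitespace(text: str) -> list[int]:
    """Return the 1-based numbers of lines ending in a space or tab."""
    return [
        number
        for number, line in enumerate(text.split("\n"), start=1)
        if line != line.rstrip(" \t")
    ]
```

Running `lines_with_trailing_whitespace` on the extracted string is one quick way to confirm the added trailing spaces; comparing the written file against the original license text with a diff tool would show the missing leading spaces.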

goneall commented 6 years ago

Given how we maintain the license information in the license-list-XML GitHub repository, it is not feasible to retain the formatting of the original text: XML processing removes the white space, and we do not have enough tags to retain all of the formatting.

That being said, we could do a better job of formatting the text and making it look prettier.
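As a rough idea of what "prettier" could mean, here is a small Python sketch that strips trailing whitespace and collapses runs of blank lines while leaving leading indentation alone. This is only an illustration under those assumptions, not the code used by the SPDX tooling, and `tidy_license_text` is a hypothetical name.

```python
import re


def tidy_license_text(text: str) -> str:
    """Strip trailing whitespace and collapse repeated blank lines."""
    # Remove trailing spaces and tabs from every line; leading
    # indentation is left untouched.
    lines = [line.rstrip(" \t") for line in text.splitlines()]
    # Collapse runs of two or more blank lines into a single blank line.
    tidied = re.sub(r"\n{3,}", "\n\n", "\n".join(lines))
    # End the text with exactly one newline.
    return tidied.strip("\n") + "\n"
```

This cannot restore indentation that was already lost when the text was stored as XML; it only cleans up what is left.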

The code for this has actually moved to a different project: LicenseListPublisher.

goneall commented 6 years ago

Moving issue to LicenseListPublisher: https://github.com/spdx/license-list-XML/issues/1924