tomchavakis / nuget-license

.NET Core tool to print or save all the licenses of a project
Apache License 2.0

Option to include full license texts in licenses.txt output #81

Open dennisverheijen opened 3 years ago

dennisverheijen commented 3 years ago

A feature request:

For our projects, the desired output is a single "licenses.txt"-style file that contains the actual text of the license for each library. For each library, the name and version are also needed; other properties such as project URL and Description are not mandatory for our use case.

We could use the --export-license-texts option, then concatenate all the exported files and prefix each one with its filename. However, most of the license files downloaded from NuGet contain HTML markup, for example:

<!DOCTYPE html>

<html lang="en">
<head>
    <link rel="stylesheet" href="/Content/Site.css" />
        <title>&#39;BSD-3-Clause&#39; reference</title>
</head>
<body>
    <div id="main-content">

<h2>SPDX identifier</h2>
<p>BSD-3-Clause</p>

<h2>License text</h2>
<pre>_____

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

   1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.

   2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.

   3. _____ be used to endorse or promote products derived from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY _____ &quot;AS IS&quot; AND ANY _____ OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL _____ BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 </pre>

    </div>
</body>
</html>

This markup looks a bit odd in a .txt file, so ideally it would just use the actual text content of this HTML file.
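As a rough illustration of "just use the actual text content", here is a minimal Python sketch that is independent of the tool itself. It relies only on the standard library's `html.parser`; the names `html_to_text` and `_TextExtractor` are made up for this example, and skipping everything inside `<head>`, `<script>` and `<style>` is an assumption about what counts as non-content.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects text nodes, ignoring anything inside <head>, <script>, <style>."""

    SKIPPED = {"head", "script", "style"}  # assumption: never license content

    def __init__(self):
        # convert_charrefs=True decodes entities such as &quot; and &#39;
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self._chunks.append(data)

    def text(self):
        return "".join(self._chunks)


def html_to_text(html):
    """Return the visible text of an HTML document as plain text."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()  # flushes any text still buffered by the parser
    lines = [line.rstrip() for line in parser.text().splitlines()]
    # Collapse runs of blank lines into a single blank line.
    out = []
    for line in lines:
        if line or (out and out[-1]):
            out.append(line)
    return "\n".join(out).strip()
```

Applied to a page like the one above, this would keep the "SPDX identifier" and "License text" headings together with the license body, and drop the `<title>` and stylesheet link.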

Is there any way this can already be achieved with the current version of your tool?
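For reference, the concatenation workaround mentioned above could look roughly like the following Python sketch. It assumes the exported license files all sit in one directory; the function name `combine_license_files` and the header layout are hypothetical, and the file names are simply whatever the export step wrote.

```python
from pathlib import Path


def combine_license_files(export_dir, output_file):
    """Concatenate every file in export_dir into output_file.

    Each file's content is preceded by its filename as a header.
    Returns the number of files that were combined.
    """
    export_dir = Path(export_dir)
    output_file = Path(output_file)
    sections = []
    for path in sorted(export_dir.iterdir()):
        # Skip sub-directories and the output file itself, if it lives here.
        if not path.is_file() or path.resolve() == output_file.resolve():
            continue
        body = path.read_text(encoding="utf-8", errors="replace").strip()
        header = path.name
        sections.append(f"{header}\n{'=' * len(header)}\n\n{body}\n")
    output_file.write_text("\n".join(sections), encoding="utf-8")
    return len(sections)
```

This copies the file contents as they are, so any HTML markup in the exported files ends up in the combined file too.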

danielo-unity3d commented 3 years ago

We have the exact same problem

tomchavakis commented 3 years ago

Hi @danielo-unity3d and @dennisverheijen,

Thank you for this feature request. Unfortunately, this is not supported yet. What would you think of a new flag that removes the HTML tags?

danielo-unity3d commented 3 years ago

I think that (even if we don't want everything in one file) if we are writing a txt file, then the HTML tags should automatically be stripped (without having to flip a switch).

tomchavakis commented 3 years ago

I agree with you, it's a nice feature. I will try to implement it asap. If you would like to contribute to the project, you are more than welcome :)

Lexy2 commented 2 years ago

Just stripping HTML tags is not a trivial feature, alas. In some cases the license text also gets stripped. In many cases the page headers are left behind. For example, see https://opensource.org/licenses/mit-license.php . Which variant would work best here?
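One possible variant, sketched in Python purely as an illustration and not as the tool's behaviour: prefer the text inside `<pre>` elements when the page has any (as in the BSD-3-Clause page quoted earlier), and otherwise fall back to the visible text outside obvious page chrome. The function name `extract_license_text` and the list of ignored elements are assumptions made for this example.

```python
from html.parser import HTMLParser


class _LicenseTextParser(HTMLParser):
    """Gathers text inside <pre> separately from the rest of the visible text."""

    # Assumption: these elements never hold the license body.
    IGNORED = {"head", "script", "style", "nav", "header", "footer"}

    def __init__(self):
        super().__init__(convert_charrefs=True)  # decodes &quot; etc.
        self._ignored_depth = 0
        self._pre_depth = 0
        self.pre_chunks = []
        self.all_chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.IGNORED:
            self._ignored_depth += 1
        elif tag == "pre":
            self._pre_depth += 1

    def handle_endtag(self, tag):
        if tag in self.IGNORED and self._ignored_depth > 0:
            self._ignored_depth -= 1
        elif tag == "pre" and self._pre_depth > 0:
            self._pre_depth -= 1

    def handle_data(self, data):
        if self._ignored_depth:
            return
        self.all_chunks.append(data)
        if self._pre_depth:
            self.pre_chunks.append(data)


def extract_license_text(html):
    """Return <pre> text if present, else all visible text outside page chrome."""
    parser = _LicenseTextParser()
    parser.feed(html)
    parser.close()  # flushes any text still buffered by the parser
    pre_text = "".join(parser.pre_chunks).strip()
    if pre_text:
        return pre_text
    return "".join(parser.all_chunks).strip()
```

This is only a heuristic. A page that uses neither `<pre>` nor semantic elements such as `<header>` and `<nav>` would still come through with its page headers, so some per-site handling would probably remain necessary.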