Unidata / netcdf-c

Official GitHub repository for netCDF-C libraries and utilities.
BSD 3-Clause "New" or "Revised" License
License problem: ConvertUTF is non-free, use libicu instead #349

Closed sebastic closed 7 years ago

sebastic commented 7 years ago

The lintian QA tool reported a license problem with the ConvertUTF.{c,h} files included in ncgen (license-problem-convert-utf-code):

The following file source files include material under a non-free license from Unicode Inc. Therefore, it is not possible to ship this in main or contrib.

This license does not grant any permission to modify the files (thus failing DFSG#3). Moreover, the license grant seems to attempt to restrict use to "products supporting the Unicode Standard" (thus failing DFSG#6).

In this case a solution is to use libicu and to remove this code by repacking.

If this is a false-positive, please report a bug against Lintian.

Refer to https://bugs.debian.org/823100 for details.

Quoting the mentioned Debian Free Software Guidelines (DFSG) paragraphs:

3. Derived Works

The license must allow modifications and derived works, and must allow them to be distributed under the same terms as the license of the original software.

6. No Discrimination Against Fields of Endeavor

The license must not restrict anyone from making use of the program in a specific field of endeavor. For example, it may not restrict the program from being used in a business, or from being used for genetic research.

Please remove the problematic ConvertUTF.{c,h} files and use libicu instead.

DennisHeimbigner commented 7 years ago

Not sure I see this as a problem; AFAIK we have not modified it, and since it was included to support utf8 in netcdf-3, it meets that criterion. Is the issue transitivity? That is, that a program using the netcdf-c library only indirectly supports utf8 by using netcdf-c? Please elaborate on your concerns. In any case, I will look at libicu.

DennisHeimbigner commented 7 years ago

Ok, so after a very quick look, the problem with libicu is that it is serious overkill for our purposes and is way too general. We need something with a very small footprint. It appears to me that I will have to do major surgery on the source code to extract just the parts I need. So this switch will take a while; it will not happen any time soon.

WardF commented 7 years ago

I agree libicu is overkill. On Monday I'll take a closer look at the convertutf license and see if there are other alternatives; I'll also contribute to the conversation regarding the potential problem for NetCDF that it may pose.

sebastic commented 7 years ago

The problem with the ConvertUTF code is that its license is incompatible with the license of NetCDF. The NetCDF license explicitly allows modification, which the ConvertUTF license does not.

The ghostscript bugreport linked from the Debian bugreport has more information:

According to http://unicode.org/forum/viewtopic.php?f=9&t=90 (summarized at http://stackoverflow.com/questions/2685004/why-does-unicode-org-no-longer-offer-a-reference-utf-8-16-32-converter), ConvertUTF is obsolete and buggy.

According to the discussion at https://lists.debian.org/debian-legal/2006/01/msg00534.html, Richard Stallman and the Unicode consortium have both acknowledged compatibility issues with the licensing of the code. The issues have been solved for later code releases issued by the Unicode consortium, but according to https://web.archive.org/web/20081228105917/http://www.unicode.org/Public/PROGRAMS/CVTUTF/ there has been no newer release of ConvertUTF since 2004.

Because NetCDF does not comply with the DFSG due to the inclusion of the ConvertUTF files, which don't allow modification, NetCDF and all its reverse dependencies will need to be removed from Debian & Ubuntu if this issue is not resolved, which would be a great disservice to our users.

DennisHeimbigner commented 7 years ago

I found an alternative that claims to be under the MIT license. I have attached (below) the actual LICENSE file; does it look acceptable?

Copyright (C) 2014-2016 Quinten Lansu

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

sebastic commented 7 years ago

Yes, the MIT-licensed alternative would be a good replacement (license-wise), since both it and the NetCDF license explicitly allow modification and don't contain terms contrary to the other license.

DennisHeimbigner commented 7 years ago

I have just discovered two things.

  1. At some point, the utf8proc license was modified to allow modification.
  2. Continued development of utf8proc was taken over by the Julia Language developers. My reference is this page: https://github.com/JuliaLang/utf8proc/blob/master/LICENSE.md

I will immediately shift to using this version of utf8proc. Please examine the license on the above referenced web page and let me know if it is satisfactory.

sebastic commented 7 years ago

Unfortunately the Unicode data license is non-free due to the advertising clause (like BSD-4-Clause).

DennisHeimbigner commented 7 years ago

Interesting. You are aware, I presume, that libicu also has this same restriction. Hence we cannot use that either. In fact, my guess is that all utf software suffers from this same problem.

sebastic commented 7 years ago

I was not aware that icu used the same license terms. Since the icu license terms were apparently deemed acceptable for Debian main by the FTP masters (although that's no precedent), it's probably fine to adopt the utf8proc from Julia. If they reject the netcdf upload due to those license terms, I'll raise that issue then.

DennisHeimbigner commented 7 years ago

ok

WardF commented 7 years ago

This issue is resolved, closing.

sebastic commented 7 years ago

ncgen/ConvertUTF.c & ncgen/ConvertUTF.h are still included in 4.5.0-rc1, please re-open this issue and remove/replace those files.

WardF commented 7 years ago

@DennisHeimbigner Can the solution you provided for libdispatch/ in #364 also be applied in ncgen/?

DennisHeimbigner commented 7 years ago

I did not remember that this code was being used in ncgen. I will take responsibility for it. Also, this is odd: does it mean we are still including the old code?

WardF commented 7 years ago

The old code (ConvertUTF.{c,h}) is currently only in ncgen; it was removed from libdispatch and the new code was put in place. I looked at libicu and I'm glad you found this solution, as libicu is not practical for our purposes; it is too large, too difficult to deploy, and is an unnecessary dependency.

If you have yet to create a branch to work from, would you ~fork~ branch from v4.5.0-release-branch? If it's too late, no worries, I will make the necessary merges.

DennisHeimbigner commented 7 years ago

Ok, I will fork the release branch. This is going to be harder than I thought. The old convert code was used only to convert utf8 to utf16 for java. The new code apparently has no utf16 support. Since I sincerely doubt that the cdl->java code is being used, I may take the easy way out.

WardF commented 7 years ago

To make sure I understand: it was only used to convert utf8 to utf16 when having ncgen generate Java code? If this is the case, I'd be loath to rip it out completely, as that is very useful; maybe just leave the hooks in and commented out or something. I dug into this a bit and it wouldn't be impossible to write our own converter if need be. But having this functionality removed for the next release candidate wouldn't be a problem, and it would give people a chance to speak up if they need or rely on this.
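
For reference, a hand-written utf8 to utf16 converter of the kind mentioned above could look roughly like the following. This is only a sketch under stated assumptions: `utf8_decode` and `utf8_to_utf16` are hypothetical names invented for illustration, not functions from netcdf-c, ncgen, or utf8proc, and the code has not been checked against how ncgen actually emits Java strings.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of netcdf-c or utf8proc).
 * Decodes one UTF-8 sequence from s (at most n bytes) into *cp.
 * Returns the number of bytes consumed, or 0 on malformed input. */
size_t utf8_decode(const unsigned char *s, size_t n, uint32_t *cp)
{
    size_t len, i;
    uint32_t c, min;

    if (n == 0) return 0;
    if (s[0] < 0x80)                { *cp = s[0]; return 1; }
    else if ((s[0] & 0xE0) == 0xC0) { len = 2; c = s[0] & 0x1F; min = 0x80; }
    else if ((s[0] & 0xF0) == 0xE0) { len = 3; c = s[0] & 0x0F; min = 0x800; }
    else if ((s[0] & 0xF8) == 0xF0) { len = 4; c = s[0] & 0x07; min = 0x10000; }
    else return 0;                  /* stray continuation or invalid lead byte */

    if (n < len) return 0;          /* truncated sequence */
    for (i = 1; i < len; i++) {
        if ((s[i] & 0xC0) != 0x80) return 0;   /* not a continuation byte */
        c = (c << 6) | (uint32_t)(s[i] & 0x3F);
    }
    /* Reject overlong forms, surrogate values, and anything past U+10FFFF. */
    if (c < min || c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF)) return 0;
    *cp = c;
    return len;
}

/* Hypothetical helper: converts a UTF-8 buffer to UTF-16 code units.
 * Returns the number of units written, or -1 if the input is malformed
 * or the output buffer (outcap units) is too small. */
long utf8_to_utf16(const unsigned char *in, size_t inlen,
                   uint16_t *out, size_t outcap)
{
    size_t i = 0, o = 0;

    while (i < inlen) {
        uint32_t cp;
        size_t used = utf8_decode(in + i, inlen - i, &cp);
        if (used == 0) return -1;
        i += used;
        if (cp < 0x10000) {
            /* Basic Multilingual Plane: one code unit. */
            if (o + 1 > outcap) return -1;
            out[o++] = (uint16_t)cp;
        } else {
            /* Supplementary planes: encode as a surrogate pair. */
            if (o + 2 > outcap) return -1;
            cp -= 0x10000;
            out[o++] = (uint16_t)(0xD800 | (cp >> 10));
            out[o++] = (uint16_t)(0xDC00 | (cp & 0x3FF));
        }
    }
    return (long)o;
}
```

The point of the sketch is that the surrogate-pair step is only a few lines, so full utf16 output does not necessarily need a large library.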

WardF commented 7 years ago

Also, thanks for forking that branch; I've set it up so that anything in that branch can propagate downstream into a release candidate as well as upstream back into master, but the inverse would be messy.

DennisHeimbigner commented 7 years ago

It turns out that I do have utf8 -> utf32 conversion. And converting utf32 -> utf16 can be approximated by truncating the 32 bits to 16 bits. I will put in an error for when the approximation fails. In any case, this fix should be "good enough".
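
To illustrate the truncation idea described above, here is a minimal sketch; it is not the code that was merged. The function name `utf32_to_utf16_truncate` is hypothetical. It assumes that "the approximation fails" means a code point above U+FFFF (which UTF-16 can only represent with a surrogate pair) or a value in the surrogate range D800..DFFF.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper illustrating the truncation approach.
 * Copies n UTF-32 code points into out as UTF-16 code units by keeping
 * the low 16 bits. Returns 0 on success, or -1 at the first code point
 * for which plain truncation would give a wrong result. */
int utf32_to_utf16_truncate(const uint32_t *in, size_t n, uint16_t *out)
{
    size_t i;

    for (i = 0; i < n; i++) {
        uint32_t cp = in[i];
        /* Above U+FFFF a surrogate pair would be required, and
         * D800..DFFF are not valid code points on their own. */
        if (cp > 0xFFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return -1;
        out[i] = (uint16_t)cp;   /* exact for the Basic Multilingual Plane */
    }
    return 0;
}
```

For text limited to the Basic Multilingual Plane the truncation is exact, so the error path is only reached for supplementary-plane characters.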

WardF commented 7 years ago

@DennisHeimbigner Is this issue ready to be closed out? I think it is but I thought I'd double check before closing it.

WardF commented 7 years ago

Actually, the fix was merged, so I'm closing this out; I'll reopen if I hear I need to.