shadow-maint / shadow

Upstream shadow tree
Other
292 stars 228 forks source link

Use liba2i #1049

Open alejandro-colomar opened 2 months ago

alejandro-colomar commented 2 months ago

A lot needs to happen before we can merge this, but let's have it here as a draft. Most likely, won't happen before 2025.

Distros should package liba2i before this happens.

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1076091


Revisions:

v1b - Rebase ``` $ git range-diff gh/master..gh/liba2i shadow/master..liba2i 1: cc974c31 = 1: 85174e72 Depend on liba2i 2: 379cdd67 = 2: a13ba662 Add liba2i to the build system ```
v1c - Rebase ``` $ git range-diff master..gh/liba2i shadow/master..liba2i 1: 85174e72 ! 1: b309d24b Depend on liba2i @@ lib/limits.c +#include +#include + - #include "memzero.h" + #include "string/memset/memzero.h" #include "typetraits.h" @@ src/chage.c +#include + #include "defines.h" - #include "memzero.h" #include "prototypes.h" + #include "pwio.h" ## src/check_subid_range.c ## @@ @@ src/faillog.c + #include "defines.h" #include "faillog.h" - #include "memzero.h" + #include "prototypes.h" ## src/free_subid_range.c ## @@ @@ src/passwd.c -#include "atoi/a2i/a2s.h" #include "defines.h" #include "getdef.h" - #include "memzero.h" + #include "nscd.h" ## src/useradd.c ## @@ 2: a13ba662 = 2: 6dcf69be Add liba2i to the build system ```
v1d - Rebase ``` $ git range-diff gh/master..gh/liba2i shadow/master..liba2i 1: b309d24b = 1: 2a8356f3 Depend on liba2i 2: 6dcf69be = 2: bbf5f6b6 Add liba2i to the build system ```
ikerexxe commented 1 month ago

Adding this dependency as optional would probably help to merge the changes more quickly. I say this mostly because this project is used in many distributions, and getting all of them to include this new dependency library is going to take a while.

alejandro-colomar commented 1 month ago

Adding this dependency as optional would probably help to merge the changes more quickly. I say this mostly because this project is used in many distributions, and getting all of them to include this new dependency library is going to take a while.

Hmmm, interesting. I'll consider changing it to be optional. Maybe we could have it optional for a few releases, and after that make it unconditional.

BTW, would you mind doing the packaging for Fedora/RHEL? :-)

jubalh commented 1 month ago

I'll package it for openSUSE. http://www.alejandro-colomar.es/src/alx/alx/lib/liba2i.git/ is the upstream repo, correct?

Do you publish a release tarball somewhere? Is there a way to get notified about new tags/releases?

alejandro-colomar commented 1 month ago

I'll package it for openSUSE.

Lovely; thanks!! :-)

http://www.alejandro-colomar.es/src/alx/alx/lib/liba2i.git/ is the upstream repo, correct?

That's correct, although you will probably prefer to use https://git.kernel.org/pub/scm/libs/liba2i/liba2i.git/.

Both are supported, as far as I'm concerned, but <kernel.org> is more trusted by random users (and probably slightly more stable), I guess.

Do you publish a release tarball somewhere?

I do publish release tarballs here: https://www.alejandro-colomar.es/share/dist/liba2i/

Since the current versions are 0.x, you'll find them under the 0/ subdir: https://www.alejandro-colomar.es/share/dist/liba2i/0/

And have requested the creation of a liba2i/ subdir under https://kernel.org/pub/linux/libs/, but am still waiting for that. I guess now in the summer it may go slower, but probably in a month or so we'll have that site.

Is there a way to get notified about new tags/releases?

If you have a script that can watch a git repository for tags, look for tags of the form liba2i-*.

There's a mailing list: liba2i@lists.linux.dev

I send an email to the list in every release. You could subscribe to it if you want. (But maybe it gets noisier, so you may want to not subscribe to yet another list.)

I can also CC you in those emails. I plan to write a file in the repository with a list of known downstream packages and their websites and maintainers, and CC everyone in that list (unless someone doesn't want to get notified) on every release.

Current links:

Website | Documentation | Contact | Archives | Bugs | Releases | Git | Contributing

ikerexxe commented 1 month ago

BTW, would you mind doing the packaging for Fedora/RHEL? :-)

Yes, I'll take care of it for Fedora, but it'll take me some time because the holiday period is approaching and I am closing a few things before I leave.

alejandro-colomar commented 1 month ago

BTW, would you mind doing the packaging for Fedora/RHEL? :-)

Yes, I'll take care of it for Fedora, but it'll take me some time because the holiday period is approaching and I am closing a few things before I leave.

Sure; no problem. I expect to need to fix a few things once some distros give feedback, so you'll have 0.11 when you come back, more polished.

jubalh commented 1 month ago

I do publish release tarballs here: https://www.alejandro-colomar.es/share/dist/liba2i/

Don' you have LetsEncrypt or something? :)

jubalh commented 1 month ago

Maybe we could have it optional for a few releases, and after that make it unconditional.

Not sure if some distros would want to add a new dependency to their ring0/base systems though.

alejandro-colomar commented 1 month ago

I do publish release tarballs here: https://www.alejandro-colomar.es/share/dist/liba2i/

Don' you have LetsEncrypt or something? :)

Nope, and don't want to: https://www.alejandro-colomar.es/ssl :)

You can access it via http, though.

alejandro-colomar commented 1 month ago

Maybe we could have it optional for a few releases, and after that make it unconditional.

Not sure if some distros would want to add a new dependency to their ring0/base systems though.

It's a quite small lib. I hope they don't mind so much. I was talking to Void Linux, and they were receptive. Let's see how others receive it.

alejandro-colomar commented 1 month ago

@jubalh

Currently, the build system of liba2i behaves like this:

make CFLAGS=...  # override default CFLAGS
make EXTRA_CFLAGS=...  # append to default CFLAGS

Is that behavior ok? Should I change that? What do downstream distros expect? How does one normally append to the default CFLAGS, and how does one normally overwrite the default CFLAGS?

thesamesam commented 1 month ago

Adding this dependency as optional would probably help to merge the changes more quickly. I say this mostly because this project is used in many distributions, and getting all of them to include this new dependency library is going to take a while.

I gave extensive feedback on it via email several months ago and I don't feel it was fully reflected upon. All the little things like not being able to use HTTPS and these silly quirks do not help. If it's on git.kernel.org, that helps now, I guess, although using it to "launder trust" is a bit meh.

I don't really want to see shadow adopt another library, especially one which has received little review. I also think the bar should be particularly high for a critical package like shadow. I don't see what arguments have been made in favour of the move, let alone strong ones.

I already think there's been an uncomfortably-large amount of churn in the input handling in shadow, often inventing new functions which.. are now being moved out into an external library anyway? Not only does this introduce opportunities for bugs to creep in, it makes it challenging for others to interact with the shadow codebase. In other communities, this is called NIH syndrome. There is an implicit cost to inventing your own functions, wrappers, and then a whole library.

Further, when looking at the implementations in liba2i, there's a not-insignificant amount of indirection which makes reading it challenging.

I don't have a vote, but if I did, I'd cast it against at present.

EDIT: And the various preliminary concerns I've had to point out again in https://github.com/void-linux/void-packages/issues/51261 are not helping.

jubalh commented 1 month ago

Nope, and don't want to: https://www.alejandro-colomar.es/ssl :)

That's unsual ;)

I gave a quick go at packaging and noticed that the build isn't straight forward. I wonder why no standard build system and default installation way was chosen.

It would be good to keep the "unusualness" to a minimum.

alejandro-colomar commented 1 month ago

@thesamesam

Hi Sam,

I gave extensive feedback on it via email several months ago and I don't feel it was fully reflected upon.

I have reflected upon what I've been able to.

Of course, the larger request of "please drop your entire build system and use my favourite one^W^W^W autotools", I haven't been able to address, because I do not know how to use autotools.

Even if you were so kind of providing patches for doing that change, I doubt I would be able to maintain my own project if the build system is autotools-based, so we'd have an important problem.

All the little things like not being able to use HTTPS and these silly quirks do not help.

It's my server; my rules. I'm sorry if we disagree on how my server should work.

If it's on git.kernel.org, that helps now, I guess,

Since we clearly have a disagreement on my server, I provide an impartial server for commodity.

I don't like github, nor any other paid services (github is paid; if we account for the data they steal from you legally, including source code to train their AI; it's more expensive that a paid git server (if those exist at all), actually).

I wouldn't mind using sourcehut, but I find cgit more comfortable.

Since I have a kernel.org account, and they allow me to do this, I provide it there.

although using it to "launder trust" is a bit meh.

I know you're worried after the XZ events. I don't know how I can make you trust me, since of course a malicious actor would also try the same goal. Also, since Jia Tan introduced his stuff via a build system, and I'm also doing unusual stuff with build systems, I guess it makes sense to be suspicious.

You can trace back all of my programming history back to when I was learning to program in StackOverflow. This is my account, if you want to investigate: https://stackoverflow.com/users/6872717/alx-recommends-codidact

That's my first online pressence programming, so if you search there for my questions in chronological order, it will hopefully make sense to you and be consistent.

I also have no problem to give you personal information about which school I went to in Spain, or where I live, if you want to do a full background check. If you need anything, please ask in private.

Also regarding trust:

I think my transparency in patches is actually unprecedented. I provide range diffs from the first iteration of a patch until it's merged, to let anyone reviewing see that I didn't introduce any sneaky changes during revision. Anyone is free to audit any of my PRs to find out any problems.

I also sign all of my commits, so that I can stand by them when I see them. That also adds some other feature: I can't repudiate a commit of mine (unless I repudiate my own signing key, via a revocation certificate, but that'd need to be justified). My reputation depends on my contributions, so I sign them (commits, and emails).

And after your feedback about it, I have increased my transparency even further, as you can see in the commit logs, when cherry-picking commits between projects. If you find anything I do to not be transparent enough, please let me know, and I'll try to improve it.

I don't really want to see shadow adopt another library, especially one which has received little review.

Feel free to review it as much as you want. I can wait 5 years if that's what you need. But you ghosted me some months ago, which I think is a bit rude from you.

I still understand it in context, because I could have just been the next Jia Tan trying to convince you of something, so I understand you not wanting to talk too much with someone asking for feedback about packaging his stuff in the middle of the XZ aftermath. But it still sucks.

Please let me know what problems you find in the project, and I will kindly try to solve them, if I can (this excludes, at the moment, switching to autotools, but I can try to make my GNUmakefile-based build system behave very much like any other standard build system).

Regarding the library, I provide a comprehensive set of tests, so if you try those tests, or can think of anything I'm missing in my tests, please say so.

I also think the bar should be particularly high for a critical package like shadow.

Agree.

I don't see what arguments have been made in favour of the move, let alone strong ones.

alx@debian:~/src/alx/liba2i/master$ find share/tests/ -type f | xargs wc -l | tail -n1
  9638 total

So, I think it makes sense to add the dependency. I'm ready to wait as long as you need to audit the project.

I already think there's been an uncomfortably-large amount of churn in the input handling in shadow, often inventing new functions

That churn has found and fixed many bugs. Not only in shadow, BTW. I even found and fixed a bug in the design of strtoi(3) in NetBSD (and consequently in libbsd too) while designing and testing these APIs, which has fixed uncountable bugs in NetBSD and Debian, and probably also in other OSes.

I agree churn can be a bad thing, but in this case I think it's quite justified.

Moreover, in some cases I've taken almost a year, not just some months, to discuss and iterate over some of the changes. That has helped make them quite good before merging.

It's not at all like one has used sed(1) to do a global churny change possibly introducing several bugs in the process.

which.. are now being moved out into an external library anyway?

That sets in stone those APIs, which reduces the maintenance of that part. It means no more churn to those APIs, which should be a good thing?

Not only does this introduce opportunities for bugs to creep in,

Sure; but I have a comprehensive set of tests in that library, so it would be harder to introduce bugs there than in shadow. In fact, I had a bug in the shadow version, which had already been fixed in liba2i (see https://github.com/shadow-maint/shadow/commit/63297e836d000db7928b02cf9608796346705611).

it makes it challenging for others to interact with the shadow codebase.

Really? Do you find it easier to interact with strtol(3)? I don't even know how to use it correctly.

And clearly, the authors of shadow didn't either, from what we've seen.

I understand some friction when using a new API, because you need to learn it, but in fact, shadow already had APIs like the ones provided in liba2i. For example, str2sl(3) is the same exact function that shadow has had for a very long time: it was called getlong() in shadow (with a trivial difference: getlong() reported errors with false, while str2sl(3) reports them with -1). I have only standardized the functions, gave them a consistent and unique name, and added macros that allowed to use these APIs with typedefs.

In other communities, this is called NIH syndrome.

I know that syndrome. But it is only a valid term when the API (or some usable replacement) is actually invented.

In this case, it was actually Not Invented Anywhere.

Let me list the invented list:

The fact that I provide strtoi(3) in my library (as a2i_strtoi()), is not a chance for adding bugs, but actually a bug fix. By writing that function, and developing comprehensive tests for it, and running those tests against both my implementation and libbsd's one, I found the design issue in NetBSD's strtoi(3).

alx@debian:~/src/bsd/netbsd/trunk$ git show f1987b28c87ee37690268b93ba1b11ed8d571305
commit f1987b28c87ee37690268b93ba1b11ed8d571305
Author: christos <christos@NetBSD.org>
Date:   Sat Jan 20 16:13:39 2024 +0000

    PR/57828: Alejandro Colomar: Prioritize test for ERANGE before testing for
    fully consuming the string. Adjust strtonum(3) to behave as before. Document
    the order of the tests and sync the man pages (I should really autogenerate
    one of the two man pages...)

https://gnats.netbsd.org/cgi-bin/query-pr-single.pl?number=57828

There is an implicit cost to inventing your own functions, wrappers,

And a reward. Do you suggest to continue using strtol(3)? If so, I'll let you suggest how it should be used correctly in shadow. I don't, myself.

and then a whole library.

I think the library is the smallest change. Yeah, it has the build system, and packaging cost, but then you get the reward in the form of being allowed to call those APIs from any other projects. Since the only dependency of liba2i is libc, it should be usable almost everywhere.

The library has the lowest cost, and the highest reward.

It was indeed when writing my library that the strtoi(3) bug was found. If I had kept it just as a shadow internal library, we would have never found that, and you could be arguing that it was indeed a NIH syndrome by not using libbsd's strtoi(3). Now we know that libbsd's strtoi(3) wasn't as good as it sounded (and anyway, it's not good for typedefs).

Further, when looking at the implementations in liba2i, there's a not-insignificant amount of indirection which makes reading it challenging.

I have significantly reduced that since you last looked at it, I think.

$ grepc a2s include/
include/a2i/a2i/a2s.h:#define a2s(TYPE, n, s, ...)                                                  \
(                                                                             \
    _Generic((void (*)(TYPE, typeof(s))) 0,                               \
        void (*)(signed char,        const char *):  a2shh_c,         \
        void (*)(signed char,        const void *):  a2shh_c,         \
        void (*)(signed char,        char *):        a2shh_nc,        \
        void (*)(signed char,        void *):        a2shh_nc,        \
        void (*)(short,              const char *):  a2sh_c,          \
        void (*)(short,              const void *):  a2sh_c,          \
        void (*)(short,              char *):        a2sh_nc,         \
        void (*)(short,              void *):        a2sh_nc,         \
        void (*)(int,                const char *):  a2si_c,          \
        void (*)(int,                const void *):  a2si_c,          \
        void (*)(int,                char *):        a2si_nc,         \
        void (*)(int,                void *):        a2si_nc,         \
        void (*)(long,               const char *):  a2sl_c,          \
        void (*)(long,               const void *):  a2sl_c,          \
        void (*)(long,               char *):        a2sl_nc,         \
        void (*)(long,               void *):        a2sl_nc,         \
        void (*)(long long,          const char *):  a2sll_c,         \
        void (*)(long long,          const void *):  a2sll_c,         \
        void (*)(long long,          char *):        a2sll_nc,        \
        void (*)(long long,          void *):        a2sll_nc         \
    )(n, s, __VA_ARGS__)                                                  \
)
$ grepc -B1 a2shh_c lib/src/
lib/src/a2i/a2i/a2s_.c-A2I_ATTR_ALIAS("a2shh_nc")
lib/src/a2i/a2i/a2s_.c:int a2shh_c(signed char *restrict n, const char *s,
    const char **a2i_nullable restrict endp, int base,
    signed char min, signed char max);
$ grepc -tfd a2shh_nc lib/src/
lib/src/a2i/a2i/a2s_.h:inline int
a2shh_nc(signed char *restrict n, char *s,
    char **a2i_nullable restrict endp, int base,
    signed char min, signed char max)
{
    int  status;

    *n = a2i_strtoi(s, endp, base, min, max, &status);
    if (status != 0)
        errno = status;

    return -!!status;
}

And then you could stop there, since a2i_strtoi() is exactly strtoi(3), which is a relatively well-known API. But if you want to check it in my own implementation:

$ grepc -tfd a2i_strtoi lib/src/
lib/src/a2i/strtoi/strtoi/strtoi.h:inline intmax_t
a2i_strtoi(char *s,
    char **a2i_nullable restrict endp, int base,
    intmax_t min, intmax_t max, int *a2i_nullable restrict status)
{
    int        errno_saved, st;
    char       *e;
    intmax_t   n;

    if (endp == NULL)
        endp = &e;
    if (status == NULL)
        status = &st;

    if (base != 0 && (base < 2 || base > 36)) {
        *status = EINVAL;
        return MAX(min, MIN(max, 0));
    }

    errno_saved = errno;
    errno = 0;

    n = strtoimax(s, endp, base);

    if (*endp == s)
        *status = ECANCELED;
    else if (errno == ERANGE || n < min || n > max)
        *status = ERANGE;
    else if (**endp != '\0')
        *status = ENOTSUP;
    else
        *status = 0;

    errno = errno_saved;

    return MAX(min, MIN(max, n));
}

I don't have a vote, but if I did, I'd cast it against at present.

Understood. If you are interested in making an effort in auditing and packaging liba2i, I'll respect that vote and wait for you to be happy with the library.

EDIT: And the various preliminary concerns I've had to point out again in void-linux/void-packages#51261 are not helping.

I'll try to reply to those there.

And some last words: after the XZ events, I've noticed a difference in your attitude to me, which I understand, but doesn't feel nice. I'd like to ask you to please be nice with me, as I never did anything to you (or if I did, I dind't notice, I'm sorry if I did). If you're skeptic about trusting me, that's ok, but please be nice, and let me know if there anything I can improve.

I'm certainly not going to be an expert in autotools tomorrow, though, so please don't ask me to be one.

Have a lovely day! Alex

alejandro-colomar commented 1 month ago

Nope, and don't want to: https://www.alejandro-colomar.es/ssl :)

That's unsual ;)

I gave a quick go at packaging and noticed that the build isn't straight forward. I wonder why no standard build system and default installation way was chosen.

I just don't know what's "usual" behavior, especially, since it's not something written in any standard. When I learned to write Makefiles, I learned from the GNU make documentation. If something is not written there, I didn't learn it.

I'm now learning what distros expect, after talking to some packagers that gave me feedback about the library. I'm trying to adjust the build system to that feedback.

It would be good to keep the "unusualness" to a minimum.

That's my intention. In 1.0-rc2, liba2i is significantly more "usual". There are still a few bits that I haven't been able to fix, but now it should be more usable.

hallyn commented 1 month ago

Do I understand correctly that liba2i is only licensed under LGPG 3.0-or-later ? I see the exception, but I'm not quite clear on whether that limits the license that can be applied to binaries and libraries using shadow.

alejandro-colomar commented 1 month ago

Do I understand correctly that liba2i is only licensed under LGPG 3.0-or-later ? I see the exception, but I'm not quite clear on whether that limits the license that can be applied to binaries and libraries using shadow.

The source code and build system is LGPL-3.0-only WITH LGPL-3.0-linking-exception:

alx@debian:~/src/alx/liba2i/contrib$ grep -rh SPDX-License include/ | sort | uniq -c
     11 // SPDX-License-Identifier: LGPL-3.0-only WITH LGPL-3.0-linking-exception
alx@debian:~/src/alx/liba2i/contrib$ grep -rh SPDX-License lib/src/ | sort | uniq -c
     28 // SPDX-License-Identifier: LGPL-3.0-only WITH LGPL-3.0-linking-exception
alx@debian:~/src/alx/liba2i/contrib$ grep -rh SPDX-License share/mk/ | sort | uniq -c
    154 # SPDX-License-Identifier: LGPL-3.0-only WITH LGPL-3.0-linking-exception

For some reason, I accidentally slipped a different license version in the tests (the -or-later one). I'll make that consistent.

alx@debian:~/src/alx/liba2i/contrib$ grep -rh SPDX-License share/tests/ | sort | uniq -c
     26 # SPDX-License-Identifier: LGPL-3.0-or-later WITH LGPL-3.0-linking-exception
     26 // SPDX-License-Identifier: LGPL-3.0-only WITH LGPL-3.0-linking-exception

And the man pages have the Linux-man-pages-copyleft license:

alx@debian:~/src/alx/liba2i/contrib$ grep -rh SPDX-License man/ | sort | uniq -c
      6 .\" SPDX-License-Identifier: Linux-man-pages-copyleft

So, you shouldn't have any license problems linking to it:

$ cat LICENSES/LGPL-3.0-linking-exception.txt 
As a special exception to the GNU Lesser General Public License version 3
("LGPL3"), the copyright holders of this Library give you permission to
convey to a third party a Combined Work that links statically or dynamically
to this Library without providing any Minimal Corresponding Source or
Minimal Application Code as set out in 4d or providing the installation
information set out in section 4e, provided that you comply with the other
provisions of LGPL3 and provided that you meet, for the Application the
terms and conditions of the license(s) which apply to the Application.

Except as stated in this special exception, the provisions of LGPL3 will
continue to comply in full to this Library. If you modify this Library, you
may apply this exception to your version of this Library, but you are not
obliged to do so. If you do not wish to do so, delete this exception
statement from your version. This exception does not (and cannot) modify any
license terms which apply to the Application, with which you must still
comply.

AFAIU, that exception means that you can link to it, and the LGPL won't eat your project. But IANAL.

alejandro-colomar commented 1 month ago

If anyone else can clarify if that license allows linking in shadow, please manifest yourselves. Cc: @g-branden-robinson

hallyn commented 1 month ago

$ cat LICENSES/LGPL-3.0-linking-exception.txt As a special exception to the GNU Lesser General Public License version 3 ("LGPL3"), the copyright holders of this Library give you permission to convey to a third party a Combined Work that links statically or dynamically to this Library without providing any Minimal Corresponding Source or Minimal Application Code as set out in 4d or providing the installation information set out in section 4e, provided that you comply with the other provisions of LGPL3 and provided that you meet, for the Application the terms and conditions of the license(s) which apply to the Application.

What about 4a, c and d?

alejandro-colomar commented 1 month ago

$ cat LICENSES/LGPL-3.0-linking-exception.txt As a special exception to the GNU Lesser General Public License version 3 ("LGPL3"), the copyright holders of this Library give you permission to convey to a third party a Combined Work that links statically or dynamically to this Library without providing any Minimal Corresponding Source or Minimal Application Code as set out in 4d or providing the installation information set out in section 4e, provided that you comply with the other provisions of LGPL3 and provided that you meet, for the Application the terms and conditions of the license(s) which apply to the Application. What about 4a, c and d?

4a) I consider the relevant entry in the build system to be "prominent notice". Maybe we can have a one-or-two-liner comment above it to be more prominent, and more of a notice.

4c) It starts saying For a Combined Work that displays copyright notices during execution,. AFAIK, shadow utils don't display copyright notices during execution, so it wouldn't apply.

4d is explicitly excluded in the exception.

alejandro-colomar commented 1 month ago

4a) I consider the relevant entry in the build system to be "prominent notice". Maybe we can have a one-or-two-liner comment above it to be more prominent, and more of a notice.

And maybe add a one-liner to COPYING.

alejandro-colomar commented 1 month ago

$ cat LICENSES/LGPL-3.0-linking-exception.txt As a special exception to the GNU Lesser General Public License version 3 ("LGPL3"), the copyright holders of this Library give you permission to convey to a third party a Combined Work that links statically or dynamically to this Library without providing any Minimal Corresponding Source or Minimal Application Code as set out in 4d or providing the installation information set out in section 4e, provided that you comply with the other provisions of LGPL3 and provided that you meet, for the Application the terms and conditions of the license(s) which apply to the Application. What about 4a, c and d?

For example, glibc is LGPL-2.1, which also has similar text in 6 (2nd paragraph). If shadow can link to glibc, it should be able to link to liba2i.

hallyn commented 1 month ago

$ cat LICENSES/LGPL-3.0-linking-exception.txt As a special exception to the GNU Lesser General Public License version 3 ("LGPL3"), the copyright holders of this Library give you permission to convey to a third party a Combined Work that links statically or dynamically to this Library without providing any Minimal Corresponding Source or Minimal Application Code as set out in 4d or providing the installation information set out in section 4e, provided that you comply with the other provisions of LGPL3 and provided that you meet, for the Application the terms and conditions of the license(s) which apply to the Application. What about 4a, c and d?

4a) I consider the relevant entry in the build system to be "prominent notice". Maybe we can have a one-or-two-liner comment above it.

4c) It starts saying For a Combined Work that displays copyright notices during execution,. AFAIK, shadow utils don't display copyright notices during execution, so it wouldn't apply.

4d is explicitly excluded in the exception.

Thanks. Sounds like it should be manageable.

sgn commented 1 month ago

It's a quite small lib. I hope they don't mind so much. I was talking to Void Linux, and they were receptive. Let's see how others receive it.

I only tried to see how to build it, no where I said receptive or not :)

thesamesam commented 1 month ago

@thesamesam

Hi Sam,

I gave extensive feedback on it via email several months ago and I don't feel it was fully reflected upon.

I have reflected upon what I've been able to.

Of course, the larger request of "please drop your entire build system and use my favourite one^W^W^W autotools", I haven't been able to address, because I do not know how to use autotools.

I did point out several tangible issues, as well as classes of problem. I maintain that the correct way to fix that is to use autotools or meson. If you want to fix that another way, if you really must, go ahead, but that hasn't happened, so what should I conclude?

Not only that, we had the same debate already in https://github.com/shadow-maint/shadow/issues/795. I'm fatigued from having to keep discussing and explaining the fundamentals of a topic to you when there's a dangling threat of "or there'll be loads of work for you when I implement it, so you must fight against it now if you don't want it". That is of course distinct from just teaching you or anyone about how things work or are expected to behave.

Even if you were so kind of providing patches for doing that change, I doubt I would be able to maintain my own project if the build system is autotools-based, so we'd have an important problem.

All the little things like not being able to use HTTPS and these silly quirks do not help.

It's my server; my rules. I'm sorry if we disagree on how my server should work.

I don't care what you do privately. I'm saying that if you are going to propose a project for use by others in a critical package, I will then have an opinion on it, as it affects me. Please don't reduce this to "sam imposing personal preferences on your private machine", as that's not what is happening.

This also then brings us into my general point of appearances mattering. These things together are not confidence inspiring.

If it's on git.kernel.org, that helps now, I guess,

Since we clearly have a disagreement on my server, I provide an impartial server for commodity.

I don't like github, nor any other paid services (github is paid; if we account for the data they steal from you legally, including source code to train their AI; it's more expensive that a paid git server (if those exist at all), actually).

Sure.

I wouldn't mind using sourcehut, but I find cgit more comfortable.

Since I have a kernel.org account, and they allow me to do this, I provide it there.

although using it to "launder trust" is a bit meh.

I know you're worried after the XZ events. I don't know how I can make you trust me, since of course a malicious actor would also try the same goal. Also, since Jia Tan introduced his stuff via a build system, and I'm also doing unusual stuff with build systems, I guess it makes sense to be suspicious.

Some suggestions:

You can trace back all of my programming history back to when I was learning to program in StackOverflow. This is my account, if you want to investigate: stackoverflow.com/users/6872717/alx-recommends-codidact

That's my first online pressence programming, so if you search there for my questions in chronological order, it will hopefully make sense to you and be consistent.

I also have no problem to give you personal information about which school I went to in Spain, or where I live, if you want to do a full background check. If you need anything, please ask in private.

Also regarding trust:

I think my transparency in patches is actually unprecedented. I provide range diffs from the first iteration of a patch until it's merged, to let anyone reviewing see that I didn't introduce any sneaky changes during revision. Anyone is free to audit any of my PRs to find out any problems.

I also sign all of my commits, so that I can stand by them when I see them. That also adds some other feature: I can't repudiate a commit of mine (unless I repudiate my own signing key, via a revocation certificate, but that'd need to be justified). My reputation depends on my contributions, so I sign them (commits, and emails).

And after your feedback about it, I have increased my transparency even further, as you can see in the commit logs, when cherry-picking commits between projects. If you find anything I do to not be transparent enough, please let me know, and I'll try to improve it.

Thanks, I appreciate that.

But can I offer another perspective? Looking at https://github.com/shadow-maint/shadow/pulls?q=sort%3Aupdated-desc+is%3Apr+is%3Aopen+author%3Aalejandro-colomar, many of the PRs are huge.

Considering some examples quickly:

To me, it feels like a pattern of making huge PRs which are incrementally rewriting shadow into an Alex-DSL of C.

Do you see why I'm both concerned about the churn, especially with these examples I mention above in mind where different APIs are being ported from/to, but also the new APIs being added aren't stable or necessarily well-thought-through if you're deleting them shortly after?

(I also think that being transparent isn't necessarily the same as being verbose.)

Then, amongst all this, there's issues like https://github.com/shadow-maint/shadow/pull/624. Bugs happen and we all make errors. But do you see why I'm worried about huge amounts of churn making it pretty hard to review and notice substantive changes which do need discussion?

I don't really want to see shadow adopt another library, especially one which has received little review.

Feel free to review it as much as you want. I can wait 5 years if that's what you need. But you ghosted me some months ago, which I think is a bit rude from you.

No, I didn't ghost you. If you think I've dropped the ball on something, you can ping me (which I tell everybody, as I get a lot of email and notifications). I did, however, in our correspondence say that I didn't have time to keep reiterating the same points.

I still understand it in context, because I could have just been the next Jia Tan trying to convince you of something, so I understand you not wanting to talk too much with someone asking for feedback about packaging his stuff in the middle of the XZ aftermath. But it still sucks.

Please let me know what problems you find in the project, and I will kindly try to solve them, if I can (this excludes, at the moment, switching to autotools, but I can try to make my GNUmakefile-based build system behave very much like any other standard build system).

But I'm not convinced by this, because if you could make it act correctly, you would've by now?

There's an element of arrogance in expecting the world to have to explain from first-principles why existing solutions are the way they are, and to rediscover every problem they solve. I understand you don't see it that way, but it's how I perceive it at this point given the insistence on sticking with it.

If you could make it perform identical to a correct build system, or pretty close, then all the discussion on that Void PR wouldn't have even occurred.

Regarding the library, I provide a comprehensive set of tests, so if you try those tests, or can think of anything I'm missing in my tests, please say so.

I also think the bar should be particularly high for a critical package like shadow.

Agree.

I don't see what arguments have been made in favour of the move, let alone strong ones.

* Comprehensive set of tests for the API (unlike in shadow, where I only added some basic tests; I don't feel like adding 10k lines of tests to shadow just for a small set of functions; and even then, I'm not sure how I could test some things with the current build system of shadow.git).
alx@debian:~/src/alx/liba2i/master$ find share/tests/ -type f | xargs wc -l | tail -n1
  9638 total
* I've found bugs in other projects that would be fixed by using the same interfaces.
  This point, admittedly, is only justification for adding the library to distros for using in other projects, but not shadow, so we could keep shadow with its own implementation, and use liba2i for other projects.  But doing so would have its own problems:

  * We'd need to backport any bug fixes applied to liba2i to shadow.

* Once you finish auditing the liba2i project, you'll be able to use it in other projects.

So, I think it makes sense to add the dependency. I'm ready to wait as long as you need to audit the project.

I already think there's been an uncomfortably-large amount of churn in the input handling in shadow, often inventing new functions

That churn has found and fixed many bugs. Not only in shadow, BTW. I even found and fixed a bug in the design of strtoi(3) in NetBSD (and consequently in libbsd too) while designing and testing these APIs, which has fixed uncountable bugs in NetBSD and Debian, and probably also in other OSes.

I agree churn can be a bad thing, but in this case I think it's quite justified.

Moreover, in some cases I've taken almost a year, not just some months, to discuss and iterate over some of the changes. That has helped make them quite good before merging.

It's not at all like one has used sed(1) to do a global churny change possibly introducing several bugs in the process.

I propose an alternative to this below, fwiw.

which.. are now being moved out into an external library anyway?

That sets in stone those APIs, which reduces the maintenance of that part. It means no more churn to those APIs, which should be a good thing?

I'm glad to hear that, but it's a great shame that there was so much churn in shadow until we got to that stabilisation point. shadow wasn't a good place to be doing that in.

Not only does this introduce opportunities for bugs to creep in,

Sure; but I have a comprehensive set of tests in that library, so it would be harder to introduce bugs there than in shadow. In fact, I had a bug in the shadow version, which had already been fixed in liba2i (see 63297e8).

I also do really believe that various changes like 3c09e40a1fe9ccff4cf3d6a1e65c2bc3eba33b1e are a good thing. But they're mixed in with other changes.

it makes it challenging for others to interact with the shadow codebase.

Really? Do you find it easier to interact with strtol(3)? I don't even know how to use it correctly.

(I address this point elsewhere.)

And clearly, the authors of shadow didn't either, from what we've seen.

I understand some friction when using a new API, because you need to learn it, but in fact, shadow already had APIs like the ones provided in liba2i. For example, str2sl(3) is the same exact function that shadow has had for a very long time: it was called getlong() in shadow (with a trivial difference: getlong() reported errors with false, while str2sl(3) reports them with -1). I have only standardized the functions, gave them a consistent and unique name, and added macros that allowed to use these APIs with typedefs.

(I address this point elsewhere, but I'll add that it's not necessarily a reason to change it further or introduce more.)

In other communities, this is called NIH syndrome.

I know that syndrome. But it is only a valid term when the API (or some usable replacement) is actually invented.

In this case, it was actually Not Invented Anywhere.

I don't see any mention of the rationale in this PR. It'd be good to state that. If you have a compelling case for adding it, make it?

Let me list the invented list:

* atol(3)
  UB Russian Roulette (this is actually fault of ISO C, not the API itself).  I guess we can skip it.

* strtol(3)
  It is unusable (directly).  If you want an example, just look at shadow a few versions ago before I removed all of its uses.  It had plenty of mishandled errors.  The same in groff, where I've fixed plenty of mishandled errors (and I stopped fixing more in groff because it was a never-ending job;  CC: @g-branden-robinson).  I'm not the first one claiming that, and proof is that several projects came with their own "better" APIs wrapping strtol(3) (I'm referring to the BSDs).

* OpenBSD's strtonum(3)
  It is a specialized function that is not a replacement for all strtol(3) uses; only some.  We can discard it.

* NetBSD's strtoi(3)
  This is actually a good interface.  It fixes most of the problems in strtol(3).  However, it's not good for typedefs, so I had to develop my own wrapper around it.  a2i(3) is only a thin wrapper around strtoi(3).  (I had to also provide strtoi(3) in my library, to not have libbsd as a dependency of liba2i.)

I'm not saying that atol and strtol are good APIs.

What I'm saying is that:

I'm trying to say that I don't want to litigate the state of the C standard library. I'm saying that there should be a great deal of care in modifying a critical project, and in particular, due regard must be given to:

With particular regard to review, I'll note that the kernel uses coccinelle to great effect here. It allows reproducing large diffs, it allows clearly seeing the intent and effect of a change, and it also allows debate & review of the change itself to be done on a concrete description of the transformation.

Explaining how to reproduce your changes and how you made the transformations would be good.

The fact that I provide strtoi(3) in my library (as a2i_strtoi()), is not a chance for adding bugs, but actually a bug fix. By writing that function, and developing comprehensive tests for it, and running those tests against both my implementation and libbsd's one, I found the design issue in NetBSD's strtoi(3).

I'm not contesting this.

alx@debian:~/src/bsd/netbsd/trunk$ git show f1987b28c87ee37690268b93ba1b11ed8d571305
commit f1987b28c87ee37690268b93ba1b11ed8d571305
Author: christos <christos@NetBSD.org>
Date:   Sat Jan 20 16:13:39 2024 +0000

    PR/57828: Alejandro Colomar: Prioritize test for ERANGE before testing for
    fully consuming the string. Adjust strtonum(3) to behave as before. Document
    the order of the tests and sync the man pages (I should really autogenerate
    one of the two man pages...)

gnats.netbsd.org/cgi-bin/query-pr-single.pl?number=57828

There is an implicit cost to inventing your own functions, wrappers,

And a reward. Do you suggest to continue using strtol(3)? If so, I'll let you suggest how it should be used correctly in shadow. I don't, myself.

No, but that doesn't in any way negate my arguments here?

and then a whole library.

I think the library is the smallest change. Yeah, it has the build system, and packaging cost, but then you get the reward in the form of being allowed to call those APIs from any other projects. Since the only dependency of liba2i is libc, it should be usable almost everywhere.

But how stable are they? I get impression they aren't at all.

The library has the lowest cost, and the highest reward.

It was indeed when writing my library that the strtoi(3) bug was found. If I had kept it just as a shadow internal library, we would have never found that, and you could be arguing that it was indeed a NIH syndrome by not using libbsd's strtoi(3). Now we know that libbsd's strtoi(3) wasn't as good as it sounded (and anyway, it's not good for typedefs).

I don't really know what to say there, there is value in rewriting things to understand how they work. It doesn't mean you should then make everything use that.

Further, when looking at the implementations in liba2i, there's a not-insignificant amount of indirection which makes reading it challenging.

I have significantly reduced that since you last looked at it, I think.

That's good news, thank you.

$ grepc a2s include/
include/a2i/a2i/a2s.h:#define a2s(TYPE, n, s, ...)                                                  \
(                                                                             \
  _Generic((void (*)(TYPE, typeof(s))) 0,                               \
      void (*)(signed char,        const char *):  a2shh_c,         \
      void (*)(signed char,        const void *):  a2shh_c,         \
      void (*)(signed char,        char *):        a2shh_nc,        \
      void (*)(signed char,        void *):        a2shh_nc,        \
      void (*)(short,              const char *):  a2sh_c,          \
      void (*)(short,              const void *):  a2sh_c,          \
      void (*)(short,              char *):        a2sh_nc,         \
      void (*)(short,              void *):        a2sh_nc,         \
      void (*)(int,                const char *):  a2si_c,          \
      void (*)(int,                const void *):  a2si_c,          \
      void (*)(int,                char *):        a2si_nc,         \
      void (*)(int,                void *):        a2si_nc,         \
      void (*)(long,               const char *):  a2sl_c,          \
      void (*)(long,               const void *):  a2sl_c,          \
      void (*)(long,               char *):        a2sl_nc,         \
      void (*)(long,               void *):        a2sl_nc,         \
      void (*)(long long,          const char *):  a2sll_c,         \
      void (*)(long long,          const void *):  a2sll_c,         \
      void (*)(long long,          char *):        a2sll_nc,        \
      void (*)(long long,          void *):        a2sll_nc         \
  )(n, s, __VA_ARGS__)                                                  \
)
$ grepc -B1 a2shh_c lib/src/
lib/src/a2i/a2i/a2s_.c-A2I_ATTR_ALIAS("a2shh_nc")
lib/src/a2i/a2i/a2s_.c:int a2shh_c(signed char *restrict n, const char *s,
    const char **a2i_nullable restrict endp, int base,
    signed char min, signed char max);
$ grepc -tfd a2shh_nc lib/src/
lib/src/a2i/a2i/a2s_.h:inline int
a2shh_nc(signed char *restrict n, char *s,
    char **a2i_nullable restrict endp, int base,
    signed char min, signed char max)
{
  int  status;

  *n = a2i_strtoi(s, endp, base, min, max, &status);
  if (status != 0)
      errno = status;

  return -!!status;
}

And then you could stop there, since a2i_strtoi() is exactly strtoi(3), which is a relatively well-known API. But if you want to check it in my own implementation:

$ grepc -tfd a2i_strtoi lib/src/
lib/src/a2i/strtoi/strtoi/strtoi.h:inline intmax_t
a2i_strtoi(char *s,
    char **a2i_nullable restrict endp, int base,
    intmax_t min, intmax_t max, int *a2i_nullable restrict status)
{
  int        errno_saved, st;
  char       *e;
  intmax_t   n;

  if (endp == NULL)
      endp = &e;
  if (status == NULL)
      status = &st;

  if (base != 0 && (base < 2 || base > 36)) {
      *status = EINVAL;
      return MAX(min, MIN(max, 0));
  }

  errno_saved = errno;
  errno = 0;

  n = strtoimax(s, endp, base);

  if (*endp == s)
      *status = ECANCELED;
  else if (errno == ERANGE || n < min || n > max)
      *status = ERANGE;
  else if (**endp != '\0')
      *status = ENOTSUP;
  else
      *status = 0;

  errno = errno_saved;

  return MAX(min, MIN(max, n));
}

I don't have a vote, but if I did, I'd cast it against at present.

Understood. If you are interested in making an effort in auditing and packaging liba2i, I'll respect that vote and wait for you to be happy with the library.

EDIT: And the various preliminary concerns I've had to point out again in void-linux/void-packages#51261 are not helping.

I'll try to reply to those there.

Thank you.

And some last words: after the XZ events, I've noticed a difference in your attitude to me, which I understand, but doesn't feel nice. I'd like to ask you to please be nice with me, as I never did anything to you (or if I did, I dind't notice, I'm sorry if I did). If you're skeptic about trusting me, that's ok, but please be nice, and let me know if there anything I can improve.

I think I've been frustrated by trying to review massive diffs between releases in shadow and man-pages. I'm sorry if I've been rude to you and it wasn't intentional.

I'm certainly not going to be an expert in autotools tomorrow, though, so please don't ask me to be one.

I'm not asking you to be, to be clear.

alejandro-colomar commented 1 month ago

I did point out several tangible issues, as well as classes of problem. I maintain that the correct way to fix that is to use autotools or meson. If you want to fix that another way, if you really must, go ahead, but that hasn't happened,

I needed to know what my bugs were exactly. Now that I've had many detailed reports, I'm addressing them.

I wouldn't mind using sourcehut, but I find cgit more comfortable. Since I have a kernel.org account, and they allow me to do this, I provide it there.

although using it to "launder trust" is a bit meh.

I know you're worried after the XZ events. I don't know how I can make you trust me, since of course a malicious actor would also try the same goal. Also, since Jia Tan introduced his stuff via a build system, and I'm also doing unusual stuff with build systems, I guess it makes sense to be suspicious.

Some suggestions:

* First, be entirely transparent about your plans. This PR doesn't explain what liba2i is to someone who is unfamiliar. It doesn't explain why it's needed.

Agree. I often assume readers have context of previous recent commits and forget to explain details. When a reviewer asks for more context, I provide it in the commit messages.

* Try to temper your tendency to move quickly & substantially. Passion is great and I really do believe you mean well,

Thanks.

But can I offer another perspective? Looking at https://github.com/shadow-maint/shadow/pulls?q=sort%3Aupdated-desc+is%3Apr+is%3Aopen+author%3Aalejandro-colomar, many of the PRs are huge.

Considering some examples quickly:

* [Simplify string handling with a new API: stpsep() #1038](https://github.com/shadow-maint/shadow/pull/1038). I don't know what `stpsep` is.

A strsep(3) relative.

// Similar to strsep(3),
// but return the next token, and don't update the input pointer.

I don't think it's really obvious from the PR and I don't think it's obvious that it's a good API. Where's the motivation & justification for it?

* [7139f09](https://github.com/shadow-maint/shadow/commit/7139f094d6dba71e19944bc0330130ba6e8627bd). What cases is it easier in?

An example of a place you want to use it is after fgets(3):

Instead of:

if (fgets(buf, BUFSIZ, stdin) == NULL)
    err(1, "fgets");
p = strchr(buf, '\n');
if (p == NULL)
    errx(1, "fgets: Line too long");
strcpy(p, "");

you can write

if (fgets(buf, BUFSIZ, stdin) == NULL)
    err(1, "fgets");
if (stpsep(buf, "\n") == NULL)
    errx(1, "fgets: Line too long");

It's similar to strsep(3), but with strsep(3) it'd be more complex.

if (fgets(buf, BUFSIZ, stdin) == NULL)
    err(1, "fgets");
p = buf;
strsep(&p, "\n");
if (p == NULL)
    errx(1, "fgets: Line too long");

You might actually write something quite similar with strtok_r(3):

if (fgets(buf, BUFSIZ, stdin) == NULL)
    err(1, "fgets");
strtok_r(buf, "\n", &dummy);
if (strtok_r(NULL, "\n", &dummy) == NULL)
    errx(1, "fgets: Line too long");

(I'm not sure if the above is correct, since strtok*(3) is so complex that I have a hard time writing any code with it.)

There are a few other cases where it's useful.

* Within that PR ([Simplify string handling with a new API: stpsep() #1038](https://github.com/shadow-maint/shadow/pull/1038)), you then remove `stpcspn`. I don't know what `stpcspn` is either.

stpcspn() is almost the same as strcspn(3):

p += strcspn(p, "delim");
p = stpcspn(p, "delim");

(but not always you want to edit the same pointer in-place, so = is more versatile.)

Looking at shadow history, I see it was introduced just a day prior in Simplify string handling, and add some new APIs for that #1035, and now you're removing it. This suggests to me that these interfaces aren't mature and shadow is being used as a playground.

That made the change gradual. The diffs would be much more hell to read if I went from direct character operations to very abstracted calls like strsep(3).

Going through stpcspn(), strcspn(3), strchrnul(3), and other APIs added steps where you could observe intermediate transformations.

* Various invented functions like `str2ull` are using [reserved names](https://www.gnu.org/software/libc/manual/html_node/Reserved-Names.html), as they begins with `str`.

Since C23, only potentially reserved names. So was strtonum(3) and strtoi(3), but sometimes it makes sense to use them.

To me, it feels like a pattern of making huge PRs which are incrementally rewriting shadow into an Alex-DSL of C.

Alex-DSL of libc would be more precise.

Do you see why I'm both concerned about the churn, especially with these examples I mention above in mind where different APIs are being ported from/to, but also the new APIs being added aren't stable or necessarily well-thought-through if you're deleting them shortly after?

While the intermediate steps may not be the definitive code, I always design them so that they are a net improvement. We could have stopped the transformations there, and it would be a good state.

But it could be further improved, and so I did.

(I also think that being transparent isn't necessarily the same as being verbose.)

Then, amongst all this, there's issues like #624. Bugs happen and we all make errors. But do you see why I'm worried about huge amounts of churn making it pretty hard to review and notice substantive changes which do need discussion?

Sure.

But I'm not convinced by this, because if you could make it act correctly, you would've by now?

I didn't know what exactly I needed to change. I was missing the bug reports. Now I have them, and I have fixed most of them.

That sets in stone those APIs, which reduces the maintenance of that part. It means no more churn to those APIs, which should be a good thing?

I'm glad to hear that, but it's a great shame that there was so much churn in shadow until we got to that stabilisation point. shadow wasn't a good place to be doing that in.

I did it in shadow because I did it for fixing shadow's bugs. If I hadn't seen shadow's obfuscated (and incorrect) code, I probably wouldn't have had the need for writing these APIs. Elsewhere it's always been enough to me to use strtoi(3bsd). It was here that I realized that we needed something better.

Nevertheless, I sympathize with you not wanting to have to audit shadow every now and then. And also I assume there's always a chance to introduce obscure bugs such as the one in csrand_uniform().

I've wanted to stop churning shadow for several months already. I'm finishing my current patch sets, and I think I'm done. I already finished with allocations. Once I finish with the numeric parsing (already done a week ago) and string handling (I've got a couple of patch sets in my local repo), I'm done, and shadow will be stable again. And hopefully, much more stable than ever.

Sure; but I have a comprehensive set of tests in that library, so it would be harder to introduce bugs there than in shadow. In fact, I had a bug in the shadow version, which had already been fixed in liba2i (see 63297e8).

I also do really believe that various changes like 3c09e40 are a good thing.

Thanks.

But they're mixed in with other changes.

You probably mean the intermediate transformations.

I know that syndrome. But it is only a valid term when the API (or some usable replacement) is actually invented. In this case, it was actually Not Invented Anywhere.

I don't see any mention of the rationale in this PR.

The commit messages are often more informative, I think. Also, since it was originally a single PR, but ended up being 10 or so, I might have explained it in one of them, and then linked to it from others. It might be hard to find the rationale, unless you've participated in the entire discussion. Sorry. As always, I keep Cc: tags in the commits, so that you can always ask them for context that I omitted in the commit message.

It'd be good to state that. If you have a compelling case for adding it, make it?

I think I made it. It's just that the discussion lasted almost a year, spanned around 10 PRs, and around a hundred patches, and so it might be spread around those. Sorry.

Feel free to ask if you need any further clarifications about any changes.

I'm not saying that atol and strtol are good APIs.

What I'm saying is that:

* implementing alternatives has been mixed up with lots of other churn implementing functions which then get renamed or removed;

That was to not provide a big-bang commit that nobody would be able to review.

Instead, I wrote gradual commits that were reviewable on their own, and were net improvements on their own.

One negative consequence is having a hundred commits for doing it, yes.

* I don't feel a case has really been made for the value of using liba2i instead of the (admittedly far-smaller) API surface from libbsd, which shadow already depends on

The first issue with libbsd's APIs is strtou(3) is bogus on negative input. But I developed strtou_noneg() for that and I could have stopped there. Why I didn't?

I don't recall where specifically I said it, but the rationale was two reasons: type safety and typedefs.

typedefs:

Let's say you want to parse a foo_t (you don't know if it's signed or unsigned. Do you use strtoi(3), or strtou_noneg()?

You write a macro strton() that decides for you. But you need to pass the type to the macro, and the macro magic extracts the signedness from the TYPE.

I originally designed a macro that would be something like

     [u]intmax_t
     strton(TYPE, const char * restrict nptr, char ** restrict endptr, int base,
         [u]intmax_t lo, [u]intmax_t hi, int *rstatus);

But, then we come to type safety:

How do you guarantee that you passed the right type? Maybe I passed the wrong type.

I found that shadow already had the getlong() et al. functions. They gave me the idea of also passing the integer pointer to the macro, so that it would validate it against the TYPE.

That macro is a2i(). If you notice, a2i() has a prototype almost identical to strtoi(3bsd):

     int a2i( TYPE, TYPE *restrict n, QChar *s,
              QChar **_Nullable restrict endp, int base,
              TYPE min, TYPE max);
     intmax_t
     strtoi(const char * restrict nptr, char ** restrict endptr, int base,
         intmax_t lo, intmax_t hi, int *rstatus);

I added the first parameter for getting the type, and the second one for validating it. (I could have omitted the first one, but I prefer magic macros to be explicit; that lets readers know that the macro is rather magic.)

Once you pass the integer pointer, you need overloads for all basic integer types, and hence all the funny a2sll()-like names.

While doing that, I also decided to drop the non-standard rstatus code, since I could just use errno and return -1, which is more standard.

Sounds good?

I'm trying to say that I don't want to litigate the state of the C standard library. I'm saying that there should be a great deal of care in modifying a critical project, and in particular, due regard must be given to:

* Ease of review

* Maintainability over personal preference

* Appearances, too!

With particular regard to review, I'll note that the kernel uses coccinelle to great effect here. It allows reproducing large diffs, it allows clearly seeing the intent and effect of a change, and it also allows debate & review of the change itself to be done on a concrete description of the transformation.

Explaining how to reproduce your changes and how you made the transformations would be good.

It was all manually crafted with care. Nothing automated. I couldn't really explain much more than the diffs themselves. That's why I tried to do baby steps with intermediate functions.

I basically stumbled upon horrible code, then try to abstract a little bit from it, commit, rinse and repeat.

In some cases I found that it wasn't possible to abstract it because the code made no sense (it looked buggy) --here's one case where I'm still unsure of how I should abstract it: https://github.com/shadow-maint/shadow/issues/1041--, and then tried to guess what the author wanted to write instead. I've tried to highlight those during review (if the fix was more-or-less obvious) or in a bug report (if it wasn't obvious). In some very trivial cases I didn't report it, because there were so many bugs that it didn't scale well.

I think the library is the smallest change. Yeah, it has the build system, and packaging cost, but then you get the reward in the form of being allowed to call those APIs from any other projects. Since the only dependency of liba2i is libc, it should be usable almost everywhere.

But how stable are they? I get impression they aren't at all.

They are. I haven't modified the API at all since I wrote liba2i. While the implementation has changed to make it simpler, the API is intact.

And I've finished doing churn to string-to-numeric parsing in shadow. I don't think there's anything else that can be improved in that regard. (Except that shadow is still ignoring error codes all over the place, and I may need to check those. But that's not a melon I want to open now. Same thing with string truncation; there's a lot of it, and should be audited and fixed. I've provided the tools already, and now need some rest before starting part 2, which will be actually listening to those error codes.)

The library has the lowest cost, and the highest reward. It was indeed when writing my library that the strtoi(3) bug was found. If I had kept it just as a shadow internal library, we would have never found that, and you could be arguing that it was indeed a NIH syndrome by not using libbsd's strtoi(3). Now we know that libbsd's strtoi(3) wasn't as good as it sounded (and anyway, it's not good for typedefs).

I don't really know what to say there, there is value in rewriting things to understand how they work. It doesn't mean you should then make everything use that.

I would recommend you have a look at it. The main reasons are type safety and typedefs, as I said above. Of course, it's opt-in, but if I don't provide the library, you don't have the choice.

I have significantly reduced that since you last looked at it, I think.

That's good news, thank you.

:-)

I think I've been frustrated by trying to review massive diffs between releases in shadow and man-pages. I'm sorry if I've been rude to you and it wasn't intentional.

Thanks! No problem.

alejandro-colomar commented 1 month ago

Here's an incomplete list of reasons to avoid strtol(3). These are several obscure bugs found and fixed in a single gnulib call to strtol(3) in the last two days: