DevSolar / pdclib

The Public Domain C Library
https://pdclib.rootdirectory.de
Creative Commons Zero v1.0 Universal
224 stars 41 forks source link

Public strtok_r #4

Closed Luca1991 closed 5 years ago

Luca1991 commented 5 years ago

Hello, In one of the project that I've been involved, I had the need to use strtok_r. Since this function is private in pdclib, I had to use the one provided by Charlie Gordon (link for reference: https://groups.google.com/forum/#!msg/comp.lang.c/akY2fEnquSU/hGZkhrvssSoJ).

I'm opening this issue to ask about the possibility to have a public strtok_r in pdclib, so we can use the internal one.

Thanks

PS: adding @JayFoxRox to this ticket because we were following this issue toghether.

JayFoxRox commented 5 years ago

I actually made some other recommendation (against public strtok_r), because pdclib README says:

This is a C Standard Library. Nothing more, nothing less. No POSIX or other extensions, just what's defined in ISO/IEC 9899.

Here's my original suggestion: https://github.com/Cxbx-Reloaded/xbox_kernel_test_suite/pull/58#discussion_r285334612 (introducing it with a prefix, so it's hidden under most circumstances)

This is important, as to not conflict with other POSIX layers. - It might still be tricky to properly integrate it into header files (as one would have to wrap pdclib headers).

DevSolar commented 5 years ago

Turns out this whole issue (strtok*()) is a bit of a mess:

Naturally, I would prefer to go with the standard's strtok_s, and probably will for sheer stubbornness ;-) but I am afraid it won't satisfy anybody (no strtok_r, conflicting strtok_s)...

Any thoughts on this? Would you be satisfied with a standard strtok_s (to which a POSIX-style strtok_r wrapper could be conceived)?

JayFoxRox commented 5 years ago

C11 strtok_s sounds fine with me. I wasn't even aware it existed. I'm also against mixing POSIX or Windows API in this upstream libc.

strtok within pdclib can then be implemented by calling strtok_s internally (or a helper; as strtok_s should probably only exist for C11 and after). And strtok_r for POSIX can be layered as part of other projects (if necessary).

However, I wonder how to avoid conflicts with Windows strtok_s. This is also relevant for our original Xbox toolchain, as we also mimic Windows. Do we know how Microsoft or Clang for Windows deal with this?

DevSolar commented 5 years ago

The Annex K functions will only be declared if __STDC_WANT_LIB_EXT1__ is defined prior to including the standard headers. I guess Windows headers will just croak if it's defined, as Windows has several *_s() variants of standard functions in its API, and I don't think all of them match C11 definitions...

Handling that __STDC_WANT_LIB_EXT1__ -- and the standard's requirements for calling a constraint violation handler function instead of invoking UB if preconditions aren't met -- requires a bit of plumbing on the libc side, which I have already added to PDCLib.

I then wanted to implement strtok_s, looked for a reference implementation to test against, and found there isn't one (because, as I said, both POSIX and WinAPI disagree on parameters). At which point I shelved it for the day and went back to preparing my upcoming RPG campaign until inspiration strikes or one of you guys buggers me about it. Which just happened. ;-)

I could provide the "worker" function (with C11 strtok_s() functionality) as _PDCLIB_strtok(), and provide a C99 strtok() wrapper for that always, plus a C11 strtok_s() wrapper if __STDC_WANT_LIB_EXT1__ was defined.

This would give you the option to have a POSIX strtok_r() wrapper that accesses _PDCLIB_strtok(). How does that sound?

DevSolar commented 5 years ago

So... I commited an implementation of _PDCLIB_strtok(), which is referenced by both the (old) strtok() and (new) strtok_s().

The way to get at strtok_s():

#define __STDC_WANT_LIB_EXT1__ 1
#include <string.h>

Note that, in a translation unit, the define has to come before any includes of standard headers, as the definition is required to be consistent for all standard headers affected by it. I.e., you may not #include <stdlib.h> with no definition and #include <string.h> with definition. PDCLib will issue an #error diagnostic in this case.

strtok_r can be implemented like this:

#define __STDC_WANT_LIB_EXT1__
#include <string.h>

char * strtok_r( char * _PDCLIB_restrict s1, const char * _PDCLIB_restrict s2, char ** _PDCLIB_restrict ptr )
{
    rsize_t max;

    if ( s1 == NULL )
    {
        max = strlen( *ptr );
    }
    else
    {
        max = strlen( s1 );
    }

    return _PDCLIB_strtok( s1, &max, s2, ptr );
}

The additional strlen()s are a bit unfortunate, and right now the _PDCLIB_strtok() backend doesn't even "do" anything with the parameter, but I guess I will add appropriate bound checks to _PDCLIB_strtok() eventually (as they are required by the standard).

Using C11's strtok_s() instead of the strtok_r() above will save you those extra strlen()s, as the C11 parameter is preserved between calls the same way ptr is.

Does this solve the issue to your satisfaction?

Luca1991 commented 5 years ago

I think this should be perfect.

@JayFoxRox what do you think about it? Can I close this issue?

Thanks

JayFoxRox commented 5 years ago

Thanks for looking into it that quickly @DevSolar .

Regarding what @Luca1991 asked:

what do you think about it?

I'm not responsible for that part of our downstream toolchain. I'm not an expert about strtok either.

But skimming over it, the integration appears to be a good solution.

Can I close this issue?

You are the user that has to use it in their application (and owner of this issue): It's up to you.

(Worst case would be that it has to be re-opened or a new issue has to be created)

DevSolar commented 5 years ago

Closing this for now. Feel free to re-open if there should be any issues with the implementation.

Sidenote, I'm in the process of "polishing" this. Adding documentation for strtok_s() to string.h, adding the required-by-standard calling of the constraint handler function in case of runtime constraint violation etc.; meaning that, while _PDCLIB_strtok() should satisfy you already, it doesn't satisfy me yet as it's not yet up to C11 specs. Just so that you know. ;-)