OpenRC / openrc

The OpenRC init system
BSD 2-Clause "Simplified" License
1.48k stars 248 forks source link

improve string handling in OpenRC #207

Open williamh opened 6 years ago

williamh commented 6 years ago

This issue was prompted by #206.

This issue is being opened so we can have a place, not tied to a specific pull request, to discuss improvements to OpenRC's string handling. @wltjr @vapier I want to tag both of you because of the mentioned p/r, so we can be on the same page. @wltjr: I would like to know more about your p/r. Why do you feel that using those functions is always insecure?

To everyone on the bug, I have a concern of my own. We are relying on PATH_MAX and MAXPATHLEN to define the max length of paths, but these are actually not universally defined constants.

https://eklitzke.org/path-max-is-tricky https://www.gnu.org/software/hurd/hurd/porting/guidelines.html

I would like some input on the best way to deal with this.

Thanks,

William

vapier commented 6 years ago

yes, GNU/Hurd is a notable example that doesn't define any path constants. that said, mk/os_GNU.mk apparently has some fallbacks for that target, so it's already been ignored to a degree.

just switch everything to asprintf and there's no need to hardcode PATH_MAX or any other static length buffer on the stack when forming file system paths. it does mean heap usage increases and technically perf might not be as good, but i dobut these are code paths where those microseconds will be noticed outside of system noise.

i haven't audited every use of PATH_MAX, but from the ones i've glanced at, they're trivial to replace with asprintf, or PATH_MAX is the wrong define in the first place (like librc-daemon.c:pid_is_argv and supervise-daemon.c:child_process using PATH_MAX for the contents of "cmdline" which is not PATH_MAX in size by any means). those other cases would be better served with a util func that, given a path, would take care of dynamically allocating/extending a buffer and reading all of the contents in at once.

asprintf would also make it easy to fix all the missing error checks where we use snprintf today and don't look at the return value.

wltjr commented 6 years ago

@williamh as I understand it, unless for performance reasons, certain functions should be avoided in favor of safer alternatives. Any man page on them, most any documentation, best coding practices, and definitely secure coding practices. I provided a link that sums it up pretty nicely. That contains 4 other links for further reading.

I was just taking a look at OpenRC codebase. As part of my own coding practices, I use various tools to help me code better, and identify potential issues, or things I did not catch. I believe there are other more serious issues to address in the OpenRC codebase per those findings. I highly recommend using Coverity, clang scan-build, and/or Sonar. It seems each may pickup something the other missed. Sonar can be shared, so if you all log in there. I can give you access/control if interested. Just the same I have travis.yml files for Sonar and Coverity I can send over in a PR or can be pillaged.

After doing a scan or two of OpenRC code base. I was just getting my feet wet with a what I felt was a small contribution. To get a feel for the process and then move onto some of the other issues identified. Though I know from my own coding. There is different ways to address the various issues. I can appreciate rather than wanting to use my replacements to go in another direction.

I do not have any input or suggestions on the PATH_MAX issue. I face a similar issue in some of my own projects. Though at the moment less concerned about portability, so it is moot for me for now. That is good it was noticed, and to be discussed.

I think there are some other issues beyond strings and path to be addressed pointed out by Coverity, clang scan-build, and Sonar. I can address some of those. Though I have my own code to scan and fix. The first attempt at contribution wasn't so smooth or pleasant for something that was pretty minor IMHO. Given changes necessary for the other issues. You all may want to address it differently than I may in code.

wrt to asprintf, not all memory now is dynamic and will also require changes to ensure any allocated memory is free'd. For embedded and battery powered stuff I believe allocation also has a greater price. But that is likely moot for openrc. Though it is running all the time.

williamh commented 6 years ago

@vapier I like what I'm reading about asprintf. Which platforms have it? is it just Linux, or is it available on *bsd as well?

vapier commented 6 years ago

yes, bsd's have asprintf, as do all reasonable linux C libs (glibc/uclibc/musl/etc...). if need be, defining a fallback is trivial.

in fact, here you go. this took me <5 min to write up.

int asprintf(char **strp, const char *fmt, ...) {
  va_list ap;
  int len, memlen;
  char *ret;

  /* Start with a buffer size that should cover the vast majority of uses (path construction). */
  memlen = 4096;
  ret = malloc(memlen);
  if (!ret)
    return -1;

  va_start(ap, fmt);
  len = vsnprintf(ret, memlen, fmt, ap);
  va_end(ap);
  if (len >= memlen) {
    /* Output was truncated, so increase buffer to exactly what we need. */
    memlen = len + 1;
    ret = realloc(ret, memlen);
    if (!ret)
      return -1;
    va_start(ap, fmt);
    len = vsnprintf(ret, len + 1, fmt, ap);
    va_end(ap);
    if (len >= memlen) {
       /* Give up! */
      free(ret);
      return -1;
    }
  }
  if (len < 0) {
    free(ret);
    return -1;
  }

  *strp = ret;
  return len;
}
williamh commented 6 years ago

@vapier heh, I just tested, and it looks like you have to define _GNU_SOURCE to get it, so I'll probably put your fallback in the source and use our xmalloc helper instead of malloc.

williamh commented 6 years ago

After looking at this, I'll probably not use the xmalloc helper after all.

williamh commented 6 years ago

@vapier I was able to come up with a version of the function you entered above which uses our allocators, xmalloc and xrealloc, to handle memory allocation. I named the function xasprintf, because it will need to be abailable everywhere and the name needs to be unique to us. Tell me what you think before I start migrating to it.

@wltjr, what are your thoughts as well?

vapier commented 6 years ago

yeah, _GNU_SOURCE is required on Linux libs to get the prototype because it's not in POSIX. i don't know about the BSDs, but FreeBSD doesn't seem to.

if it uses xmalloc/xrealloc (which die on failure), then calling it xasprintf sounds fine. that'd further simplify/guarantee correct behavior at runtime. it's also a common idiom in other code bases.

i wouldn't put it in a header though ... the func isn't exactly small, so would be better as an external func that gets compiled once.

williamh commented 6 years ago

@vapier Sure, I agree that there should be a better way to handle the functions in helpers.h than compiling them multiple times, but I'm not quite sure how to do that with the current build system. Once we port to Meson (after some bugs there are fixed in a release), I think that will be a lot easier to deal with. Right now, src/includes is the only spot in the tree where common files (files common to librc, libeinfo and the OpenRC binaries) are stored.

wltjr commented 6 years ago

@williamh my only input is make sure any memory is free'd. Some things will changed from fixed to dynamic. That may require further changes. The function itself is fine. Though if you would like something similar for reference. mstdlib provides similar functions which should be good across a variety of platforms. A bit hard to follow since its using lots of their own functions. Not suggesting to use that library. Just one I know that covers such.

Minor thing, may want to initialize vars, even numeric

va_list ap = NULL;
int len = 0;
int memlen = 0;
char *ret = NULL;

Regarding build, if you switch to Meson or other, or regardless. May want to consider cleaning up root directory. Seems many things could go into a more common data/ folder, like assets. Stuff under src/ is fine it is the rest that seems bit messy in root. Just a pet peeve of mine.

Probably best to create a new directory like src/common or something. Then update build accordingly.

williamh commented 6 years ago

@wltjr yeah, I know the root of the tree has a lot of things under it that should be re-arranged. Actually most of them would fit nicely under etc because that's where they land when you install.

williamh commented 6 years ago

@vapier Using xmalloc and xrealloc doesn't remove the need to check the return code as far as I can tell. Does it make sense to go further in xasprintf and fail if the return code is going to be -1?

wltjr commented 6 years ago

@williamh even if you create etc dir, I think its best to go under data. I do that for jem. For entrance I just have Meson put the files under etc, its only 2 files, so need need for another subdir. I think openrc has more files so likely would have many dirs under data, like etc, init.d, local.d, man, and others. Basically same as they exist now just moved into data. Leaving the other stuff like src, mk, pkgconfig, test, and maybe some others.

vapier commented 6 years ago

the build system doesn't really matter to the source code layout. do it like rc_getline, and add a #define xasprintf rc_asprintf if you want. semi-related, should just delete rc_getline now ... been a decade at this point.

the only realistic failure modes of asprintf are memory allocation which are handled by xmalloc/xrealloc. for the unlikely edge cases, just add a failure/exit case. that's the entire point of x* funcs -- they, by definition, are guaranteed to succeed or exit/abort.

as for the code, the vars are already initialized correctly, and doing something like ap = NULL not only makes no sense, it's invalid. that's simply not how stdargs works, and never have worked.