clarkgrubb / hyperpolyglot

hyperpolyglot.org
Other
479 stars 96 forks source link

Some nits on C stuff #10

Closed mstewartgallus closed 9 years ago

mstewartgallus commented 9 years ago

The first nit.

variable:

/* if inside function, memory allocated on stack: */
int i;
int j = 3;

/* memory allocated on heap: */
int *ptr = malloc(sizeof(int));
*ptr = 7;

It really should check for an out of memory error. Also, it is more idiomatic to do malloc(sizeof *ptr). But these are just nits.

uninitialized variable:

Stack variables and heap variables allocated with malloc have indeterminate values. Global and static variables and heap variables allocated with calloc are zero-initialized.

This is incorrect. Reading from uninitialized variables is undefined behaviour and can result in bad things like the compiler deleting code. Those variables aren't just indeterminately initialized. Also, a very tiny nitpick is that calloc zero initializes bytes but does not necessarily zero initialize the value. Of course, now a days it doesn't matter but it used to be the case that some platforms had null pointer representations that weren't represented by zero bytes.

type size:

type size   sizeof(int)

/* also expressions and values: */
sizeof(1 + 1)

A nitpick, sizeof is an operator and not a function. It does not require parentheses for expressions.

boolean type:

int

/* includes macros for consistency with C++: */
#include <stdbool.h>

bool

No, stdbool.h aren't just macros. All standard conforming C99 implementations implement a _Bool type which is entirely separate from int.

integer type:

signed char 1+ byte
short int 2+ bytes
int 2+ bytes
long int 4+ bytes
long long int 4+ bytes

This is platform specific. If you want exact sized types use stdint.h which gives you types like uint16_t.

float type:

float type  float 4 bytes
double 8 bytes
long double 16 bytes

This is platform specific and also incorrect. The old way of doing math on x86 platforms had 80 bit floating point registers instead of 64 bit floating registers so for higher performance instead of normalizing numbers to 64 bits after every operation the bits were kept in for extended precision. This leads to enormously confusing and weird results such that even expressions like a == a may fail (or more likely something like, a = b + c, a == b + c`).

random number:

/* Value between 0 and RAND_MAX inclusive: */
int n = rand();

/* Value in interval [0.0, 1.0]: */
float x = (float)rand()/(float)RAND_MAX;

Please don't recommend rand. Lots of implementations are crappy and rand is defined to be seeded with the exact same value at program start anyways.

string type:

char *
wchar_t *

The size of wchar_t is compiler dependent; gcc and clang use 32 bits;
msvc uses 16 bits.

This is false. This is platform dependent and not compiler dependent (also many compilers have options to change its size anyways). For example, may cross compiler to Windows that is based on GCC has wchar_t as 16 bits.

longjmp:

void
callee(jmp_buf env) {
  longjmp(env, 3);
  /* unreachable */
}

void
caller() {
  jmp_buf env;
  int rv;

  if ((rv = setjmp(env)) == 0) {
    /* rv is 0 */
    callee(env);
    /* unreachable */
  }
  /* rv is 3 */
}

Technically speaking this goes against the POSIX standard.

An application shall ensure that an invocation of setjmp() appears in one of the following contexts only:

The entire controlling expression of a selection or iteration
statement

One operand of a relational or equality operator with the other
operand an integral constant expression, with the resulting
expression being the entire controlling expression of a selection
or iteration statement

The operand of a unary '!' operator with the resulting expression
being the entire controlling expression of a selection or
iteration

The entire expression of an expression statement (possibly cast to
void)

If the invocation appears in any other context, the behavior is undefined.

dirname and basename:

#include <libgen.h>

/* return pointers to statically allocated memory
   which is overwritten by subsequent calls */
char *s1 = dirname("/etc/hosts");
char *s2 = basename("/etc/hosts");

This is actually the behaviour of the prestandardised basename and dirname. Under the POSIX standard dirname and basename may modify their arguments which is shitty but what it is.

absolute pathname:

#include <limits.h>

char buf[PATH_MAX];

if (realpath("..", buf) == NULL) {
  perror("realpath failed");
}
else {
  /* use buf */
}

The situation with realpath is fucked up. Filesystems are dynamically loadable in many systems so there usually isn't such a thing as a constant PATH_MAX value (many platforms define it but don't obey it in practise). What you should be able to do is just realpath(foo, 0) but that can be buggy on certain versions of Mac OS X so I don't know a real solution to this.

set signal handler:

#include <signal.h>

void
handle_signal(int signo) {
  switch(signo) {
  case SIGUSR1:
    puts("caught SIGUSR1");
    break;
  default:
    printf("unexpected signal: %s", strsignal(signo));
    break;
  }
}

/*2nd arg can also be SIG_IGN or SIG_DFL */
sig_t prev_handler = signal(SIGUSR1, &handle_signal);

if (prev_handler == SIG_ERR) {
  perror("signal failed");
  exit(1);
}

This signal handler is not async-signal safe. It also doesn't properly handle platforms that clobber the signal handler after it is used. It also clobbers errno. On POSIX systems you can save errno, use write and then restore errno but on other systems things are fucky.

conditional compilation;

#if defined(__WIN32)
  win32_prinft("%f\n", x);
#else
  printf("%f\n", x);
#endif

defined doesn't need parantheses.

i/o errors:

Functions return values such as EOF, NULL, or -1 to indicate error. In some cases errors are not distinguished from end-of-file. The functions ferror() and feof() can be used to test a file handle.

The type of error is stored in errno. strerror(errno) converts the errors code to a string and perror() writes its argument to stderr with sterror(errno).

Please recommend strerror_r instead of strerror. Also, functions sometimes return error codes directly.

struct initialization:

struct medal_count spain = { "Spain", 3, 7, 4};

struct medal_count france = {
  .gold = 8,
  .silver = 7,
  .bronze = 9,
  .country = "France"
};

You can also do france = (struct medal_count) { .gold = 8, .silver = 7, .bronze = 9, .country = "France" };. which the detailed notes for this entry seem to have missed.

clarkgrubb commented 9 years ago

Lots of good feedback. I've incorporated it into the sheet. Thanks.