yahoo / mdbm

MDBM a very fast memory-mapped key/value store.
BSD 3-Clause "New" or "Revised" License
992 stars 111 forks source link

Support for building on OSX #36

Closed eam closed 9 years ago

eam commented 9 years ago
yahoocla commented 9 years ago

CLA is valid!

timrc-git commented 9 years ago

Hi Evan-

First, thanks for the patch!

I'm going to separate it out into makefile changes (can't believe /bin/true failed me ;) ), and the more substantial code changes. I'll post to a branch, and have you test.

On a separate note, I don't think we can rely on pthread_self() alone. On linux, it's not unique across processes. Allen tested on Mac, and it seems to have the same issue. We can probably use a hybrid of pid and pthread_self.

Thanks!

.timrc

areese commented 9 years ago

Here is my terrible test code:

include

include

include

include <sys/wait.h>

void print(void *p) { int i=(int)p; printf ("[%d] pid[%d] self 0x%x\n", i,getpid(), pthread_self()); }

void thread(int i) { pthread_t t; pthread_attr_t attr; pthread_attr_init(&attr); //pthread_attr_setdetachstate(&attr, 1); pthread_create(&t, &attr, print, i); }

int main(int argc, char **argv) {

print(0);
pid_t child[10];
for (int i=1;i<10;i++) {
    pid_t c = fork();
    if (0 == c) {
        for (int j=1;j<10;j++){
            thread(i*j*10);
        }
        // sleep(10);
        return 0;
    } else {
        print((void*)i);
        child[i]=c;
    }
}
        sleep(1);

for (int i=0;i<10;i++) {
    waitpid(child[i], NULL, 0);
}

}

eam commented 9 years ago

Excellent, glad this is useful. Good catch re: pthread_self(). The other odd thing I haven't figured out yet is the linker paths in the generated objects -- OSX linker appears to embed a path instead of an SONAME for dynamic dependencies:

booger:mdbm evan$ otool -L src/tools/object/mdbm_stat
src/tools/object/mdbm_stat:
    /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1197.1.1)
    /usr/lib/libcrypto.0.9.8.dylib (compatibility version 0.9.8, current version 0.9.8)
    /usr/lib/libc++.1.dylib (compatibility version 1.0.0, current version 120.0.0)
    object/libmdbm.so (compatibility version 0.0.0, current version 0.0.0)

note the object/, I believe convention is to give the linker the fully qualified destination path.

I'm also not sure if a .dylib should be built on OSX instead of a .so. The currently produced objects do seem to work well enough. I've been using them successfully to natively build a few extensions for other languages without spinning up a VM.

areese commented 9 years ago

you should be using a .dylib on OS X, not a .so

this confused me the first time I was playing around

timrc-git commented 9 years ago

Evan and Allen-

I've mangled most of Evan's changes into the 'osx' branch. There were a number of things that were crufty/redundant/silly in the MDBM sources, so I tried to simplify as much as possible. Some things, I'd still like to change (like making Windowed mode disabled from a flag), but I'll hold off on those for now.

A few things remain to be done: Working out a version of gettid(). Picking a fast time function. Any dylib changes. Allen's java changes. Alternatives for O_DIRECT: fcntl() and F_NOCACHE, if needed.

For the time function, I ran across: http://stackoverflow.com/questions/5167269/clock-gettime-alternative-in-mac-os-x BSD gettimeofday was horribly slow, but it looks like the OSX one is not so bad.

Please take a look at the branch, and weigh in with any suggestions for the remaining issues. Thanks!

.timrc

timrc-git commented 9 years ago

Oh also, do you know why the strdup() was necessary in mash? It feels like a leak... was it just a const-cast issue?

eam commented 9 years ago

Are you perchance using a homebrew gcc rather than the Apple provided clang? Or maybe an older clang? I encounter a bunch of compiler errors (many of which are back on the osx branch) -- the strdup was for one of them:

mash.cc:1002:24: error: conversion from string literal to 'char *' is deprecated [-Werror,-Wc++11-compat-deprecated-writable-strings]
    rl_readline_name = "mash";

It looked like the initialize was a one-time thing so I didn't bother tracking it.

The osx branch seems to be missing clock_gettime() for me. And is missing the static / non-static conflict in get_window_page():

mdbm.c:8257:1: error: static declaration of 'get_window_page' follows non-static declaration
get_window_page(MDBM* db, mdbm_page_t* page, int pagenum, int npages, uint32_t off, uint32_t len)
^
/Users/evan/git/mdbm/include/mdbm_internal.h:884:21: note: previous declaration is here
extern mdbm_page_t* get_window_page(MDBM* db,

And the bitmasking is to avoid:

test_lockv3.cc:724:46: error: cast from pointer to smaller type 'int' loses information
                     << " parent: child=" << int(tid)

I'm building with

host:~ $ cc -v
Apple LLVM version 6.0 (clang-600.0.56) (based on LLVM 3.5svn)
Target: x86_64-apple-darwin14.3.0
Thread model: posix
areese commented 9 years ago

Tim doesn't have a mac, so I get to go chase this down once I have a minute. (something that ends with 47 is currently in my way).

Do you want to open another PR against the os-x branch to fix up these issues, and then I'll try building it in the next few days?

timrc-git commented 9 years ago

Sorry, I probably wasn't clear. The things listed as "remaining to be done" are not in the osx branch yet. So in particular, I still need to add a time function (has to be fast or turning on stats will trash performance), and the thread-id function.

Looks like I had a couple other regressions though (my apologies). And as Allen mentioned, I don't currently have a Mac to test on.

timrc-git commented 9 years ago

Ok, I've patched the regressions you mentioned, and added your time functions. I also added the most plausible, but fast, gettid() function I could think of on short notice. Please let me know if this works now. I'm also curious how it performs for you. Thanks!

eam commented 9 years ago

Aha, gotcha. I will give it a shot and send any new PR against that branch later this evening.

areese commented 9 years ago

It shouldn't be complaining about the rl_readline: http://www.delorie.com/gnu/docs/readline/rlman_28.html

but I'd strdup it anyways, it's a leak of a single string in a command line utility. =))

timrc-git commented 9 years ago

You don't know just how much time I spent making the code valgrind-clean including all the command-line utilities that have tests (except for some STL and internal libc stuff that we don't control). Note: I already used a non-leaky alternative here (function static buffer instead of strdup).

eam commented 9 years ago

I figured out why clang is upset about readline -- OSX doesn't use gnu readline! It uses NetBSD's editline, which makes rl_readline a char*: http://cvsweb.netbsd.org/bsdweb.cgi/src/lib/libedit/readline/readline.h?rev=1.35&content-type=text/x-cvsweb-markup

areese commented 9 years ago

we had a lib that was doing this, and then stomping the contents of argv. since it's a char*, we should probably strdup on os-x, or if we don't have gnu readline.

eam commented 9 years ago

Closing in favor of branch osx