This is what you proposed previously (from a closed PR):
struct stackalloc_vfs {
    void *(*alloc)(struct stackalloc_vfs *vfs, size_t len);
    int (*free)(struct stackalloc_vfs *vfs, void *ptr, size_t len);
};
DILL_EXPORT extern const struct stackalloc_vfs *stackalloc_default;
#define go_stack(fn, ptr, len) ...
#define go_alloc(fn, alloc, len) ...
#define go(fn) go_alloc(fn, stackalloc_default, 256 * 1024)
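For illustration, a rough sketch of how these proposed macros would be used from application code; worker is a made-up coroutine and the sizes are arbitrary:
#include <libdill.h>
#include <stdio.h>

coroutine void worker(int i) { printf("worker %d\n", i); }

int main(void) {
    char stk[16384];
    int h1 = go_stack(worker(1), stk, sizeof(stk)); /* caller-provided stack */
    int h2 = go(worker(2));                         /* default allocator, 256 kB stack */
    /* ... */
    hclose(h2);
    hclose(h1);
    return 0;
}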
I think that could still work. We'd need to tell the user that allocating many different stack sizes is inefficient and should not be done with the cached allocator (it creates a new cache per size). Two or three different sizes are fine.
However, this does require some form of lookup between different caches... so perhaps having the allocator be constructed with its size would be better.
struct stackalloc_vfs {
    void *(*alloc)(struct stackalloc_vfs *vfs);
    int (*free)(struct stackalloc_vfs *vfs, void *ptr);
};
DILL_EXPORT extern const struct stackalloc_vfs *stackalloc_default_256;
DILL_EXPORT extern const struct stackalloc_vfs *stackalloc_8;
DILL_EXPORT extern const struct stackalloc_vfs *stackalloc_64;
#define go_stack(fn, ptr, len) ...
#define go_alloc(fn, alloc) ...
#define go(fn) go_alloc(fn, stackalloc_default_256)
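As a rough sketch of what a user-supplied allocator could look like under this second interface (the malloc-backed instance and the container struct are illustrative, not part of the proposal):
#include <stdlib.h>

struct malloc_stackalloc {
    struct stackalloc_vfs vfs;  /* must be first so the vfs pointer can be cast back */
    size_t size;                /* stack size baked into this allocator instance */
};

static void *malloc_stack_alloc(struct stackalloc_vfs *vfs) {
    struct malloc_stackalloc *self = (struct malloc_stackalloc*)vfs;
    return malloc(self->size);
}

static int malloc_stack_free(struct stackalloc_vfs *vfs, void *ptr) {
    (void)vfs;  /* unused in this simple sketch */
    free(ptr);
    return 0;
}

static struct malloc_stackalloc stackalloc_malloc_64 = {
    {malloc_stack_alloc, malloc_stack_free}, 64 * 1024};

/* usage: int h = go_alloc(worker(1), &stackalloc_malloc_64.vfs); */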
And an additional point from the other issue:
The chstorage size is the size used on x86-64; should we optimize for specific platforms and fall back to 72? (effort of looking at each platform - I can do this when I have a few spare moments to set up my SBCs) Or should we have a runtime query interface? (added API complexity)
And adding on to the topic of memory allocation: the same applies to channel allocation at any rate, and it could be extended to handles:
struct alloc_vfs {
    void *(*alloc)(struct alloc_vfs *vfs);
    int (*free)(struct alloc_vfs *vfs, void *ptr);
};
DILL_EXPORT extern const struct alloc_vfs *guardalloc_default_256;
DILL_EXPORT extern const struct alloc_vfs *guardalloc_8;
DILL_EXPORT extern const struct alloc_vfs *guardalloc_64;
DILL_EXPORT extern const struct alloc_vfs *chanalloc;
#define go_stack(fn, ptr, len) ...
#define go_alloc(fn, alloc) ...
#define go(fn) go_alloc(fn, guardalloc_default_256)
DILL_EXPORT int chmake_s(size_t itemsz, const struct alloc_vfs *alloc);
#define chmake(itemsz) chmake_s(itemsz, chanalloc)
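Intended usage would then look something like this (my_chanalloc stands in for a hypothetical user-supplied alloc_vfs instance):
int ch = chmake(sizeof(int));                  /* storage comes from the default chanalloc */
int ch2 = chmake_s(sizeof(int), my_chanalloc); /* storage comes from a custom allocator */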
As for memory allocators, it occurred to me that you can create new ones without any extra APIs, just using hmake and go_stack. Something along these lines:
#define mygo(fn) ({\
    void *stk = malloc(10000);\
    int h = myalloc_make();                /* wrapper handle that owns stk */\
    int u = go_stack(fn, stk, 10000);      /* coroutine running on stk */\
    myalloc_set_underlying_handle(h, u);\
    h;\
})
It's a bit convoluted, but it's not like writing custom allocators is a common use case, so it doesn't matter that much. Or can we do better without introducing too much API complexity?
The chstorage size is the size used on x86-64; should we optimize for specific platforms and fall back to 72? (effort of looking at each platform
Possibly. But I don't like the idea of determining the sizes using autoconf. I would prefer the code not to be too coupled with the build system.
Ad generic memory allocators: IIRC alloc libraries (tcmalloc and friends) operate via overriding malloc/free symbols. No need for any special support from the user.
Sorry for the late reply.
As for memory allocators, it occurred to me that you can create new ones without any extra APIs, just using hmake and go_stack
Part of the reason was so that the allocated memory could be automatically freed on hclose.
Ad generic memory allocators: IIRC alloc libraries (tcmalloc and friends) operate via overriding malloc/free symbols. No need for any special support from the user.
Yeah, that's true.
Possibly. But I don't like the idea of determining the sizes using autoconf. I would prefer the code not to be too coupled with the build system.
I thought it could be easily determined from the build system, but I was wrong. Also, it makes it difficult to cross-compile. I think just listing the platforms should be sufficient.
Part of the reason was so that the allocated memory could be automatically freed on hclose.
Yes, that's why you need an additional handle. When the handle is closed the underlying coroutine handle is closed and the memory is deallocated.
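To make that concrete, here is a rough sketch of such a wrapper handle built on libdill's hvfs/hmake interface. Treat the struct hvfs layout and the field names used below as assumptions; they vary between libdill versions.
#include <errno.h>
#include <stdlib.h>
#include <libdill.h>

struct stack_handle {
    struct hvfs hvfs;  /* must be first; hmake() operates on this */
    int u;             /* underlying coroutine handle */
    void *stk;         /* heap-allocated stack to release on close */
};

static void *stack_handle_query(struct hvfs *hvfs, const void *type) {
    (void)hvfs; (void)type;
    errno = ENOTSUP;
    return NULL;
}

static void stack_handle_close(struct hvfs *hvfs) {
    struct stack_handle *self = (struct stack_handle*)hvfs;
    hclose(self->u);   /* cancel the coroutine first... */
    free(self->stk);   /* ...then its stack can be safely released */
    free(self);
}

#define mygo(fn) ({\
    struct stack_handle *self = calloc(1, sizeof(struct stack_handle));\
    self->hvfs.query = stack_handle_query;\
    self->hvfs.close = stack_handle_close;\
    self->stk = malloc(10000);\
    self->u = go_stack(fn, self->stk, 10000);\
    hmake(&self->hvfs);\
})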
Oops, yeah that makes sense. Your approach feels cleaner in some respects. Let's go with that for now?
Done.
I've done some API cleanup and I think we are ready to move out of beta.
What may be missing:
Thoughts? Objections?