vorner / bumpalo-herd

Trying to create Sync bump allocator
Apache License 2.0

Couple of ideas #13

Open AndreiPashkin opened 5 months ago

AndreiPashkin commented 5 months ago

First of all I'd like to say that it is a great crate and it is very relevant to my work.

While working with it I came across a couple of ideas; I'm not sure whether @vorner would find them reasonable:

  1. Add a get_by_id() method that returns a Bump attached to some kind of resource, for example a thread. Members acquired this way would not be shared between different resources, only within the one resource designated by a user-provided ID (thread ID, worker ID, coroutine ID, whatever).
  2. Add a reset_unsafe() method so that a user who really knows what they are doing can reset an individual member. This is meant to work together with the method above.

vorner commented 5 months ago

Hello

I probably don't understand your use case or what the advantage is. It also looks like it would increase the complexity of the API, which I'm reluctant to do without a good reason.

Can you explain more?

AndreiPashkin commented 5 months ago

@vorner, my use case is that I'm building a small component that fetches data from the datastore, quickly processes it, and passes it down the pipeline and eventually to the network. I want the memory of the output data (which is immutable) to stay unchanged after my component returns, until the same thread gets a new task, so that the rest of the processing pipeline can work with the data returned by my component without it suddenly changing. This is to avoid re-allocation.

AndreiPashkin commented 5 months ago

@vorner, I'm also a bit puzzled about how to deal with bumpalo::collections. They expect a bumpalo::Bump for their allocations, but that bypasses the lifetime extension magic in bumpalo_herd::Member. As a result, the lifetime of a bumpalo::collections value becomes tied to the bumpalo_herd::Member instance instead of the bumpalo_herd::Herd. Could you tell me whether I'm doing something wrong here?

vorner commented 5 months ago

I'm still not sure what you're doing; I can't seem to get it from your description. Can you maybe share a code example demonstrating it?

As for the collections, it's possible they are newer than when I last touched bumpalo-herd 😇. I'm not sure what to do about them off the top of my head.

AndreiPashkin commented 5 months ago

Yeah, I'll come up with a minimal example today and post it.

AndreiPashkin commented 5 months ago

@vorner, here is an example with bumpalo::collections that I was talking about: https://replit.com/@AndreiPashkin/FrankMuddyDisassembly

The problem is that the Bump produced by Member::as_bump() has its lifetime tied to the current scope, not to the Herd instance.

I wonder if you could suggest some workarounds?

vorner commented 4 months ago

I'm going over the example and the docs (both of bumpalo and bumpalo-herd). I'm starting to think that the collections and herd are incompatible in principle.

What Herd does is hand out Bump allocators and take them back once the Member is dropped. It can then give them to some other thread. This is fine for one-off allocations that produce, for example, slices: once the slice is allocated, it is fine for the Bump to move to another thread and do more allocations there.
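
The hand-out and take-back cycle described above can be modelled with the standard library alone. The following is only a toy sketch under stated assumptions: ToyArena, ToyHerd, ToyMember and sequential_threads are hypothetical names invented for illustration, the "arena" merely counts bytes, and none of this is how bumpalo-herd is actually implemented. It shows a second thread inheriting the arena that the first thread gave back.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Mutex;
use std::thread;

/// Toy stand-in for a bump arena: it only tracks an ID and a byte count.
struct ToyArena {
    id: usize,
    used: usize,
}

/// Toy stand-in for a herd: a pool of idle arenas shared between threads.
#[derive(Default)]
struct ToyHerd {
    idle: Mutex<Vec<ToyArena>>,
    created: AtomicUsize,
}

/// Exclusive handle to one arena; the arena goes back to the pool on drop.
struct ToyMember<'h> {
    herd: &'h ToyHerd,
    arena: Option<ToyArena>,
}

impl ToyHerd {
    /// Reuse an idle arena if there is one, otherwise create a fresh one.
    fn get(&self) -> ToyMember<'_> {
        let reused = self.idle.lock().unwrap().pop();
        let arena = reused.unwrap_or_else(|| ToyArena {
            id: self.created.fetch_add(1, Ordering::Relaxed),
            used: 0,
        });
        ToyMember { herd: self, arena: Some(arena) }
    }
}

impl ToyMember<'_> {
    /// Pretend to allocate `bytes`; returns (arena id, total bytes used so far).
    fn alloc(&mut self, bytes: usize) -> (usize, usize) {
        let arena = self.arena.as_mut().expect("arena is present until drop");
        arena.used += bytes;
        (arena.id, arena.used)
    }
}

impl Drop for ToyMember<'_> {
    /// Hand the arena back so that another thread can pick it up later.
    fn drop(&mut self) {
        if let Some(arena) = self.arena.take() {
            self.herd.idle.lock().unwrap().push(arena);
        }
    }
}

/// Two threads run one after the other; the second one inherits the arena
/// that the first one gave back.
fn sequential_threads(herd: &ToyHerd) -> ((usize, usize), (usize, usize)) {
    let first = thread::scope(|s| s.spawn(|| herd.get().alloc(16)).join().unwrap());
    let second = thread::scope(|s| s.spawn(|| herd.get().alloc(8)).join().unwrap());
    (first, second)
}

fn main() {
    let herd = ToyHerd::default();
    let (first, second) = sequential_threads(&herd);
    println!("first thread:  arena {} with {} bytes used", first.0, first.1);
    println!("second thread: arena {} with {} bytes used", second.0, second.1);
}
```

In this model the second thread keeps adding to the same arena, which is harmless only because the first thread no longer holds anything that could allocate from it.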

But the String holds a reference to the Bump, because it might need to make further allocations when it grows. That would be problematic, because two threads could then try to allocate from the same Bump at the same time, and Bump is not capable of that.

So I'm wondering about a new method on Member (maybe called something like steal_bump). It would give out a Bump with the 'h lifetime (i.e. that of the Herd, not of the Member, similar to how all the allocation methods work). Furthermore, it would mark the Bump so that the Herd would not give it out to other threads; it would only become "clean" again once reset is called.

Another option is to use the String inside the thread, but then allocate the final content again from the Member using alloc_str. That would "freeze" the string: it could no longer grow, but it would outlive the Member. The downside is that it would waste more memory.
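
The "freeze" workaround can be sketched as follows. This is a std-only toy model, not bumpalo-herd itself: ToyHerd, ToyMember, with_capacity and build_greeting are hypothetical names, the herd is just a fixed set of write-once slots, and a plain std String stands in for the growable bump-allocated string. Only the lifetime shape is meant to match the description: the copy borrows the herd ('h), so it outlives both the growable string and the member.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

/// Toy herd: a fixed number of write-once slots that live as long as the herd.
struct ToyHerd {
    slots: Vec<OnceLock<String>>,
    next: AtomicUsize,
}

/// Toy member: a short-lived handle that allocates into the herd's storage.
struct ToyMember<'h> {
    herd: &'h ToyHerd,
}

impl ToyHerd {
    fn with_capacity(n: usize) -> Self {
        ToyHerd {
            slots: (0..n).map(|_| OnceLock::new()).collect(),
            next: AtomicUsize::new(0),
        }
    }

    fn get(&self) -> ToyMember<'_> {
        ToyMember { herd: self }
    }
}

impl<'h> ToyMember<'h> {
    /// Copy `src` into the herd. The result borrows the herd ('h), not the
    /// member. Panics when the toy herd runs out of slots.
    fn alloc_str(&self, src: &str) -> &'h str {
        let herd: &'h ToyHerd = self.herd;
        let index = herd.next.fetch_add(1, Ordering::Relaxed);
        herd.slots[index].get_or_init(|| src.to_owned()).as_str()
    }
}

/// Grow a string locally, then freeze its final content into the herd.
fn build_greeting<'h>(herd: &'h ToyHerd, names: &[&str]) -> &'h str {
    let member = herd.get();
    // Stand-in for a growable, bump-allocated string tied to `member`.
    let mut scratch = String::from("hello");
    for name in names {
        scratch.push(' ');
        scratch.push_str(name);
    }
    // The frozen copy outlives both `scratch` and `member`.
    member.alloc_str(&scratch)
}

fn main() {
    let herd = ToyHerd::with_capacity(4);
    let frozen = build_greeting(&herd, &["herd", "member"]);
    println!("{frozen}");
}
```

The extra memory cost mentioned above shows up here as the second copy of the text: the content exists once in the scratch string and once in the herd.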

AndreiPashkin commented 4 months ago

@vorner, it would be nice if bumpalo used some kind of trait instead of a concrete type, so that third-party library authors could write their own Bump implementations.

AndreiPashkin commented 4 months ago

@vorner, another way of doing this that I see is to make versions of steal_bump() and reset() that take a specific task ID, so that you could release only the memory allocated within a particular concurrent task (e.g. a thread).

vorner commented 4 months ago

Well, for one, Herd tries to do things the simple way and not deal with any IDs (neither on the outside nor on the inside).

For another, you probably won't get a nice API if you start dealing with IDs, and I'm not sure the borrow checker would stop you from doing dangerous things with that kind of API.

AndreiPashkin commented 4 months ago

@vorner, for now I don't see good solutions either (except convincing the bumpalo maintainers to allow custom Bump implementations, etc.). I'm just speculating.