e-kayrakli commented 7 years ago

An issue to keep track of API discussion

Standard must-haves

proc Queue(type eltType)
- an optional size argument?
- some param/const flags?
- an optional "targetLocales"?
proc enqueue(elt :eltType)
- possibly a vararg implementation
proc dequeue(): (bool, eltType)
proc clear()
proc length
proc size (equivalent to length)
proc isEmtpy

Some brainstorming:

Utilities:

iter these()
- Most likely the whole 4 iterators: sequential, leader, follower, standalone
proc copy() -- or clone, something for deep-copy

Handling asynchrony:

proc sync()
proc asyncEnqueue(elt :eltType)
proc asyncDequeue(elt :eltType): (bool, eltType)
- Maybe you can allow both synchronous and asynchronous operations to the queue? Then you'd need something like these. If the queue doesn't allow asynchrony due to some configuration parameters (likely params), you can throw a compiler warning in these functions and fallback to regular enqueue/dequeue

Bulk operations:

proc enqueueBulk(elt: [] eltType)
proc dequeueBulk(numElems): (bool, [] eltType)
proc concat(otherQueue)
- Obviously the same thing as enqueueBulk, you can pick one or have both signatures available
proc +=(q: Queue, elts)
proc +=(q: Queue, otherQueue)
- Such operators are supported in sparse and associative domains

Bounded queue:

proc capacity
proc requestCapacity(newCap)
proc isFull

e-kayrakli commented 7 years ago

Sorry for the lack of nice formatting, just wanted to capture some ideas quickly.

You can add new ones or argue about existing ones.

LouisJenkinsCS commented 7 years ago

Queue Specification

I am in favor of having a specification in terms of functionality, but not everything can be standardized, especially configuration and whatnot, so I believe we should have a 'Factory' of sorts which handles configuration and deciding which queue to use; as well, the actual queue should be transparent to the user. If the user wants more control, they can create a queue without needing the factory.

Configurations

Not all of them are applicable to all types of queues. I believe we probably should have a configuration object (as some kind of record...) if we're going for a productivity setup. This avoids having super-long constructors, and even allows reusing configurations and providing some basic/bare-bones configurations. Seriously, with how many diverse queues that we're making, the constructor could be 20 - 25 arguments long (and quite a few won't apply to all).

The type could be something like...

// Determines the 'FIFO' level of the queue. STRICT being that it must be as
// FIFO as possible across all nodes. LOOSE being that FIFO guarantees vary based
// on usage when it comes to multiple nodes, but at *least* has to be FIFO locally
// with respect to sequential operation. NONE means FIFO-ness is ignored entirely
// and opens up Work Stealing. Note: it is *not* possible to have STRICT FIFO-ness,
// but LOOSE allows `asynchronous` operations and *possibly* work stealing. Need to debate
// this one more though.
enum QueueConfigurationFIFOLevel {
  STRICT,
  LOOSE,
  NONE
}

record QueueConfiguration {
  // Element Type...
  type eltType;

  // Determines if the Queue needs to be FIFO or not.
  var ordering : QueueConfigurationFIFOLevel;

  // The default (if left nil) should be `SyncQueue`, but the user should be
  // able to provide their own queues. For instance, perhaps they want to use
  // `CCQueue`, or perhaps there is some spry new `Queue` that works better,
  // perhaps one that is even NUMA-aware? Such extensions are inevitable and I
  // believe it should be allowed.
  var customQueueFactory : func(Queue(eltType));

  // For bounded queues. If left 0, is unbounded...
  // This is important in that some algorithms do not work well
  // with bounded queues, and would select an entirely new queue
  // optimized for it. Some queues may support resizing, most do not.
  // Note: Resizing requires synchronization, and global synchronization
  // *can* work with GlobalAtomicObject but requires more research as its
  // honestly something that hasn't been done before.
  var maxElements : uint;

  // The target locales that the user wishes to use.
  var targetLocales : domain(Locales);

  // Memory allocation can be left to the user. Its possible that they may recycle
  // memory in such a way that is NUMA-friendly/biased, or maybe even have a brand
  // new allocator. I'd need to think on *how* to allow this though without ducking type system.
  var allocFn : func(c_void_ptr);
  var freeFn : func(c_void_ptr, void);
}

I'm thinking perhaps the above would be adequate... for a base configuration object type. I'm thinking that we extend this for more specific queues (and refactor some of these fields for others...)

Core Functionality

This is extremely important, as it is of course the core of the queue. I'll add the obvious here...

// Adds multiple elements...
proc enqueue(elt : eltType ... ?nElts);

// Adds in another queue's elements to our own in FIFO order. This operation is *not*
// lineraizable.
proc enqueue(queue : Queue(eltType));

// Add all elements yielded by some function. If a way to actually capture an iterator
// exists, perfect! Otherwise, we'd need to improvise with some kind of lambda that
// returns whatever is being yielded. (I hope this is possible. This would allow us to
// add *any* element to our queue from another other container that contained an element,
// *even* ones that were infinite streams... "Why would we find interest in infinite streams
// for a queue that can run out of space?" - It won't run out for a bounded queue and is a cool
// way to continuously add elements in an on-demand fashion. Perhaps if it is meant to be used
// in this fashion it could be it's own function?) Example of a good use-case: "How does my application
// scale when presented with near infinite (but bounded) amounts of data"? You can easily populate
// the queue repeatedly with dummy data/work. Just an idea...
proc enqueue(iter : func((eltType, bool)))

// Normal dequeue
proc dequeue() : (bool, eltType);

// Dequeue *multiple* elements.
proc dequeue(nElems) : (int, [] eltType);

I'll revisit this tomorrow, kind of mentally exhausted today

LouisJenkinsCS commented 7 years ago

As well, I'd like to put forward some hesitation in terms of the custom queue... while it would be great for some queues, my concern is for work stealing. Things like unroll blocks only work if we don't use their queues. However some algorithms do (in particular, FIFO does since it just uses a wait-free round robin algorithm), so perhaps customQueueFactory can be in an overloaded configuration object for FIFO alone? Probably best. Something like...

makeFIFO(
   type eltType, 
   customQueueFactory : func(Queue(eltType)), 
   maxElements : uint, 
   targetLocales : domain(Locales)
);

This could actually work well enough, extremely so actually.

e-kayrakli commented 7 years ago

For factory pattern:

I have no problem with that. I think I told you before but the Random module does something like that to generate RNGs.

That being said, for example constructor of distributions have a lot of arguments almost all of which have default values. And some of the arguments have no use in some cases. Just saying, so that don't limit yourself because of that reason.

I think you can also limit customization as you pointed out. I don't see any issue with that.

Passing something iterable to enqueue:

This is also a good idea and have some use in irregular domains. See https://github.com/chapel-lang/chapel/blob/master/modules/internal/ChapelArray.chpl#L3345

proc = has many overloads for domains. Almost all of them have explicit types. The one whic doesn't have any explicit type, is the least "specific" one to the compiler. So, anything like a=b in user code that doesn't resolve for any other overloads, will resolve for this overload, which has to be an iterable at this point. Now, Chapel doesn't have strict definition of an "iterable" at the user-level. But you can read it as anything that's either an iterator or something that implements iter these()

LouisJenkinsCS commented 7 years ago

Oh nice! In fact, there is no need to provide any of the other enqueue operations beyond the variadic argument one and the iterable one, as well we can use it for proc +=, As well for specific queues (not in the actual API) we can overload for specific iterable types, like if the queue is of another identifiable type, and merge/splice the nodes ourselves, etc. Anyway I'll try to further refine my above post tonight/tomorrow.

LouisJenkinsCS commented 7 years ago

Okay, I revised it a bit and even use chpldoc to generate proper documentation, but if anything I mucked up my repository even more doing so (going to make merging trunk into master that much harder too)

Here is the link to the queue documentation so far. One of the reasons why its taking me so long is that I kind of want to have it look nice. Of course a lot more is needed, but at least the basics for the GH-Pages is done.

LouisJenkinsCS commented 7 years ago

@e-kayrakli

Okay, this is for official review before posting in the actual issue. How do the docs look? I believe it should contain enough information to garner attention and grab people's attention (assuming they look at it at first).

Note the 'Open Question' asked shows a pretty cool looking example. I'll copy paste here...

// FIFO queue with capacity of one item...
var q1 : Queue(int) = makeBoundedFIFO(1);
// Adds the element '1' to the queue...
q1 += 1;
// Ensures we do not consume it when we concatenate in next step
q1.freeze();
// Constructs a new queue that inherits the bounded property of the source queue.
// Since the elements exceeds the max boundary of the original queue, the bounds
// of q2 will be 5. Furthermore, the queue will be frozen after creation.
// Note that, in this way, if we had nice inheritence for records we could easily
// manage creation of multiple queues like this. What if we had C++ rvalue move-constructors
// as well to make stuff like this efficient?
var q2 = q1 + (2,3,4,5);
// Can also be: + reduce (q1 + 1 + (2,3,4,5))
// But the above requires memory management of the temporary queue created midway.
var result = + reduce q2;

I believe thats attractive enough to warrant attention.

LouisJenkinsCS commented 7 years ago

Furthermore, I'd like to propose one more potential design decision: mapping.

// Btw is this allowed? Its equivalent to assigning a `nil` queue to a tuple of elements. Could work
// if the queue was a record...
var q1 : Queue(int) = (1, 2, 3, 4, 5);
// I still stand by having a new lambda operator...
q1.mapping(lambda(x:int) { return x * 2; });

The above example, q1, which is an unfrozen queue, applies the lambda function to all of its elements, mutating each element it finds, and returns itself; if q1 is was frozen it would return a brand new immutable queue with the transformations made on each element, preserving order.

e-kayrakli commented 7 years ago

I think you ought to come up with a better name than "Work Queue" for the unordered structure. Technically I can store things other than "Works" and it is not a "Queue".
"Freezing" queues can add an interesting value. But I think it can be moved to the issue, and not be in docs at this point. Because, in my view that is still an open question. Keep in mind that if you will go down that road, isFrozen method is a must.
The third paragraph in the "intro" talking about concatenation should move to somewhere more relevant. Possibly under 'enqueues`. Maybe add it to just one and say that the semantics are similar to the previous ones in the others. Or just repeat it. Such repetition is not uncommon in docs.
I don't understand what to take from the example. As we discussed, doing stuff like that could be "cool" but is there a compelling real-world use case? That is a more important question. Also, I thought we said that operator overloads are farther down the road. This may be some bullet in the issue like: "Do we overload operators such as "+, +=" as aliases for various concatenation functionality."
Your example shows a reduction, which is an entirely separate beast than concatenation. And it is more of an interesting use of a possible parallel iterator. I think that can be another bullet in the issue.
Your method docs must have arguments documented, as well.
In your queue enqueue overload; why are you maintaining the weakest ordering? My immediate thinking would be retain the ordering of this. Isn't it simpler to implement and more intuitive?
There are discrepancies in the last two enqueue signatures and their return types.
What does "then it is up to the user to be able to ‘replay’ elements not consumed and dropped" mean in iterObj enqueue?
I think this transactional argument can be an acceptable approach. But, this should still be brought up in the issue. "Should partial enqueues be allowed to bounded queues?" or even "Should non-scalar enqueues be allowed to bounded queues?"
Something funny happened with the second queue return type.
Somewhere discussing the "freeze" logic should clarify (or at least clearly ask) what would happen when user tries to enqueue to a frozen queue.
For mapping:
- Something along these lines can be interesting.
- You can make the first line work with a = overload.
- However, this again brings up the whole operator overloads question, which I believe are clouding some more important discussions. Going back to arrays: myArr + 2 would do what you are suggesting as mapping now. What I'm saying is not that operator overloads can do arbitrary mappings, but may be such operators should be overloaded so that they become sort of "vector" operations on the queue elements.
- This difference between semantics when the queue is frozen/unfrozen makes me worried.

Questions to be asked

Here are some questions that I think should you ask in the issue. You can expand on them with maybe couple of more sentences describing what we discussed about them.

Should we have queue iterator(s)? (I believe freezing and having multiple iterators are things we discussed here.)
Should we allow enqueueing anything non-scalar, at all? If so, what to do with bounded queues?
What are the possible operator overloads? -- Maybe this is not very urgent.

I think these are the most pressing ones. But feel free to add more.

LouisJenkinsCS commented 7 years ago

What does "then it is up to the user to be able to ‘replay’ elements not consumed and dropped" mean in iterObj enqueue?

Dropped means that, when rejected, they are lost forever (as the state of the queue may be 'rolled back', but not the iterObj's state).

e-kayrakli commented 7 years ago

Dropped means that, when rejected, they are lost forever (as the state of the queue may be 'rolled back', but not the iterObj's state).

I see. I think that behavior is clear in the context of Chapel. And that sentence got me confused a bit. I suggest not mentioning anything like that there.

LouisJenkinsCS commented 7 years ago

I think you ought to come up with a better name than "Work Queue" for the unordered structure. Technically I can store things other than "Works" and it is not a "Queue".

I've been thinking on this for a while, but I'm coming up short. Technically you are correct in that the name isn't fitting... but at the same time, the fact that it isn't a 'Queue' makes me want to reconsider the whole interface thing entirely... Its more of a UnorderedList than anything... I'm curious if at this point, should I remove the entire interface thing? At least then I can make it a record instead of a class and allow for automatic memory reclamation... oh yeah, since Chapel doesn't have 'move' constructors, is the destructor called on a record returned by value since it goes out of scope? Since Chapel lacks smart pointers (and its very tricky managing across nodes), perhaps we not.

Anyway, one more question to ask is: "Should we have an interface at all?"

Your method docs must have arguments documented, as well.

I see in some of the official docs that they lack documentation on a few of the methods. However, if you think it is best, I'll do so.

I'll have the queue interface done by tonight so we can go over it tomorrow morning.

LouisJenkinsCS commented 7 years ago

I updated the API again: here.

However, as seen on the second dequeue, here, it seems to not be able to process returning arrays well (although I'm a bit confused on the semantics of it myself: does the array size need to be captured at call-site?)

e-kayrakli commented 7 years ago

is the destructor called on a record returned by value since it goes out of scope

I don't want to mislead you on this one. You should try it out yourself. You can even inspect the C code to understand exactly what is happening.

Anyway, one more question to ask is: "Should we have an interface at all?"

I guess it is true that if we don't want to dub this a queue (which is the direction I think we should take), then this guy doesn't have to/shouldn't follow this interface. My suggestion at this point is this: in the factory method that generates this kind of "queue", you can put a sentence summarizing our current standpoint, which is along the lines of: "This data structure is not technically a queue and will have its own similar interface in the future".

I think it is OK to leave it at that at this point. It is obvious that there will be something analogous to enqueue/dequeue etc with different names.

I see in some of the official docs that they lack documentation on a few of the methods. However, if you think it is best, I'll do so.

You are right. But let's make this complete. I believe it'll come down to bunch of copy/paste anyways. Besides, there are some things that are not immediately obvious to the reader like the transactional flag. They'd know what that is if they read the introduction, but let's be honest, programmers generally don't do that :)

However, as seen on the second dequeue, here, it seems to not be able to process returning arrays well

I skimmed over some of the docs and couldn't see an example. Maybe something buggy.. You can leave it like this for now.

(I have noticed that return type say bool,int for the first dequeue)

(although I'm a bit confused on the semantics of it myself: does the array size need to be captured at call-site?)

No, you do not. But maybe in the doc return type shouldn't have the domain query (?n)

LouisJenkinsCS commented 7 years ago

But maybe in the doc return type shouldn't have the domain query (?n)

If I remove it, it will not compile at all. Definitely seems like a bug then. Also, if the domain query thing is unused, why is it necessary to compile? Is that a bug with syntax of arrays?

LouisJenkinsCS / Distributed-Data-Structures

Interface design #16

Standard must-haves

Some brainstorming:

Queue Specification

Configurations

Core Functionality

Questions to be asked