Open gabrieldemarmiesse opened 6 months ago
@gabrieldemarmiesse I was thinking about this a couple days ago so I'd like to take a crack at it if you haven't already started
Let's wait for the go of the maintainers before working on a new struct.
To be aligned with current naming convention, it should be Deque, not deque ;)
If you want a space to make a Deque struct before core Mojo maintainers give the green-light for std-lib, you are more than welcome to submit PRs to toybox. Honestly, I'm disappointed with myself for overlooking putting deque, ring-queue, and the like to my "hit list" of desired data structures.
I'm +1 for a deque collection. Do you want to write up a one pager on an API design proposal for consideration and we can take a look to discuss before starting down an implementation? You may want to check out https://doc.rust-lang.org/std/collections/struct.VecDeque.html as something to consider too.
I implemented mojo Deque for myself, but tried to make it properly documented and tested so that it could be a candidate for a mojo stdlib if approved. You are welcome to check it here if mojo development priorities allow that: https://github.com/avitkauskas/mojo-datastructs/blob/main/datastructs/deque.mojo
It's implemented on a growing and shrinking circular buffer via UnsafePointer. I implemented all the python deque API and a bit more. On the trivial benchmark of 4 billion appends/pops, it's about 1.5x faster than C++ stdlib implementation on the default construction and can be easily configured to minimize memory allocations if speed is more important than memory, providing 2.5x speed-up compared to C++.
-63.7% 2.76x faster 2507.1ms ± σ: 1.4ms, 2505.1ms…2510.4ms `./deque-mojo-no-shrink`
-33.3% 1.50x faster 4609.3ms ± σ: 6.0ms, 4603.2ms…4625.9ms `./deque-mojo`
base 6915.2ms ± σ: 3.7ms, 6910.1ms…6922.2ms `./deque-cpp`
Copy/move/iter on non-trivial elements have yet to be properly tested and, most probably, refined. Please let me know if you'd want me to make a proper PR to mojo repo where we could work on the further improvements.
I think I've put enough now into the code, docstrings and tests to make it worth a few minutes of time of @JoeLoser. I would be glad to hear any comments, and if there is any interest of me making a PR to include this Deque implementation into the Mojo standard library. I know it's big. But I hope we'd be able to move piece by piece with comments and improvements. I'd be glad to hear anything back on this. Thanks! Repo: https://github.com/avitkauskas/mojo-datastructs/tree/main Code: https://github.com/avitkauskas/mojo-datastructs/blob/main/datastructs/deque.mojo
I think I've put enough now into the code, docstrings and tests to make it worth a few minutes of time of @JoeLoser. I would be glad to hear any comments, and if there is any interest of me making a PR to include this Deque implementation into the Mojo standard library. I know it's big. But I hope we'd be able to move piece by piece with comments and improvements. I'd be glad to hear anything back on this. Thanks! Repo: https://github.com/avitkauskas/mojo-datastructs/tree/main Code: https://github.com/avitkauskas/mojo-datastructs/blob/main/datastructs/deque.mojo
Nice! I'll take a look at this in the next few days and provide feedback here and we can talk about the best tactical way to split up things incrementally to adopt it into the stdlib 👍
I think I've put enough now into the code, docstrings and tests to make it worth a few minutes of time of @JoeLoser. I would be glad to hear any comments, and if there is any interest of me making a PR to include this Deque implementation into the Mojo standard library. I know it's big. But I hope we'd be able to move piece by piece with comments and improvements. I'd be glad to hear anything back on this. Thanks! Repo: https://github.com/avitkauskas/mojo-datastructs/tree/main Code: https://github.com/avitkauskas/mojo-datastructs/blob/main/datastructs/deque.mojo
This is a great implementation! I'd happily welcome this into the standard library. I'd suggest thinking about how to split up the changes into a few smaller PRs (e.g. constructors + few APIs along with tests, and keep rinse/repeating) to make it a bit easier for review. It's helpful seeing what the big end state would look like roughly without the splitting up of work. To help make it easier for me and the @modularml/mojo-standard-library team to review, smaller PRs would be helpful though.
Some high-level thoughts in no specific order:
UInt
instead of Int
for some of these data members which are known to be non-negative (e.g. size, capacity, etc.)? UInt
is relatively new, so a lot of the existing data structures haven't used it yet internally, but may be nice to explore.add
dunder should specify the capacity of the one deque it's getting its elements from to avoid reallocations and not rely on the default capacity of 64
mul
and imul
dunder could use UInt
eq
could use getitem
to simplify some of the code, or did you avoid that to avoid the overhead with index normalization done in the getitem
implementation since it's not needed in eq
and friends?I'm glad you liked it, Joe! I will clone the nightly mojo later this week, and will make the PR with the first part of the implementation and related tests. I hope chunks of code of about 200-300 LOC + tests would be acceptable for reviewers.
Answering the questions:
Have you considered using UInt instead of Int for some of these data members which are known to be non-negative (e.g. size, capacity, etc.)? UInt is relatively new, so a lot of the existing data structures haven't used it yet internally, but may be nice to explore.
I'm afraid that using UInt
for capacity etc. could be asking for trouble. Checking in the mojo REPL, Int
is implicitly casted to UInt
:
var a: UInt
var b = -1
a = b
print(a) # 18446744073709551615
Imagine I'd accidentaly do q = Deque[Int](capacity=b)
...
So, in my opinion, taking Int
and explicitly deciding how to handle negative numbers gives less surprises.
mul and imul dunder could use UInt
Same with mul
: what would happen when you do q * b
when b
is negative Int
?
Pyhon returns an empty deque when multiplied by the negative number. I followed their API.
add dunder should specify the capacity of the one deque it's getting its elements from to avoid reallocations and not rely on the default capacity of 64
The thing is that Deque is a totally different animal compared, say, to List: it can have capacity set to some high number by the user initially and have just a few elements yet, and (in my implementation) also be restricted from shrinking, it could also have maxlen
property set. And in python, the result of add
inherits the properties of the first addend (that is self
). Therefore, it would not be right to reset capacity to some number like len(q1) + len(q2)
as sometimes it could be smaller that the initial capacity of q1. And it's not good to set it to q1.capacity + q2.capacity
because q1.capacity
could be big enough already. Overall, it could be made a bit more efficient with allocations if we precalculate the capacity of the result to be not less than q1.capacity
and big enough to accomodate both deques and allocate it before the addition. Could be discussed if it's worth the trouble. And also, now a = a + b
results in the deque with the same characteristics as a += b
that makes sense to me.
eq could use getitem to simplify some of the code, or did you avoid that to avoid the overhead with index normalization done in the getitem implementation since it's not needed in eq and friends?
Yes, exactly, I do not use getitem
in the implementation of other methods for efficiency.
@avitkauskas thanks for the discussion - all of your responses make sense and SGTM. Those PR-sizes of a few hundred LoC is totally reasonable 👍
Review Mojo's priorities
What is your request?
Add a Deque struct to the stdlib, similar to python's deque: https://docs.python.org/3/library/collections.html#collections.deque
What is your motivation for this change?
While it's a very useful collection to have in general, in the context of the standard library it offers some new possibilities. There are multiple places in the stdlib where we want to "consume" objects in a list in order. Doing so currently means either 1) calling
List.pop(0)
, while safe, this is very inneficient as all elements must be moved at every pop() 2) Using a pointer andmove_pointee
. While this is efficient, this is quite unsafe.Moving a list pointer into a deque would allow us to call
deque.popleft()
at very little cost. Thus consuming the elements in order.Any other details?
The two places where we currently have this issue in the stdlib are
List.extends()
andDict.fromkeys()