Open ByronBecker opened 2 years ago
Thanks for this report, @ByronBecker!
I wonder if you could repeat the experiment with another existing base structure in place of the red-black tree.
If you are only inserting, for instance, it would be interesting to use lists and always prepend (a constant-time operation). Since that is guaranteed constant time, it would be an interesting comparison to see what limit, if any, you reach with that data structure in place of where you currently use the RBTree. WDYT?
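As a rough sketch of what that experiment would measure (a Python analogue with abstract step counts, not IC instructions; `prepend_cost` and `balanced_insert_cost` are illustrative names, not part of any library): list prepend does constant work per insert regardless of size, while any ordered structure pays at least one root-to-leaf walk, so the prepend numbers give a floor for per-insert cost.

```python
import math

def prepend_cost(n: int) -> int:
    # Prepending to a cons list touches one cell, independent of n.
    return 1

def balanced_insert_cost(n: int) -> int:
    # A balanced search tree walks one root-to-leaf path,
    # roughly log2(n) steps (ignoring rebalancing work).
    return max(1, math.ceil(math.log2(n + 1)))

for n in (1_000, 1_000_000):
    print(n, prepend_cost(n), balanced_insert_cost(n))
```

The gap between the two curves is exactly the overhead that ordering and rebalancing add on top of "just store the data".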
BTW -- I have been preoccupied with family life recently, but still plan to implement the imperative B-Tree asap. Of course, that comparison will be interesting too, when we can do it.
Right, I could totally test out the LinkedList. The only issue is that I also want performant lookups, since I expect the database to be more read-heavy than write-heavy. O(n) lookups would most likely hit the instruction limit once the data is large enough.
Yep, makes sense that linked lists will be impractical. But they would be useful (in the insertion-only workload) for seeing what the best possible performance could be, assuming you don't worry about the order as you insert. It's an experiment to get a lower bound for insertion instruction cost (no rebalancing, only storing data). OTOH, as we discussed elsewhere, this version is not actually useful for implementing CanDB, so it would only be to gather data and get this bound.
What Motoko GC are you using?
Now I understand a bit better what you were saying - that would be a good comparison to see also if the canister heap size has any impact on what should be an O(1) operation.
I've been using the default that ships with 0.10.0 - is that `--copying-gc`? Should I try to use this https://github.com/dfinity/motoko/pull/3231?
> I've been using the default that ships with 0.10.0 - is that `--copying-gc`?

Yep, we cannot make `--compacting-gc` the default until the replica supports DTS (deterministic time slicing).
@ByronBecker I can't see the code where you scale and presumably instantiate another canister. Is it possible that the spun up canisters aren't being provisioned with enough cycles before calling the class constructor?
@crusso This is where the canister gets spun up https://github.com/canscale/CanDB/blob/auto-scaling-poc/src/CanDB.mo#L173
```motoko
func scaleCanister(db: DB): async () {
  let indexCanister = actor(db.scalingOptions.autoScalingCanisterId): actor {
    createAdditionalCanisterForPK: shared (pk: Text) -> async Text
  };
  try {
    let newCanisterId = await indexCanister.createAdditionalCanisterForPK(db.pk);
    Debug.print("canister creation success for pk=" # db.pk # ", canisterId=" # newCanisterId)
  } catch (err) {
    db.scalingStatus := #not_started;
    Debug.print("error auto-scale creating canister in CanDB: " # Error.message(err));
  };
};
```
The line `let newCanisterId = await indexCanister.createAdditionalCanisterForPK(db.pk);` calls a method that it expects the index canister to have. I've created a templating tool that requires this method to be present on the canister responsible for auto-scaling. Here's how that `createAdditionalCanisterForPK` method is used.
Claudio, I just gave you access to both of these repos (candb-client-typescript & candb-quickstart-template). Others can request access if interested.

I can almost assure you that this is not a cycles issue, as the canister that I spun up from the index canister (the one receiving and inserting into the RBTree) has not run out of cycles (it had over 6T cycles before I spun it down).

Also, this happened before reaching the canister scaling limit I had set, a heap size of 1.25GB. The error occurred several times in a heap-size range of 290MB-500MB, eventually topping out at 490MB, where I was unable to insert a single row without hitting this limit.
Is it possible there's a bug in RBT code that causes it to go into an infinite loop that is broken by the cycles bound?
My guess was this:

> My guess is that this could either be due to inefficiencies in re-balancing the Red-Black Tree, or that the tree has gotten so deep that recursive operations are much more costly (by more than a linear factor) due to the call stack or some other factor.
I did a pretty large amount of testing for my RangeTree data structure (essentially a typed RBTree), and the code keeps everything ordered, but it remains to be seen how these data structures respond on the main network with large amounts of data (i.e. 1-2GB heap size). Much of the feedback I've seen is anecdotal.

I did not, however, test the node ordering to ensure that the tree stays balanced.
Tbh, I'm at a loss. The one thing I'm hoping is that Deterministic Time Slicing will free us of these performance cliffs.
According to Wikipedia, the RB-tree operations should be worst case O(log n), so I'm not sure what could be going wrong unless there's a bug.
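To put a number on that bound (a quick back-of-the-envelope sketch; `rb_max_height` is an illustrative name, not part of any library): a red-black tree holding n keys has height at most 2·log2(n + 1), so even at millions of entries an insertion only walks a few dozen nodes.

```python
import math

def rb_max_height(n: int) -> int:
    # Worst-case height of a red-black tree holding n keys:
    # h <= 2 * log2(n + 1) (the standard red-black tree bound).
    return math.floor(2 * math.log2(n + 1))

# A tree of ten million entries is at most ~46 levels deep, so the
# O(log n) search path itself is cheap; if the cycle limit is being
# hit, rebalancing bugs or GC work are likelier suspects.
print(rb_max_height(10_000_000))  # 46
print(rb_max_height(2**20 - 1))   # 40
```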
Is it possible you've got some expensive debug code still enabled by mistake, perhaps producing large text representation of the tree?
I do have the Debug code enabled and running (both locally and when deployed to the main network). When I run this locally, this is the logging I'm seeing:
```
[Canister xcrtx-5aaaa-aaaaa-aahia-cai] creating new comment canister with pk=comment_2021-07-15
[Canister xcrtx-5aaaa-aaaaa-aahia-cai] index prinicpal = xcrtx-5aaaa-aaaaa-aahia-cai
[Canister xcrtx-5aaaa-aaaaa-aahia-cai] index prinicpal = xcrtx-5aaaa-aaaaa-aahia-cai
[Canister xzups-hyaaa-aaaaa-aahkq-cai] received comments of size=235
[Canister xcrtx-5aaaa-aaaaa-aahia-cai] index prinicpal = xcrtx-5aaaa-aaaaa-aahia-cai
[Canister xcrtx-5aaaa-aaaaa-aahia-cai] index prinicpal = xcrtx-5aaaa-aaaaa-aahia-cai
[Canister xzups-hyaaa-aaaaa-aahkq-cai] received comments of size=500
...
... the same three-line pattern keeps repeating for each new message upload
```
Where `received comments of size=<xxx>` is what's logged for each upload batch (message).
I noticed that the implementation of put does an ignore so it won't wait for the update to complete. Is that deliberate? Could that somehow flood the canister with puts that don't get throttled by waiting for them.
BTW, since dfx 0.10.0, you should be able to run on a local replica with accurate cycle accounting and budgets, at least according to https://forum.dfinity.org/t/dfx-0-10-0-is-promoted/12870?u=claudio.
Have you tried?
> I noticed that the implementation of put does an ignore so it won't wait for the update to complete. Is that deliberate? Could that somehow flood the canister with puts that don't get throttled by waiting for them.

Definitely a possibility and a potential improvement. However, from my initial report:

> I even made a special csv with just 1 row at this point and tried to insert it into the Red-Black Tree on this 484MB heap sized canister and still received the same error
It could still definitely be the case that for the single row, the 3 subsequent calls at this point overwhelm the replica, but then why was I able to insert 100 rows in the prior round with this same code and no issue?
> BTW, since dfx 0.10.0, you should be able to run on a local replica with accurate cycle accounting and budgets, at least according to https://forum.dfinity.org/t/dfx-0-10-0-is-promoted/12870?u=claudio. Have you tried?
I am using 0.10.0. I made a very weak attempt but have not successfully gotten this working; I'll try again in the coming week.
Also, according to the definition of `ignore` in the language manual (https://internetcomputer.org/docs/current/developer-docs/build/languages/motoko/language-manual#ignore):

> The expression `ignore <exp>` evaluates `<exp>` (typically for some side-effect) but discards its value. Ignore is useful for evaluating an expression within a sequence of declarations when that expression has non-unit type (and the simpler declaration would be ill-typed). Then the semantics is equivalent to `let _ = <exp> : Any`.
This makes it sound like `ignore` would evaluate the full expression first (including any asynchronous behavior) and then discard its value.

Still, you make a good point that the following three versions of this function could have different behavior in terms of how they are executed by the caller:
```motoko
// 1)
public func put(db: DB, options: PutOptions): async () {
  ignore replace(db, options);
};

// 2)
public func put(db: DB, options: PutOptions): async () {
  ignore await replace(db, options);
};

// 3)
public func put(db: DB, options: PutOptions): async () {
  let _ = await replace(db, options);
};
```
I'm assuming you'd recommend option/version 3, but I'm curious how 2 vs. 3 would be interpreted (i.e. for 2, would `ignore` await the inner expression, essentially making 2 and 3 the same?).
2 and 3 are essentially equivalent and force the put to wait until the (async) replace has completed before returning to the caller.
1 differs from 2 and 3 in just initiating the replace, but not waiting for it to complete before returning (and discarding both the result and any errors thrown by replace).
In words:

- 1 says: queue the replace message (or function call) and return immediately, without waiting for that call to complete or propagating its errors.
- 2 and 3 say: queue the replace message and wait for the replace to complete, discarding the result but propagating any errors.
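To make the execution difference concrete, here is a rough Python asyncio analogue (an analogy only; IC message scheduling is not an asyncio event loop, and `put_v1`/`put_v3` are illustrative names): the fire-and-forget version returns before the inner update has run, while the awaited version returns only after it completes.

```python
import asyncio

results = []

async def replace(x):
    # stands in for the inner async update
    results.append(x)
    return x

async def put_v1(x):
    # analogue of version 1 (`ignore replace(...)`): schedule the
    # call and return immediately, discarding its result and any
    # errors it might raise
    asyncio.create_task(replace(x))

async def put_v3(x):
    # analogue of versions 2/3 (`let _ = await replace(...)`):
    # wait for the update to complete, then discard the result
    _ = await replace(x)

async def main():
    await put_v3(1)
    print(results)          # [1]: the update has already run
    await put_v1(2)
    print(results)          # still [1]: replace(2) is only scheduled
    await asyncio.sleep(0)  # yield so the scheduled task can run
    print(results)          # [1, 2]

asyncio.run(main())
```

This is also why version 1 can "flood" the callee: nothing in the caller's control flow waits for the previous update before issuing the next one.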
@ByronBecker Did this get resolved? I am certain it is the GC that causes the limit. At the heap size that you mentioned there are most likely sufficiently many objects on the heap. The GC touches them all and doing so reaches the cycle limit. You reached a point where GC is going to be triggered by the very next message, even if the message is short (a single insertion). It has nothing to do with the data structure. You will hit the same limit if you store your data in an array or some other way.
@timohanke
I've been communicating with the Motoko team over the past 8 months or so with the goal of resolving these issues. Deterministic Time Slicing is helping a lot to kick limitations down the road, and @luc-blaeser is doing some great work to improve the GC with his generational GC and an in-progress incremental GC.

We've come a long way since I first ran into this during Supernova, but there are still limits, and I'm hopeful that raised DTS limits and the new incremental GC will get us to the 4GB heap max.
@matthewhammer @crusso
I've recently moved from doing local testing to main network testing, and one of the large differences between testing locally and on the main network is limitations such as the instruction limit for single message execution.
The issue
On the main network, at a certain point (well below the canister heap size or canister memory limits), Red-Black Trees become unable to insert more records without hitting the instruction limit for single message execution.
The error.
Once I have around 290-500MB (heap size) memory in my canister, insertions of ~30-80KB per message will randomly start to return the following error
What I was doing at the time of the error
For example,
At this heap utilization, I was chunk-uploading 100 rows at a time (30-80KB) and then inserting in the manner shown in the "Some More Context" section below. All of a sudden, I received the error shown in "The issue" section above and was no longer able to perform insertions into the Red-Black Tree without receiving that error. I was able to upgrade the canister and update other stable variables, but nothing more could be inserted into the Red-Black Tree.
I even made a special csv with just 1 row at this point and tried to insert it into the Red-Black Tree on this 484MB heap sized canister and still received the same error
The script I run
My guess is that this could either be due to inefficiencies in re-balancing the Red-Black Tree, or that the tree has gotten so deep that recursive operations are much more costly (by more than a linear factor) due to the call stack or some other factor.
Some More Context
My database project, CanDB, is a Red-Black tree of type `<Text, AttributeValue>`, where `AttributeValue` is a stable primitive.

I'm testing out backfilling a large amount of data into CanDB, from which ~80KB csv chunks are formatted and uploaded to the following API endpoint, `processComments()`.

(For reference, the awaits in the `CanDB.put()` method are only ever called once in a canister's lifetime, and this is for scaling out a new canister. This means that every `CanDB.put()` call is synchronous except the one time it needs to signal that it needs to be scaled out.)

If you are interested, you can find the code for `CanDB.put()` here, as well as the `RangeTree` data structure that sits on top of a Red-Black tree.

Recommendations
I'm curious if digging into the recursive (performance) aspects of the functional data structures in motoko-base might find "easy-win" areas of improvement to avoid the message execution instruction limit.
Also, in addition to the local testing, it would be a good idea to do some main network testing with respect to the performance of imperative, mutable data structure implementations (Red-Black Tree, BTree) against their functional counterparts.
My takeaways
While I enjoy many of the functional aspects and guarantees of the implementations, if I'm releasing a database product the bottom line for me is a data structure that provides consistent performance.
In the coming weeks, I'm also going to test the limitations a bit more, as well as look to see if I can make certain optimizations like putting a short hash table or Trie at the top that might reduce any rebalancing or stack depth issues.