Open KexinFeng opened 3 months ago
Can you send me your demo-config.json?
PS: the original demo-config.json is just a demo. You need to modify its contents to generate the tree you want.
{
"acceptance_rate_vector": "acceptance-rate-vector.pt",
"max_depth": 15,
"max_budget": 128,
"draft_time": 0.0003,
"valid_budget": [1, 2, 4, 8, 16, 32, 64, 128],
"target_time": [0.025, 0.025, 0.025, 0.025, 0.025, 0.027, 0.030, 0.035],
"dst": "demo_tree.pt"
}
p = [0.0000, 0.4803, 0.1104, 0.0576, 0.0373, 0.0265, 0.0211, 0.0170, 0.0135, 0.0113, 0.0093, 0.0087, 0.0075, 0.0067, 0.0058, 0.0061, 0.0049] might be a suitable acceptance-rate vector for generating a tree of size 32.
Some explanation: draft_time is the time for one draft model's forward pass. target_time is the time for one draft model's forward pass corresponding to the valid budget.
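To make the pairing between valid_budget and target_time concrete, here is a minimal sketch that parses the config above and maps each budget to its timing entry. The helper name `time_by_budget` and the two sanity checks are my own assumptions for illustration; they are not taken from the repo's code.

```python
import json

# The config from the comment above, inlined so the sketch is self-contained.
CONFIG_TEXT = """
{
  "acceptance_rate_vector": "acceptance-rate-vector.pt",
  "max_depth": 15,
  "max_budget": 128,
  "draft_time": 0.0003,
  "valid_budget": [1, 2, 4, 8, 16, 32, 64, 128],
  "target_time": [0.025, 0.025, 0.025, 0.025, 0.025, 0.027, 0.030, 0.035],
  "dst": "demo_tree.pt"
}
"""


def time_by_budget(cfg):
    """Map each valid budget to its target_time entry (hypothetical helper)."""
    budgets = cfg["valid_budget"]
    times = cfg["target_time"]
    # Assumed sanity check: one timing entry per budget.
    if len(budgets) != len(times):
        raise ValueError("valid_budget and target_time must have equal length")
    # Assumed sanity check: budgets listed in increasing order.
    if budgets != sorted(budgets):
        raise ValueError("valid_budget should be sorted in increasing order")
    return dict(zip(budgets, times))


config = json.loads(CONFIG_TEXT)
pairs = time_by_budget(config)
for budget, t in pairs.items():
    print(f"budget {budget:>3}: target_time {t}")
```

Each target_time entry is read as the timing for the budget at the same index, so the two lists need to stay aligned when you add or remove budgets.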
Thanks for the fast reply and the explanation! The demo-config.json is the same as the one in the repo:
{
"acceptance_rate_vector": "acceptance-rate-vector.pt",
"max_depth": 10,
"max_budget": 128,
"draft_time": 0.38,
"valid_budget": [1, 2, 4, 8, 16, 32, 64],
"target_time":[10, 10, 10, 12, 14, 18, 27],
"dst": "demo_tree.pt"
}
The times above are assumed to be in milliseconds (ms).
acceptance_rate_vector:
tensor([0.0000, 0.6342, 0.1079, 0.0570, 0.0225, 0.0195, 0.0150, 0.0045, 0.0030, 0.0120, 0.0045, 0.0075, 0.0045, 0.0060, 0.0030, 0.0015, 0.0030, 0.0015, 0.0030, 0.0000, 0.0030, 0.0030, 0.0030, 0.0000, 0.0015, 0.0000, 0.0015, 0.0000, 0.0000, 0.0000, 0.0015, 0.0015, 0.0015, 0.0735])
which is similar to the acceptance vector in the repo too.
I can first try your demo-config.json and the acceptance vector above too. It seems that our acceptance vectors also differ in size. About the times in the config: I previously assumed that the unit of the numbers is not important, i.e. if we scale draft_time and target_time by the same factor, the resulting tree is unchanged. I don't know if this assumption is correct.
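A quick way to reason about the scaling question: if the objective being maximised is a ratio of the form expected_tokens / (depth * draft_time + target_time), then multiplying both times by the same factor rescales every candidate's score equally and leaves the winner unchanged. The sketch below checks this numerically. The objective's form, the function `best_budget`, and the `expected_tokens` and `depths` values are all assumptions made up for illustration; I have not verified that the repo's search uses exactly this objective.

```python
def best_budget(valid_budget, expected_tokens, depths, draft_time, target_time):
    """Pick the budget with the highest tokens-per-unit-time score.

    Assumed objective (not confirmed against the repo):
        score = expected_tokens / (depth * draft_time + target_time)
    """
    best, best_score = None, float("-inf")
    for b, g, d, t in zip(valid_budget, expected_tokens, depths, target_time):
        score = g / (d * draft_time + t)
        if score > best_score:
            best, best_score = b, score
    return best


valid_budget = [1, 2, 4, 8, 16, 32, 64]
target_time_ms = [10, 10, 10, 12, 14, 18, 27]  # profile quoted in this thread
draft_time_ms = 0.38                            # value quoted in this thread

# Hypothetical numbers, invented purely to exercise the function.
expected_tokens = [1.0, 1.6, 2.1, 2.6, 3.0, 3.3, 3.5]
depths = [1, 2, 3, 4, 5, 6, 7]

# Same timings expressed in milliseconds and in seconds.
in_ms = best_budget(valid_budget, expected_tokens, depths,
                    draft_time_ms, target_time_ms)
scale = 1e-3
in_s = best_budget(valid_budget, expected_tokens, depths,
                   draft_time_ms * scale, [t * scale for t in target_time_ms])

print("best budget (ms):", in_ms)
print("best budget (s): ", in_s)
```

Under that assumption only the ratio between draft_time and the target_time entries matters, so mixing units between the two fields would change the result while a common rescaling would not.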
Here are some updates. I tried your config.json with my original acc_rate_vec.pt, and the generated tree becomes normal, of size 32.
From the ablation test, it looks like the target_time profile [10, 10, 10, 12, 14, 18, 27] (in ms) that I used is the key reason the optimal tree came out at size 4. The algorithm seems quite sensitive to the target_time profile: by contrast, the profile [11, 11, 11, 11, 11, 20, 31] (in ms) generates a tree of size 16. The two profiles are not very different, yet the resulting tree sizes vary a lot.
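One way to compare the two profiles is to normalise each by its first entry and see where the verification cost starts to rise. The sketch below does only that arithmetic on the numbers quoted above; the helper names `relative_cost` and `largest_flat_budget` are mine, and nothing here reproduces the actual tree search.

```python
valid_budget = [1, 2, 4, 8, 16, 32, 64]
profile_a = [10, 10, 10, 12, 14, 18, 27]  # produced a tree of size 4
profile_b = [11, 11, 11, 11, 11, 20, 31]  # produced a tree of size 16


def relative_cost(target_time):
    """Cost of each budget relative to the cost at the smallest budget."""
    base = target_time[0]
    return [round(t / base, 2) for t in target_time]


def largest_flat_budget(budgets, target_time):
    """Largest budget whose target_time is no higher than the first entry."""
    base = target_time[0]
    return max(b for b, t in zip(budgets, target_time) if t <= base)


rel_a = relative_cost(profile_a)
rel_b = relative_cost(profile_b)
flat_a = largest_flat_budget(valid_budget, profile_a)
flat_b = largest_flat_budget(valid_budget, profile_b)

for name, rel, flat in (("A", rel_a, flat_a), ("B", rel_b, flat_b)):
    print(f"profile {name}: relative cost {rel}, flat up to budget {flat}")
```

For these two profiles the flat region ends at budgets 4 and 16, which coincides with the tree sizes reported above. That is an observation about these two data points, not a rule: the earlier config is flat up to budget 16 but produced a tree of size 32.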
By the way, is it true that the generated tree sizes can only be numbers from "valid_budget": [1, 2, 4, 8, 16, 32, 64]? And to allow more tree sizes, do data points for additional valid_budget values have to be provided in config.json?
Yes, the generated tree sizes can only be numbers from "valid_budget".
How do I determine the optimal depth and budget? Can you share the config.json that generated the example A100 and L40 growmaps in this repository?
Some explanation: draft_time is the time for one draft model's forward pass. target_time is the time for one draft model's forward pass corresponding to the valid budget.
Is there a typo here? Shouldn't target_time be the time for a verification pass?
Some explanation: draft_time is the time for one draft model's forward pass. target_time is the time for one draft model's forward pass corresponding to the valid budget.
Is there a typo here? Shouldn't target_time be the time for a verification pass?
I have the same question.
"draft_time": 0.0003
What is the unit of this value? I don't think any model can complete the forward pass within 0.0003 seconds?
Hi,
I was trying to reproduce the numbers in the paper, but with the demo-config.json, plus either the acceptance vector in the repo or the acceptance vector I tested myself, the generated trees are all very small and somewhat fixed. On the other hand, the growmaps in the two folders are generally very large, typically of size 128, 64, or 32. Do you know why the trees I generate are so small, and how to reproduce the growmaps in those two folders?
Thank you!