Closed: jxzhn closed this issue 2 years ago
Hi! `half_len` here denotes that the first half of the snippet is processed as memory, while the last part is used for loss evaluation and prediction, so there seems to be no bug.
But this is inconsistent with the docstring on lines 87 to 104:
"""
Handles training / evaluation on long ASTs by splitting
them into smaller ASTs of length max_len, with a sliding
window of max_len / 2.
Example: for an AST ast with length 1700, and max_len = 1000,
the output will be:
[[ast[0:1000], 0], [ast[500:1500], 1000], [ast[700:1700], 1500]]
Input:
ast : List[Dictionary]
List of nodes in pre-order traversal.
max_len : int
Output:
aug_asts : List[List[List, int]]
List of (ast, beginning idx of unseen nodes)
"""
It is also inconsistent with lines 114 to 115:
idx = max_len - (len(ast) - (i + half_len))
aug_asts.append([ast[-max_len:], idx])
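Plugging the thread's example (len(ast) = 2022, max_len = 500) into these two lines reproduces the 478 offset discussed below. This quick check assumes, as in the repo's loop, that `i` advances by `half_len` and the loop stops once `i` reaches `len(ast) - max_len`:

```python
# Check the last-block index formula from lines 114-115.
len_ast, max_len = 2022, 500
half_len = max_len // 2  # 250

# The sliding loop emits windows starting at 250, 500, ..., 1500;
# i = 1750 is the first start that fails i < len_ast - max_len (= 1522).
i = 1750
idx = max_len - (len_ast - (i + half_len))
assert idx == 478  # memory: ast[1522:2000], predictions: ast[2000:2022]
```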
Oh, I see, there does seem to be an issue with the comment. The function's intent is described here: https://github.com/facebookresearch/code-prediction-transformer#splitting-large-trees
The function seems to be correct, but it does not follow the comments on lines 87 to 104:
separate_dps(range(2022), 500)
should produce
[[range(0, 500), 0],
[range(250, 750), 250],
[range(500, 1000), 250],
[range(750, 1250), 250],
[range(1000, 1500), 250],
[range(1250, 1750), 250],
[range(1500, 2000), 250],
[range(1522, 2022), 478]]
The values 0, 250, and 478 denote the start of the token block for which we make a prediction. A value of 250 means the first 250 tokens in a block are used as memory. The first block, range(0, 500), is used entirely for prediction (no memory). For intermediate blocks, the first half is used as memory. For the last block, range(1522, 2022), the model uses range(1522, 2000) as memory and makes predictions for the remaining elements, range(2000, 2022). I hope this is clearer, thank you @jxzhn!
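For reference, here is a minimal sketch of the splitting logic that is consistent with the example above and with the quoted lines 114 to 115; the actual `separate_dps` in the repo may differ in details:

```python
def separate_dps(ast, max_len):
    """Split a long pre-order node list into windows of length max_len
    that slide by max_len // 2, pairing each window with the index of
    its first unseen (to-be-predicted) node."""
    half_len = max_len // 2
    if len(ast) <= max_len:
        return [[ast, 0]]

    aug_asts = [[ast[:max_len], 0]]  # first block: no memory, predict everything
    i = half_len
    while i < len(ast) - max_len:
        # intermediate blocks: first half is memory, second half is predicted
        aug_asts.append([ast[i : i + max_len], half_len])
        i += half_len
    # last block: aligned to the end of the sequence; only the tail not
    # covered by the previous window is predicted
    idx = max_len - (len(ast) - (i + half_len))
    aug_asts.append([ast[-max_len:], idx])
    return aug_asts

blocks = separate_dps(list(range(2022)), 500)
# 8 blocks; offsets are 0, then 250 six times, then 478,
# matching the example output in this thread
```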
Got it, thanks!
I think line 112 (in function `separate_dps`) of cc/main/src/utils/utils.py should be [...] instead of [...]