Open rawmarshmellows opened 7 years ago
Hi @kevinlu1211,
Thanks for reporting this!
I have not run the tensor2tensor code myself. I am also curious about tensor2tensor's memory usage with a batch size of 1024. Do you have that number? If the difference in memory usage is large, there may be a memory-efficiency problem in this implementation.
Thanks, Yu-Hsiang
What do you mean by the number?
Sorry for the ambiguity. I mean the memory usage of the tensor2tensor project.
It uses around 10GB of RAM for a batch size of 1024.
It might be that the definition of `batch_size` is different for these two projects? The tensor2tensor docstring says:

`batch_size: int, total number of tokens in a batch.`
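That would explain the gap: a minimal sketch (function names and numbers are made up for illustration) of how the same nominal "batch size" can mean very different memory footprints depending on whether it counts sequences or tokens:

```python
# Hypothetical illustration: a batch of "1024" in a token-budget scheme
# (tensor2tensor's docstring convention) can contain far fewer sequences
# than a batch of "1024" in a sequences-per-batch scheme.

def batches_by_sequence_count(sequences, batch_size):
    """Group a fixed number of sequences per batch (sequence-count convention)."""
    return [sequences[i:i + batch_size]
            for i in range(0, len(sequences), batch_size)]

def batches_by_token_budget(sequences, max_tokens):
    """Group sequences until a token budget is reached (token-count convention)."""
    batches, current, count = [], [], 0
    for seq in sequences:
        if current and count + len(seq) > max_tokens:
            batches.append(current)
            current, count = [], 0
        current.append(seq)
        count += len(seq)
    if current:
        batches.append(current)
    return batches

# 256 toy sequences, each 32 tokens long.
sequences = [["tok"] * 32 for _ in range(256)]

by_seq = batches_by_sequence_count(sequences, batch_size=64)  # 64 sequences/batch
by_tok = batches_by_token_budget(sequences, max_tokens=1024)  # 1024 tokens/batch

print(len(by_seq[0]))  # 64 sequences -> 64 * 32 = 2048 tokens in the batch
print(len(by_tok[0]))  # 1024 tokens // 32 tokens each -> 32 sequences
```

Under this reading, tensor2tensor's "1024" would be roughly comparable to a few dozen sequences per batch, not 1024 sequences, which is much closer to the ~100 this repo fits in ~11GB.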
Hi, I was wondering why the maximum batch size is ~100 using a GPU with ~11GB of RAM, whereas in tensor2tensor the maximum batch size is 1024?