kevinmgyu opened 2 years ago
May I ask whether my configuration file above is written incorrectly? It is for the input and output of a transformer translation scenario.
The GPU has plenty of memory.
The model itself is under 1 GB.
After switching to a CUDA 11.6 build and upgrading the CUDA driver to 11.6, the server starts normally, but this sub-1 GB model somehow occupies a huge amount of GPU memory.
Sending a request with the triton client produces the error below.
max_step and max_batch_size will influence GPU memory usage.
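To see why these two knobs matter, here is a back-of-envelope sketch in Python. The formula is purely an illustrative assumption (two key/value buffers per decoder layer, fp32), not LightSeq's actual allocator; the point is only how memory scales with max_batch_size and max_step:

    # Rough estimate of decoder key/value buffer size. The scaling below
    # (2 buffers per decoder layer, batch x beam x max_step x hidden, fp32)
    # is an assumption for illustration, not LightSeq's exact formula.
    def kv_cache_bytes(max_batch_size, beam_size, max_step,
                       hidden_size, decoder_layers, dtype_bytes=4):
        return (2 * decoder_layers * max_batch_size * beam_size
                * max_step * hidden_size * dtype_bytes)

    # Values taken from the configs quoted in this thread:
    # max_batch_size=1024 (config.pbtxt), beam size 4, max step 1024,
    # hidden size 1024, 6 decoder layers.
    gib = kv_cache_bytes(1024, 4, 1024, 1024, 6) / 2**30
    print(f"~{gib:.0f} GiB")  # ~192 GiB -- far beyond any single GPU

Even if the real allocator is more frugal than this sketch, the product of max_batch_size (1024 in the config.pbtxt below) and max step (1024 in the generator config) dominates, which is consistent with the cudaMalloc failure in the log.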
May I ask what is causing this problem? The input data is just a simple test; the configuration is as follows.
You can build with debug mode to check.
The lightseq build above was already compiled in debug mode, but it produced no further diagnostic information.
Check here for how to use lightseq debug mode: https://github.com/bytedance/lightseq/blob/9a617306fa/docs/inference/build.md
model config:
  encoder layers: 6
  decoder layers: 6
  hidden size: 1024
  inner size: 4096
  head number: 16
  dim per head: 64
  src vocab size: 40480
  trg vocab size: 42720
  is_post_ln: 0
  no_scale_embedding: 0
  use_gelu: 0
  start_id: 2
  end_id: 6
  padding_id: 2
  multilg_type: 0
generator config:
  beam size: 4
  max step: 1024
  extra decode length (max decode length - src input length): 50
  length penalty: 0.6
  diverse lambda: 0
  sampling method: beam_search
  topk: 1
  topp: 0.75

Server log:
  unable to allocate memory in function AllocateCudaBuffers: out of memory
  E0331 16:32:37.715184 45 dynamic_batch_scheduler.cc:162] Initialization failed for dynamic-batch scheduler thread 3: initialize error for 'transformer_server': (12) cudaMalloc failed
  I0331 16:32:43.967144 45 server.cc:400] Polling model repository
  I0331 16:32:58.967572 45 server.cc:400] Polling model repository
libtransformer_server.so was compiled against CUDA 10.1 and runs on a GPU with CUDA 10.1.
The config.pbtxt file is as follows:

  name: "transformer_server"
  platform: "custom"
  max_batch_size: 1024
  default_model_filename: "libtransformer_server.so"
  input [
    {
      name: "src_ids:0"
      data_type: TYPE_INT32
      dims: [ -1 ]
    }
  ]
  output [
    {
      name: "trg_ids:0"
      data_type: TYPE_INT32
      dims: [ -1, -1, -1 ]
    }
  ]
  instance_group [
    {
      count: 1
    }
  ]
How can this be solved?
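Following the maintainer's note that max_step and max_batch_size drive GPU memory usage, one plausible mitigation is to shrink max_batch_size in config.pbtxt (and, if the model is re-exported, max step as well). The value 32 below is only an illustrative assumption, not advice given in this thread:

  name: "transformer_server"
  platform: "custom"
  max_batch_size: 32   # reduced from 1024; illustrative value, tune to fit GPU memory
  default_model_filename: "libtransformer_server.so"
  # ... input, output, and instance_group unchanged from the config above

This matches the log above, where AllocateCudaBuffers fails during dynamic-batch scheduler initialization: the buffers sized by these limits are allocated when the model loads, so an oversized limit fails even before any request is served.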