BangDaeng closed this issue 2 years ago
@byshiue
Is there a part where you set the random seed? Could you tell me which C++ file it is in? I'm trying to modify it.
The seed is set in the ParallelGpt.cc constructor.
The latest update on the main branch has moved top_k to the runtime inputs. Note that when you set different top_k values within one batch, inference becomes slower.
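For illustration, a minimal sketch of what a per-request runtime top_k input looks like, and why a mixed batch is slower. The variable names here are illustrative, not the exact FT API; FT can take a fast kernel path only when every request in the batch shares one k.

```python
# Hypothetical sketch: per-request runtime top_k values for one batch.
batch_size = 4
uniform_top_k = [40] * batch_size   # same k for every request: fast path
mixed_top_k = [1, 10, 40, 100]      # a different k per request: supported, but slower

def is_uniform(ks):
    """A batch with a single shared k can use the faster sampling kernel."""
    return len(set(ks)) == 1

assert is_uniform(uniform_top_k)
assert not is_uniform(mixed_top_k)
```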
Hi @byshiue, I found in the GPT guide that model inputs like top_k or random_seed support a tensor type:

> xiii. random_seed [1] or [batch_size, 1] on cpu, optional
However, when I tried the gpt python example and replaced random_seed with an IntTensor, it gave me the following error:
[INFO] batch size: 4
[WARNING] gemm_config.in is not found; using default GEMM algo
[WARNING] gemm_config.in is not found; using default GEMM algo
Traceback (most recent call last):
File "../examples/pytorch/gpt/gpt_example.py", line 245, in <module>
main()
File "../examples/pytorch/gpt/gpt_example.py", line 167, in main
tokens_batch = gpt(start_ids,
File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 744, in _call_impl
result = self.forward(*input, **kwargs)
File "/workspace/FasterTransformer/examples/pytorch/gpt/../../../examples/pytorch/gpt/utils/gpt.py", line 299, in forward
outputs = self.model.forward(start_ids,
RuntimeError: forward() Expected a value of type 'int' for argument '_11' but instead found type 'Tensor'.
Position: 11
Value: tensor([11956, 60550, 46756, 76151], dtype=torch.int32)
Declaration: forward(__torch__.torch.classes.FasterTransformer.GptOp _0, Tensor _1, Tensor _2, int _3, int _4, int _5, float _6, float _7, float _8, float _9, float _10, int _11, int _12) -> (Tensor[] _0)
Cast error details: Unable to cast Python instance to C++ type (compile in debug mode for details)
It looks like the random_seed argument only accepts an int. Is there something I misunderstood about the guide?
The PyTorch op only supports a scalar for now.
In the latest release, the FT GPT PyTorch op supports vectored top_k and random_seed.
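A small sketch of what the vectored input would be on the caller's side: one seed per sample, later wrapped in an IntTensor of shape [batch_size] instead of a single Python int. This is an assumption about the calling convention, not the exact op signature.

```python
# Hypothetical sketch: per-sample random seeds for a batch of 4 requests.
# With a scalar seed, every sample in the batch draws from the same stream;
# distinct per-sample seeds give each sample an independent stream.
import random

batch_size = 4
seeds = [random.randrange(0, 2**31) for _ in range(batch_size)]

# These values would then be wrapped, e.g. torch.IntTensor(seeds),
# and passed as the random_seed input of the op.
assert len(seeds) == batch_size
assert all(0 <= s < 2**31 for s in seeds)
```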
Closing this bug because it has been inactive. Feel free to re-open this issue if you still have any problems.
Hello, first of all, thank you for creating this library. I have two questions.
First question: I saw this guide and successfully started Triton Server. Here is my request Python file, and here is my config.pbtxt.
I'm going to use this Python script. I changed it to produce 10 outputs for 10 identical inputs in one batch. I would like to get 10 different results, with top-k sampling applied independently to each input. But I get 10 identical outputs, as in the image below (top-k sampling is not applied per input).
I want to get 10 different results, like in the image below.
How can I get 10 different outputs? (without using dynamic batching)
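One way this is commonly addressed, sketched below: give each copy of the prompt its own random seed so the sampler draws from a different stream per request. This assumes the server honors a per-request random_seed input; the request field names here are illustrative, not the actual Triton tensor names from the config.pbtxt.

```python
# Hypothetical sketch: 10 copies of the same prompt, each with a distinct
# seed, so top-k sampling produces a different continuation per copy.
prompt = "Hello world"
batch = [prompt] * 10
seeds = list(range(10))  # any 10 distinct values will do

requests = [
    {"text": p, "random_seed": s, "top_k": 40}
    for p, s in zip(batch, seeds)
]

# All prompts identical, all seeds distinct:
assert len({r["text"] for r in requests}) == 1
assert len({r["random_seed"] for r in requests}) == 10
```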
Second question: why hasn't dev/v5.0_beta been officially released? I really want to use it, but I'm curious what defects remain.