Open · udbhav-44 opened 5 months ago
I also tried to run a query and hit the same problem; the system only shows "Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation." Has anyone solved this yet? Please help.
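For reference, the "Setting `pad_token_id` to `eos_token_id`" line is an informational message from Hugging Face transformers, not an error: the model defines no pad token, so `generate()` reuses the end-of-sequence token (128001 in this log) for padding. It can be silenced by passing `pad_token_id` explicitly. A minimal sketch, assuming direct access to the model; the checkpoint name is illustrative, not necessarily what localGPT loads:

```python
# Minimal sketch: silencing the pad_token_id message by setting it explicitly.
# The model name below is an assumption for illustration only.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Meta-Llama-3-8B-Instruct"  # illustrative checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("What does this document say?", return_tensors="pt")
# Passing pad_token_id explicitly avoids the message about defaulting
# to eos_token_id for open-ended generation.
output = model.generate(
    **inputs,
    pad_token_id=tokenizer.eos_token_id,
    max_new_tokens=64,
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```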
I got the same message and the query takes forever... Is there any explanation of this warning, and does it affect the query results?
I found the cause: the author built the program to run serially instead of in parallel. While `run_localGPT` is running, monitor your CPU usage (with top or htop); in my case only 1-2 CPU cores are utilized, which is why it runs so slowly.
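If the bottleneck really is single-threaded CPU inference, one thing worth checking is PyTorch's intra-op thread count. A sketch, assuming the model runs on CPU through PyTorch:

```python
# Minimal sketch: inspecting and raising PyTorch's CPU thread count.
# Whether this helps depends on where localGPT actually spends its time.
import os
import torch

print("intra-op threads before:", torch.get_num_threads())
torch.set_num_threads(os.cpu_count() or 1)  # allow all logical cores
print("intra-op threads after: ", torch.get_num_threads())
```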
Same issue here... I also see a lot of SSD reads from the python 3.10 process, even after getting:

```
Truncation was not explicitly activated but `max_length` is provided a specific value, please use `truncation=True` to explicitly truncate examples to max length. Defaulting to 'longest_first' truncation strategy. If you encode pairs of sequences (GLUE-style) with the tokenizer you can select this strategy more precisely by providing a specific strategy to `truncation`.
Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
```
Has anyone found a solution?
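As for the truncation line, that too is a warning rather than an error: a `max_length` was supplied without an explicit truncation strategy, so the tokenizer defaults to 'longest_first'. If you can reach the tokenizer call, passing `truncation=True` makes the choice explicit. A sketch; the checkpoint and `max_length` values are assumptions:

```python
# Minimal sketch: making truncation explicit so the tokenizer stops warning.
# The checkpoint and max_length here are illustrative values.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")
enc = tokenizer(
    "a very long prompt ...",
    truncation=True,   # explicit strategy: silences the warning
    max_length=4096,   # assumed context budget
    return_tensors="pt",
)
```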
I have the same issue. Any help?
I get this error when I try to run a query:

```
Truncation was not explicitly activated but `max_length` is provided a specific value, please use `truncation=True` to explicitly truncate examples to max length. Defaulting to 'longest_first' truncation strategy. If you encode pairs of sequences (GLUE-style) with the tokenizer you can select this strategy more precisely by providing a specific strategy to `truncation`.
Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
C:\Users\Tarun Sridhar\.conda\envs\mummy\lib\site-packages\transformers\models\llama\modeling_llama.py:648: UserWarning: 1Torch was not compiled with flash attention. (Triggered internally at ..\aten\src\ATen\native\transformers\cuda\sdp_utils.cpp:455.)
  attn_output = torch.nn.functional.scaled_dot_product_attention(
```

What are possible fixes?
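On the last line: "1Torch was not compiled with flash attention" is also only a UserWarning. Windows builds of PyTorch often ship without the flash-attention kernel, so `scaled_dot_product_attention` silently falls back to another backend; results are unaffected, though it may run slower. A sketch for checking which SDPA backends your build enables (PyTorch 2.x APIs):

```python
# Minimal sketch: checking which scaled_dot_product_attention backends
# this PyTorch build has enabled (PyTorch 2.x).
import torch

print("flash SDP enabled:        ", torch.backends.cuda.flash_sdp_enabled())
print("mem-efficient SDP enabled:", torch.backends.cuda.mem_efficient_sdp_enabled())
print("math SDP enabled:         ", torch.backends.cuda.math_sdp_enabled())
```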