kroggen opened this issue 1 year ago (status: Open)
Running it under gdb and pressing Ctrl+C while it keeps printing empty lines gives this:
Thread 1 "llama2_q4" received signal SIGINT, Interrupt.
0x00007f5d5a1a06b0 in ?? () from /lib/x86_64-linux-gnu/libcuda.so.1
(gdb) bt
#0 0x00007f5d5a1a06b0 in ?? () from /lib/x86_64-linux-gnu/libcuda.so.1
#1 0x00007f5d59f077cf in ?? () from /lib/x86_64-linux-gnu/libcuda.so.1
#2 0x00007f5d5a27eb2f in ?? () from /lib/x86_64-linux-gnu/libcuda.so.1
#3 0x00007f5d5a27ae67 in ?? () from /lib/x86_64-linux-gnu/libcuda.so.1
#4 0x00007f5d59f1844f in ?? () from /lib/x86_64-linux-gnu/libcuda.so.1
#5 0x00007f5d5a023501 in ?? () from /lib/x86_64-linux-gnu/libcuda.so.1
#6 0x00007f5d5a25fcb6 in ?? () from /lib/x86_64-linux-gnu/libcuda.so.1
#7 0x00007f5d5a0b4b79 in ?? () from /lib/x86_64-linux-gnu/libcuda.so.1
#8 0x000055fcbc873880 in __cudart1073 ()
#9 0x000055fcbc8d1a78 in cudaStreamSynchronize ()
#10 0x000055fcbc868f84 in main (argc=<optimized out>, argv=<optimized out>)
at /root/llama_cu_awq/llama2_q4.cu:488
I have observed it too. I assumed it's due to the model generating a bunch of new line tokens but I had not verified. Are you able to reproduce this with the original cpu version (llama2.c)?
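One way to check the newline-token hypothesis is to count consecutive blank token pieces in the generation loop and log their token ids. The following is only a minimal sketch in C: `BlankRunDetector`, `piece_is_blank` and `blank_run_update` are hypothetical names, not part of llama2_q4.cu, and it assumes the loop has access to the sampled token id and its decoded string.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical debugging helper (not part of llama2_q4.cu):
 * tracks how many decoded token pieces in a row were blank. */
typedef struct {
    int run;    /* current number of consecutive blank pieces */
    int limit;  /* run length at which a loop is reported */
} BlankRunDetector;

/* A piece counts as blank if it is non-empty, contains only
 * whitespace, and contains at least one newline character. */
static bool piece_is_blank(const char *piece) {
    assert(piece != NULL);
    size_t len = strlen(piece);
    if (len == 0) {
        return false;
    }
    return strspn(piece, " \t\r\n") == len && strchr(piece, '\n') != NULL;
}

/* Call once per generated token. Logs the token id of every blank
 * piece to stderr and returns true once `limit` blank pieces have
 * been seen in a row. */
static bool blank_run_update(BlankRunDetector *det, int token, const char *piece) {
    assert(det != NULL);
    if (piece_is_blank(piece)) {
        det->run++;
        fprintf(stderr, "[blank-run] token id %d, run length %d\n", token, det->run);
    } else {
        det->run = 0;
    }
    return det->run >= det->limit;
}
```

In the generation loop, this could be called right after each token is decoded, breaking out (or dumping the logits) when it returns true. If the logged ids are always the tokenizer's newline token, the model really is emitting newlines; if they vary or look out of range, the problem is more likely in sampling or in the kernels.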
I don't remember seeing it on llama2.c.
Although I have seen something like this in the past, I am unable to reproduce it with your prompts. I will take a deeper look if I run into it again.
llama2_q4.exe Llama-2-7b-chat-awq.bin -i "how to use snprintf() in C?"
Model params:- dim: 4096 hidden_dim: 11008 n_heads: 32 n_kv_heads: 32 n_layers: 32 seq_len: 4096 vocab_size: 32000
Loading Weights... done!
how to use snprintf() in C?
I am trying to use the snprintf() function in C to format a string, but I am getting a segmentation fault. Here is the code:
#include <stdio.h>
#include <string.h>
int main() {
char buffer[100];
int num = 42;
char *fmt = "The answer is %d";
int ret = snprintf(buffer, 100, fmt, num);
printf("%s\n", buffer);
return 0;
}
I am not sure what I am doing wrong, but I think it might be related to the fact that I am trying to format a string that is too long. Can someone please help me figure out what is going wrong?
Answer: The problem is that you are trying to format a string that is too long. The snprintf() function will write the formatted string to the buffer array, but it will not check if the array is large enough to hold the formatted string. If the array is too small, it will write beyond the end of the array, which is undefined behavior and can cause a segmentation fault.
To fix the problem, you can use the vsnprintf() function, which is similar to snprintf(), but it will check if the array is large enough to hold the formatted string. Here is an example of how you can use vsnprintf() to format a string:
#include <stdio.h>
#include <string.h>
int main() {
char buffer[100];
int num = 42;
char *fmt = "The answer is %d";
int ret = vsnprintf(buffer, 100, fmt, num);
printf("%s\n", buffer);
return 0;
}
In this example, I used the vsnprintf() function instead of snprintf(), and I passed it the buffer array and the fmt string as arguments. The vsnprintf() function will write the formatted string to the buffer array, and it will check if the array is large enough to hold the formatted string. If the array is too small, it will return the number of characters written, which you can then use to determine how many characters were written to the buffer array.
I hope this helps! Let me know if you have any questions.
achieved tok/s: 198.178506. Tokens: 544, seconds: 2.745
llama2_q4.exe Llama-2-13b-chat-awq.bin -i "what is an inverse square root?"
Model params:- dim: 5120 hidden_dim: 13824 n_heads: 40 n_kv_heads: 40 n_layers: 40 seq_len: 4096 vocab_size: 32000
Loading Weights... done!
what is an inverse square root?
I've been trying to understand this concept for a while now, but I just can't seem to wrap my head around it. Can someone please explain it to me in a way that makes sense?
Thank you!
Best regards, [Your Name]
Hi [Your Name],
I'd be happy to help you understand the concept of an inverse square root!
To start, let's define what we mean by "inverse" and "square root."
An inverse is a quantity that, when multiplied by another quantity, gives us the original quantity. For example, if we have the equation 2x = 6, then 1/2 is the inverse of 6, because 1/2 multiplied by 6 gives us 6.
A square root, on the other hand, is a quantity that, when multiplied by itself, gives us a given number. For example, the square root of 9 is 3, because 3 multiplied by 3 gives us 9.
Now, when we talk about an inverse square root, we're talking about a quantity that, when multiplied by itself, gives us the inverse of a given number. In other words, if we have a number x, then the inverse square root of x is a quantity y such that y multiplied by y gives us x.
To illustrate this concept, let's consider an example. Suppose we have the number 9. The inverse square root of 9 is 3, because 3 multiplied by 3 gives us 9.
So, to summarize:
I hope this explanation helps clarify the concept of an inverse square root for you! Let me know if you have any further questions.
Best regards, [Your Name]
achieved tok/s: 107.610945. Tokens: 468, seconds: 4.349
There is a bug in which it keeps printing blank lines in a loop.
I was not able to find the cause.
It only happens with some prompts. Here is an example in which it happens (7B):
And an example with the 13B: