harrisonvanderbyl / rwkv-cpp-accelerated

A torchless C++ RWKV implementation using 8-bit quantization, written in CUDA/HIP/Vulkan for maximum compatibility and minimum dependencies
MIT License

Endless <|endoftext|> bug #4

Closed Murugurugan closed 1 year ago

Murugurugan commented 1 year ago

I am not sure if it's a problem with the tokenizer or something else, but after the model loads, it just spams <|endoftext|> endlessly. This is on a 4090 with RWKV v10 7B running on Ubuntu. It doesn't matter what I put in the prompt.

harrisonvanderbyl commented 1 year ago

I haven't tested it with 7B or larger models yet, though it should work. Please try changing the batching sizes at the top of rwkv.cu and let me know if anything changes.

I am also working on a better sampler, as I am currently just using greedy sampling.
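
For readers unfamiliar with the distinction, here is a minimal sketch of greedy decoding next to temperature sampling over a vector of logits. It is an illustration only, not the sampler used in this repository: the function names `sample_greedy` and `sample_temperature` and the `std::vector<float>` logits layout are assumptions made for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical helper: greedy decoding always returns the index of the
// largest logit, so the same context always yields the same token.
std::size_t sample_greedy(const std::vector<float>& logits) {
    assert(!logits.empty());
    auto best = std::max_element(logits.begin(), logits.end());
    return static_cast<std::size_t>(best - logits.begin());
}

// Hypothetical helper: temperature sampling draws a token index from
// softmax(logits / temperature). A temperature of zero or below falls
// back to greedy decoding.
std::size_t sample_temperature(const std::vector<float>& logits,
                               float temperature,
                               std::mt19937& rng) {
    assert(!logits.empty());
    if (temperature <= 0.0f) {
        return sample_greedy(logits);
    }
    // Subtract the maximum logit before exponentiating for numerical stability.
    const float max_logit = *std::max_element(logits.begin(), logits.end());
    std::vector<double> weights(logits.size());
    for (std::size_t i = 0; i < logits.size(); ++i) {
        weights[i] = std::exp((logits[i] - max_logit) / temperature);
    }
    // std::discrete_distribution normalizes the weights internally.
    std::discrete_distribution<std::size_t> dist(weights.begin(), weights.end());
    return dist(rng);
}
```

With greedy decoding, a model that ranks one token highest at every step will emit that token indefinitely; temperature sampling spreads the choice across other likely tokens, at the cost of determinism.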

Murugurugan commented 1 year ago

I tried all kinds of values and combinations, but nothing changed; it repeats the same thing.

howard0su commented 1 year ago

I have the same problem.

harrisonvanderbyl commented 1 year ago

This should be resolved in the latest version. Can you confirm whether there are still issues with 7B?

Murugurugan commented 1 year ago

It's starting to work, but it gave me this error this time:

File: /home/work/rwkv-cpp-cuda/samplers/NumCpp/NdArray/NdArrayIterators.hpp
        Function: NdArrayConstIterator
        Line: 70
        Error: NdArray has not been initialized.terminate called after throwing an instance of 'std::runtime_error'
  what():  File: /home/work/rwkv-cpp-cuda/samplers/NumCpp/NdArray/NdArrayIterators.hpp
        Function: NdArrayConstIterator
        Line: 70
        Error: NdArray has not been initialized.
Aborted (core dumped)

harrisonvanderbyl commented 1 year ago

That error has been fixed for 7B.