nomic-ai / pygpt4all

Officially supported Python bindings for llama.cpp + gpt4all
https://nomic-ai.github.io/pygpt4all/
MIT License

GPT4j-v1.3-groovy generates not useful output #88

Closed (jav-ed closed this issue 1 year ago)

jav-ed commented 1 year ago

Please have a look at the following code and then at the generated output:

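from pygpt4all.models.gpt4all_j import GPT4All_J

# Path to the local GPT4All-J model file
model_path = "/home/jav/.local/share/nomic.ai/GPT4All/ggml-gpt4all-j-v1.3-groovy.bin"

def new_text_callback(text):
    # Print each generated piece of text as soon as it arrives
    print(text, end="")

model = GPT4All_J(model_path)

# The prompt is kept exactly as it was run (including the "captial" typo),
# since the log below was produced from it
model.generate("What is the captial of Pakistan",
               new_text_callback=new_text_callback)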

Output:

gptj_model_load: loading model from '/home/jav/.local/share/nomic.ai/GPT4All/ggml-gpt4all-j-v1.3-groovy.bin' - please wait ...
gptj_model_load: n_vocab = 50400
gptj_model_load: n_ctx   = 2048
gptj_model_load: n_embd  = 4096
gptj_model_load: n_head  = 16
gptj_model_load: n_layer = 28
gptj_model_load: n_rot   = 64
gptj_model_load: f16     = 2
gptj_model_load: ggml ctx size = 4505.45 MB
gptj_model_load: memory_size =   896.00 MB, n_mem = 57344
gptj_model_load: ................................... done
gptj_model_load: model size =  3609.38 MB / num tensors = 285
gptj_generate: seed = 1682710517
gpt_tokenize: unknown token ' '
gpt_tokenize: unknown token 'i'
gpt_tokenize: unknown token 's'
gpt_tokenize: unknown token ' '
gpt_tokenize: unknown token 't'
gpt_tokenize: unknown token 'h'
gpt_tokenize: unknown token 'e'
gpt_tokenize: unknown token 'i'
gpt_tokenize: unknown token 'a'
gpt_tokenize: unknown token 'l'
gpt_tokenize: unknown token ' '
gptj_generate: number of tokens in prompt = 4

What captof Pakistan[2023-04-28 21:35:19,772] {model.py:85} WARNING - UnicodeDecodeError of bytes b'\xad'
ahan[2023-04-28 21:35:20,399] {model.py:85} WARNING - UnicodeDecodeError of bytes b'\xad'
nd();[2023-04-28 21:35:21,349] {model.py:85} WARNING - UnicodeDecodeError of bytes b'\xad'
.) Shah[2023-04-28 21:35:22,317] {model.py:85} WARNING - UnicodeDecodeError of bytes b'\xad'
se Raza wason former Pakistan Army officer who batter we dominated in the 9th asserts Brigade in the Pakistan Army[2023-04-28 21:35:29,996] {model.py:85} WARNING - UnicodeDecodeError of bytes b'\xad'
ish on parton former member in the Pakistanmodified Partyndwa[2023-04-28 21:35:34,606] {model.py:85} WARNING - UnicodeDecodeError of bytes b'\xad'
[2023-04-28 21:35:34,947] {model.py:85} WARNING - UnicodeDecodeError of bytes b'\xad'
 and batter in various positions in the exist[2023-04-28 21:35:38,064] {model.py:85} WARNING - UnicodeDecodeError of bytes b'\xad'
 including the Federal Minister for the Environment[2023-04-28 21:35:40,668] {model.py:85} WARNING - UnicodeDecodeError of bytes b'\xad'
 the Federal Minister for Information[2023-04-28 21:35:42,577] {model.py:85} WARNING - UnicodeDecodeError of bytes b'\xad'
 the Federal Minister for Defense Production and the President in Pakistan Sports Board[2023-04-28 21:35:47,128] {model.py:85} WARNING - UnicodeDecodeError of bytes b'\xad'
 Canad retiring have the army[2023-04-28 21:35:49,046] {model.py:85} WARNING - UnicodeDecodeError of bytes b'\xad'
 it part batter weon Member in the Provincial Assembly for Kh[2023-04-28 21:35:53,253] {model.py:85} WARNING - UnicodeDecodeError of bytes b'\xad'
ber Pakhtunk[2023-04-28 21:35:54,922] {model.py:85} WARNING - UnicodeDecodeError of bytes b'\xad'
wa have the Pwa[2023-04-28 21:35:56,899] {model.py:85} WARNING - UnicodeDecodeError of bytes b'\xad'
<|endoftext|>

gptj_generate: mem per token = 15478000 bytes
gptj_generate:     load time =     0.00 ms
gptj_generate:   sample time =    25.30 ms
gptj_generate:  predict time = 38509.57 ms / 323.61 ms per token
gptj_generate:    total time = 39714.46 ms

The answer should be one word: Islamabad - why is it not working?
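
The repeated warning in the log reports a UnicodeDecodeError on the single byte b'\xad'. As a rough sketch of why a byte like that cannot be decoded on its own, here is a small standard-library example. It assumes the bytes are being decoded as UTF-8, which the log does not actually state, and it only illustrates the warning itself; it is not a diagnosis of the garbled answer.

```python
import codecs

lone_byte = b"\xad"  # the byte reported in the warnings

# 0xAD is a UTF-8 continuation byte, so it is invalid without a lead byte.
try:
    lone_byte.decode("utf-8")
    decoded_alone = True
except UnicodeDecodeError:
    decoded_alone = False

# An incremental decoder buffers an incomplete sequence until the rest arrives.
decoder = codecs.getincrementaldecoder("utf-8")()
first = decoder.decode(b"\xc2")   # lead byte only: nothing is emitted yet
second = decoder.decode(b"\xad")  # continuation byte completes U+00AD

print(decoded_alone, repr(first), repr(second))
```

If text is streamed out in byte chunks, decoding each chunk separately can hit this error whenever a multi-byte character is split across chunks.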

abdeladim-s commented 1 year ago

Hi @jav-ed,

This is the same issue as #76. Please update the package and try again.

I believe the issue should be solved (ensure you have pygptj version 1.0.10).
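
One way to confirm which pygptj version is actually installed is sketched below, using the standard library's importlib.metadata. The helper names (installed_version, parse_version, is_at_least) are made up for this illustration and are not part of pygptj or pygpt4all, and the comparison assumes purely numeric dotted versions such as 1.0.10.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def parse_version(text):
    """Turn a purely numeric dotted version such as '1.0.10' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def is_at_least(installed, minimum):
    """Compare two numeric dotted versions component by component."""
    return parse_version(installed) >= parse_version(minimum)


if __name__ == "__main__":
    found = installed_version("pygptj")
    if found is None:
        print("pygptj is not installed")
    else:
        print(f"pygptj {found}; at least 1.0.10: {is_at_least(found, '1.0.10')}")
```

Comparing integer tuples rather than raw strings matters here: as strings, "1.0.10" would sort before "1.0.9".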

jav-ed commented 1 year ago

@abdeladim-s amazing, thank you. It works with pygptj version 1.0.10.

abdeladim-s commented 1 year ago

You are welcome @jav-ed :)

azmainamin commented 1 year ago

@abdeladim-s Does that mean I should uninstall pygptj version 2.0.3 and install 1.0.10?

azmainamin commented 1 year ago

If I do that, I get the following error: [image]