oswaldoludwig / Seq2seq-Chatbot-for-Keras

This repository contains a new generative model of chatbot based on seq2seq modeling.
Apache License 2.0
331 stars, 98 forks

How can a new vocabulary file be generated? #30

Closed. Asrix-AI closed this issue 4 years ago

Asrix-AI commented 4 years ago

Can you please explain how you generated your vocabulary file?

oswaldoludwig commented 4 years ago

https://github.com/oswaldoludwig/Seq2seq-Chatbot-for-Keras/issues/4

oswaldoludwig commented 4 years ago

Just be careful with the indexes of the special symbols BOS and EOS: their positions are fixed in the dictionary.
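A minimal sketch of one way to build such a vocabulary while keeping the special symbols at fixed positions. The function name `build_vocabulary`, the `max_size` parameter, whitespace tokenization, and the choice of indexes 0, 1 and 2 for BOS, EOS and UNK are assumptions made for illustration. The thread does not state the indexes the repository's dictionary actually uses, so check the original vocabulary file before reusing these values.

```python
from collections import Counter

# Assumed order: special symbols occupy the first indexes.
# The real dictionary may reserve different positions for them.
SPECIAL_TOKENS = ["BOS", "EOS", "UNK"]


def build_vocabulary(sentences, max_size=1000):
    """Return (word_to_index, index_to_word) built from raw sentences."""
    # Count lowercase, whitespace-separated words across the corpus.
    counts = Counter(w for s in sentences for w in s.lower().split())

    # Keep the most frequent words; ties are broken alphabetically
    # so that the result is deterministic.
    n_regular = max(0, max_size - len(SPECIAL_TOKENS))
    regular = sorted(counts, key=lambda w: (-counts[w], w))[:n_regular]

    # Special symbols come first, so their indexes never change
    # regardless of the corpus.
    index_to_word = SPECIAL_TOKENS + regular
    word_to_index = {w: i for i, w in enumerate(index_to_word)}
    return word_to_index, index_to_word


corpus = ["hello there", "hello how are you", "i am fine"]
word_to_index, index_to_word = build_vocabulary(corpus)
print(word_to_index["BOS"], word_to_index["EOS"], word_to_index["hello"])
```

Because the special symbols are prepended before any corpus word is added, regenerating the vocabulary from a different corpus leaves their indexes untouched.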

Asrix-AI commented 4 years ago

Thank you.

Asrix-AI commented 4 years ago

Can you please explain why this error is occurring?

Traceback (most recent call last):
  File "conversation.py", line 215, in <module>
    Q = tokenize(query)
  File "conversation.py", line 117, in tokenize
    X = np.asarray([word_to_index[w] for w in tokenized_sentences])
  File "conversation.py", line 117, in <listcomp>
    X = np.asarray([word_to_index[w] for w in tokenized_sentences])
KeyError: 'something'

oswaldoludwig commented 4 years ago

It looks like your dictionary doesn't have the key "something". You need to run the input through a preprocessing function (one exists somewhere in my code) that replaces every word not in the dictionary with a special token, such as UNK.
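A minimal sketch of the kind of fallback described above, not the repository's own preprocessing function. It assumes the dictionary contains an entry named `UNK` and that whitespace splitting is an acceptable stand-in for the tokenizer used in `conversation.py`. The signature of `tokenize` here is hypothetical.

```python
UNK = "UNK"  # assumed name of the unknown-word token in the dictionary


def tokenize(sentence, word_to_index, unk_token=UNK):
    """Map a sentence to a list of indexes, using UNK for unseen words."""
    words = sentence.lower().split()
    unk_index = word_to_index[unk_token]
    # dict.get returns the fallback instead of raising KeyError
    # when a word is missing from the dictionary.
    return [word_to_index.get(w, unk_index) for w in words]


# Toy dictionary for illustration only.
word_to_index = {"BOS": 0, "EOS": 1, "UNK": 2, "tell": 3, "me": 4}
ids = tokenize("tell me something", word_to_index)
print(ids)  # [3, 4, 2]
```

The resulting list can be wrapped in `np.asarray` as in the original line 117 if a NumPy array is needed downstream.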