summerstay / poem-generator

Generates rhyming poetry using Huggingface GPT-2

getting error: #1


stevedipaola commented 4 years ago

Installed via a conda env with Python 3.6.9 and recent PyTorch / transformers on Ubuntu 18.

Changed to model = GPT2LMHeadModel.from_pretrained("gpt2"), but got this error:

(poemGen) root314@sr-02631:~/poem-generator-master$ python poemmaker.py
rhymes loaded
model loaded
starting prompt: love
line length: 5
lines: 5
poemmaker.py:28: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
  sorted_indices_to_remove = torch.tensor(sorted_indices_to_remove, dtype=torch.uint8)
/pytorch/aten/src/ATen/native/IndexingUtils.h:20: UserWarning: indexing with dtype torch.uint8 is now deprecated, please use a dtype torch.bool instead.
/pytorch/aten/src/ATen/native/IndexingUtils.h:20: UserWarning: indexing with dtype torch.uint8 is now deprecated, please use a dtype torch.bool instead.
/pytorch/aten/src/ATen/native/IndexingUtils.h:20: UserWarning: indexing with dtype torch.uint8 is now deprecated, please use a dtype torch.bool instead.
Traceback (most recent call last):
  File "poemmaker.py", line 193, in <module>
    print(tokenizer.decode(token),end="")
  File "/home/root314/anaconda2/envs/poemGen/lib/python3.6/site-packages/transformers/tokenization_utils.py", line 1019, in decode
    sub_texts.append(self.convert_tokens_to_string(current_sub_text))
  File "/home/root314/anaconda2/envs/poemGen/lib/python3.6/site-packages/transformers/tokenization_gpt2.py", line 208, in convert_tokens_to_string
    text = ''.join(tokens)
TypeError: sequence item 0: expected str instance, NoneType found
(poemGen) root314@sr-02631:~/poem-generator-master$

summerstay commented 4 years ago

I will work on figuring out how to fix the warnings. For now, you can suppress them with the following lines:

import warnings
# silence the deprecation UserWarnings from torch / transformers
warnings.filterwarnings("ignore", category=UserWarning)
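
(These lines have to run before the warnings are raised, so put them near the top of poemmaker.py, ahead of the torch/transformers calls.)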

The program failed to generate anything after your prompt. I am unable to reproduce that behavior, but there is some randomness. I will try to put in something to catch this error and try again. There are a few things you can change to make this more likely to work, though:

  1. Use a longer prompt. I recommend a whole sentence or paragraph. Otherwise it has too much freedom to generate things that aren't even legible text.

  2. Use a longer line length. Anything below 8 tokens is probably too short. Note that the last two tokens in a line are the comma (or other ending punctuation) and the line return, and many tokens are less than a full word.

  3. Use the larger gpt2-xl model. It takes a lot longer to download the first time and uses more disk space, but the output is more coherent (see the sketch after this list).
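
To illustrate points 2 and 3, here is a minimal sketch (not code from poemmaker.py; GPT2Tokenizer and GPT2LMHeadModel are the standard Huggingface transformers classes) of the one-line model swap and of how quickly sub-word tokens use up a short line budget:

from transformers import GPT2Tokenizer, GPT2LMHeadModel

# The model swap is a one-line change; the first run downloads
# several gigabytes of weights, so expect a long initial wait.
tokenizer = GPT2Tokenizer.from_pretrained("gpt2-xl")
model = GPT2LMHeadModel.from_pretrained("gpt2-xl")

# Tokens are sub-word pieces: the comma and the line return each
# count as one token, and longer words are often split into several,
# so a 5-token line leaves very little room for actual words.
pieces = tokenizer.tokenize("Her loveliness endures,\n")
print(len(pieces), pieces)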

You might also be interested in my other project, "true_poetry". It enforces meter constraints as well as rhyme constraints, which makes a big improvement.

summerstay commented 4 years ago

I have now included the warning suppression, added a more meaningful error message when nothing is generated, and added recommendations about input prompts in the most recent version.

stevedipaola commented 4 years ago

Thanks, it works now - just needed a longer prompt after your fix. I also got it working with the gwern gpt-2 poetry 5B model, and got your "true_poetry" system up and running with the same model. True_poetry is a bit slower, as it appears to print out a lot of "thinking" (I have to figure out how to turn that off). Will play with both - thanks.

summerstay commented 4 years ago

To not display the thinking, you will just have to search the program for print statements and delete them or turn them into xprint statements (so they won't display unless "debug" is set to True). It won't speed it up, though: true_poetry is a slower program because it recursively backtracks to find a better answer. You can speed it up by adjusting the probability thresholds, at the cost of some quality.
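
As a rough sketch (the names here are assumptions, not necessarily how true_poetry structures it), an xprint helper gated on a module-level debug flag could look like this:

debug = False  # set to True to see the backtracking output again

def xprint(*args, **kwargs):
    # Drop-in replacement for print that only prints when debugging.
    if debug:
        print(*args, **kwargs)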