Open cv277 opened 1 year ago
top_p=10 might be <1?
Unfortunately I still get the error after setting top_p to a value less than one. Thank you though!
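For context on the top_p suggestion above: in nucleus (top-p) sampling, top_p is a cumulative-probability threshold, so only values in (0, 1] are meaningful. The following is a minimal plain-Python sketch of the idea. The function name top_p_filter and the example probabilities are made up for illustration; this is not the code used by ProGen or by transformers.

```python
def top_p_filter(probs, top_p):
    """Keep the smallest set of most-likely tokens whose cumulative
    probability reaches top_p, then renormalize (hypothetical helper)."""
    if not 0.0 < top_p <= 1.0:
        raise ValueError(f"top_p must be in (0, 1], got {top_p}")
    # Sort token indices from most to least likely.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    # Renormalize the surviving probabilities so they sum to 1.
    total = sum(probs[i] for i in kept)
    return {i: probs[i] / total for i in kept}


probs = [0.5, 0.3, 0.15, 0.05]    # illustrative next-token distribution
print(top_p_filter(probs, 0.9))   # keeps tokens 0, 1 and 2
try:
    top_p_filter(probs, 10)       # a threshold above 1 has no meaning here
except ValueError as err:
    print(err)
```

As the follow-up comment notes, correcting top_p alone did not resolve the error in this thread, so this only rules out one possible cause.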
I am getting a warning and an error which are as follows:
Warning: You're using a PreTrainedTokenizerFast tokenizer. Please note that with a fast tokenizer, using the __call__ method is faster than using a method to encode the text followed by a call to the pad method to get a padded encoding.
Error: RuntimeError: "LayerNormKernelImpl" not implemented for 'Half'.
@cv277 were you able to resolve the issue?
To fix this, you should use torch_dtype=torch.float32 instead.
I would like to know what the dataset format looks like. Could you provide it?
I've switched to torch_dtype=torch.float32 but still get this error for progen-base and larger models (though not for progen-small) when I call:
model = ProGenForCausalLM.from_pretrained('/content/drive/MyDrive/progen2-small', torch_dtype=torch.float32, low_cpu_mem_usage=True).to(device)
Has anyone experienced similar issues or is there somewhere else I need to change the dtype?
@oliverfleetwood that works for me, I tried loading the progen2-large model and it loads fine - what error are you encountering?
At first I only ran on CPU. After upgrading CUDA and reinstalling torch, I was able to run the larger models on a GPU with the same setup. I still get the same error when I try to run the larger models (i.e. all except progen-small) on CPU.
I want to fine-tune ProGen2-small on my own dataset. See this Google Colab notebook for an annotated version of the code and the error: https://colab.research.google.com/drive/1_R0xgf6Kw0K88PYF7-ZOCIh9WRSmXN8C?usp=sharing
First I load the model like this:
I am using the Hugging Face Trainer to fine-tune the model with the DataCollatorForLanguageModeling. I load the tokenizer like this:
And then convert it to a PreTrainedTokenizerFast as suggested by: https://github.com/huggingface/tokenizers/issues/325
During fine-tuning, the training loss becomes 0.0000. After training, I attempt to produce new samples:
However, I get this error: RuntimeError: probability tensor contains either inf, nan or element < 0. Please see the above Google Colab notebook for the entire code.
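The error message says the sampler was handed a distribution with non-finite or negative entries. A minimal plain-Python sketch of how a single infinite logit can do that is shown below. It assumes (this is not confirmed anywhere in the thread) that the logits became non-finite during fine-tuning, which a training loss of 0.0000 may hint at. The helper names softmax and is_valid_distribution are illustrative only.

```python
import math


def softmax(logits):
    """Softmax over a list of floats, shifted by the max for stability."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def is_valid_distribution(probs):
    """True when every entry is finite and non-negative."""
    return all(math.isfinite(p) and p >= 0.0 for p in probs)


healthy = softmax([2.0, 1.0, 0.1])
# One infinite logit: inf - inf is nan, and the nan spreads to every entry.
broken = softmax([float("inf"), 1.0, 0.1])

print(is_valid_distribution(healthy))  # True
print(is_valid_distribution(broken))   # False
```

Running an equivalent check on the model's logits before sampling would show whether the problem originates in the fine-tuned weights rather than in the sampling parameters.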