-
After LoRA fine-tuning the deepseek-chat model, there is no adapter_config.json. Other issues suggest this is a transformers version problem: the LoRA parameters were merged directly into the base model's weights. But when I run the LoRA fine-tuned parameters directly, every token the model returns is 0 — it seems no inference is actually happening.
-
Hello Villu,
Thank you for this great package for exporting Spark ML models. However, I am finding it difficult to work with:
My input: a column named 'sentence'
My output: a column named 'predi…
-
Reading [section 4.34 of "CSS Syntax Level 3"](https://www.w3.org/TR/css-syntax-3/#consume-ident-like-token) I am confused by two sentences, which I am unable to properly understand in order to, say, …
-
Hi! I am using the NLLB models for the first time and I am having some trouble making translations of complete documents. I am following the same structure as the Hugging Face tutorial (https://huggi…
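The NLLB-200 checkpoints are intended for sentence-level translation, so a complete document usually has to be split into sentence-sized pieces before being fed to the model. A minimal stdlib-only chunking sketch (the regex sentence boundary and the `max_chars` limit are assumptions for illustration, not part of the Hugging Face tutorial):

```python
import re

def split_into_chunks(document: str, max_chars: int = 512) -> list[str]:
    """Split a document into sentence-based chunks no longer than max_chars."""
    # Naive sentence boundary: sentence-final punctuation followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", document.strip())
    chunks, current = [], ""
    for sentence in sentences:
        if current and len(current) + len(sentence) + 1 > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = f"{current} {sentence}".strip()
    if current:
        chunks.append(current)
    return chunks
```

Each chunk can then be translated separately and the outputs joined back together; a smarter splitter (e.g. one that respects abbreviations) would be needed for production text.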
-
> WordLlama begins by extracting the token embedding codebook from a state-of-the-art LLM (e.g., LLama3 70B), and training a small context-less model in a general purpose embedding framework.
I rea…
-
## Describe the bug
I am trying to eliminate this self-chattiness following several methods found on the internet, but none has worked so far. Can anyone please help with this? I have been stuck w…
-
Thanks for open sourcing the code.
I managed to train a model; however, I cannot run inference on a simple sentence, since it seems I am not calling the model correctly.
Would be glad if you could…
-
Hi! I applied your sentiment model to a df column. At the beginning everything worked fine, but a few minutes ago I got the RuntimeError: generator raised StopIteration. Do you have any idea why and n…
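That particular RuntimeError usually comes from PEP 479: since Python 3.7, a StopIteration that escapes a generator body is converted into a RuntimeError ("generator raised StopIteration"), e.g. when code calls `next()` on an exhausted iterator inside a generator. A minimal stdlib reproduction (the `broken` helper is illustrative, not from the sentiment library):

```python
def broken(rows):
    # Calling next() directly inside a generator body: once the inner
    # iterator is exhausted, StopIteration escapes the generator and,
    # under PEP 479 (Python 3.7+), is converted into a RuntimeError.
    it = iter(rows)
    while True:
        yield next(it)

try:
    list(broken([1, 2]))
except RuntimeError as exc:
    assert "StopIteration" in str(exc)
```

If the library you are calling triggers this internally, the usual fix on the library side is to catch StopIteration and `return` from the generator instead of letting it propagate.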
-
First, thank you so much for sentence-transformer.
How can I get the embedding vector when the input is already tokenized?
I guess sentence-transformers can `.encode(original text)`.
But I want …
-
I know that the following can calculate loss
![image](https://github.com/ftramer/LM_Memorization/assets/84905965/4624d05e-a046-4850-9903-f12fd864e3e2)
However, why should the labels be the input_ids? After reading the…
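In Hugging Face causal language models, passing `labels=input_ids` works because the shift happens inside the model: the logits at position t are scored against the token at position t+1, and the final position is dropped. A stdlib-only sketch of that shifted negative log-likelihood (the toy probability table stands in for softmaxed logits; it is purely illustrative):

```python
import math

def shifted_nll(token_ids, probs):
    """Average negative log-likelihood when labels == input_ids.

    probs[t][v] is the model's probability of vocab item v at position t.
    The prediction at position t targets the NEXT token, so position t is
    scored against token_ids[t + 1]; the last position has no target.
    """
    losses = [
        -math.log(probs[t][token_ids[t + 1]])
        for t in range(len(token_ids) - 1)
    ]
    return sum(losses) / len(losses)
```

So the labels are not compared to the same positions they came from; the internal shift is what makes next-token prediction out of an identical labels tensor.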