-
https://huggingface.co/docs/transformers/v4.38.2/perf_train_gpu_one#gradient-accumulation
In the `TrainingArguments` passed to `SFTTrainer`, we can likely reduce the total GPU memory required to tr…
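The linked docs describe gradient accumulation: keep the per-device micro-batch small and step the optimizer only once every N micro-batches, trading a little compute time for a large memory saving. A minimal sketch of the arithmetic (the helper function is ours; the commented `per_device_train_batch_size` and `gradient_accumulation_steps` names are the actual `TrainingArguments` fields from the linked page):

```python
# Gradient accumulation trades steps for memory: the optimizer steps
# once per `accum_steps` micro-batches, so training sees the larger
# effective batch without the larger batch's activation memory.

def effective_batch_size(per_device_batch: int, accum_steps: int,
                         num_devices: int = 1) -> int:
    """Effective global batch size under gradient accumulation."""
    return per_device_batch * accum_steps * num_devices

# e.g. TrainingArguments(per_device_train_batch_size=1,
#                        gradient_accumulation_steps=8) on one GPU:
print(effective_batch_size(1, 8))  # 8
```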
-
How can the model be evaluated on GLEU tasks? The tasks are text-only, but the paper says: “Similar to PLM, when the prefix image is none, this task degenerates into a “text-to-image generation” task, f…
-
The system environment is Windows + torch 2.4.1 + CUDA 12.4. The error message is as follows:
LayerUtility: JoyCaption2
apply_chat_template requires jinja2>=3.1.0 to be installed. Your version is 3.0.3.
2024-10-12 01:11:09,476 - root - INFO - got prom…
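Given the log, upgrading jinja2 past the minimum (e.g. `pip install -U "jinja2>=3.1.0"`) should clear this error. A quick sketch of the version comparison `apply_chat_template` is effectively doing (the helper name is ours, not from transformers):

```python
# Compare a dotted version string against the required minimum.
def meets_minimum(ver: str, minimum=(3, 1, 0)) -> bool:
    """True if `ver` is at least `minimum` (major, minor, patch)."""
    parts = tuple(int(p) for p in ver.split(".")[:3])
    return parts >= minimum

print(meets_minimum("3.0.3"))  # False: the installed version is too old
print(meets_minimum("3.1.4"))  # True: would satisfy apply_chat_template
```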
-
Currently, LinkML's language definition includes constructs that aren't fully supported by all generators.
Identifying these gaps often requires trial and error (i.e., finding out the hard way), creat…
-
Hello!
I did some research (using llama.cpp) and found that quantizing the input and embedding tensors to f16 and the other tensors to q5_k or q6_k gives excellent results, almost indistinguisha…
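For reference, a mixed-precision quantization like this can be expressed with llama.cpp's quantize tool via its per-tensor overrides; this is a hypothetical invocation with placeholder file names, so verify the flag names against your build's `--help`:

```shell
# Override the token-embedding and output tensors to f16 while the
# trailing argument sets the default type (q6_k) for everything else.
./llama-quantize \
  --token-embedding-type f16 \
  --output-tensor-type f16 \
  model-f16.gguf model-mixed-q6_k.gguf q6_k
```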
-
### Prerequisites
- [X] I am running the latest code. Mention the version if possible as well.
- [X] I carefully followed the [README.md](https://github.com/ggerganov/llama.cpp/blob/master/README.md)…
-
Hi!
I'm not sure if this is a problem that can be solved, or needs to be solved. Basically, we want to make a kind of hybrid tokenizer, in which we add a whole bunch of whole words to a tokenizer, …
-
E.g. in a pure Clojure/Cljs/Datomic stack this might not be necessary. For others, metosin/spec-tools transformers might be a better option.
-
Subscribe to this issue and stay notified about new [daily trending repos in PureScript](https://github.com/trending/purescript?since=daily).
-
### Description
I am getting the following error when I load the model in `predict` mode; it works perfectly in `eval` mode.
```
ValueError: Incompatible shapes for matmul arguments: (8, 1, 64) and (2…