-
**Suggested steps:**
* [ ] Define unsupervised learning tasks, i.e., learning tasks that don't require truth-level labels but instead rely solely on the reconstruction-level data. This is the same…
-
Hi,
I am trying to apply reward modelling to an IterableDataset and am hitting a strange failure mode that I am struggling to debug. I can replicate the same stack trace in the reward_m…
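For reference, a minimal sketch of the kind of streaming setup involved. The class and field names (`PairStream`, `chosen`/`rejected`) are illustrative, not necessarily the trainer's expected schema:

```python
from torch.utils.data import DataLoader, IterableDataset


class PairStream(IterableDataset):
    """Streams (chosen, rejected) text pairs without materializing them all in memory."""

    def __init__(self, pairs):
        self.pairs = pairs  # any iterable; could be a generator reading a file

    def __iter__(self):
        for chosen, rejected in self.pairs:
            yield {"chosen": chosen, "rejected": rejected}


# Usage: the default collate turns each string field into a list per batch.
stream = PairStream([("good answer", "bad answer")])
batch = next(iter(DataLoader(stream, batch_size=1)))
```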
-
Attached, please find the output of the WebUI and the Ollama server console. At line 1 of the WebUI output, I ask the question using llama3:latest (line 3). The result is shown in lines 4-42.
At line 45, I ask sam…
-
When doing inference on Gemma-2-2B with Flash Attention 2, I get the following error. It works just fine with Flash Attention disabled.
transformers==4.44.0
torch==2.4.0
flash-attn==2.6.3
python…
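As a debugging workaround, the attention backend can be selected at model load time via the `attn_implementation` argument; a hedged sketch (the fallback helper below is my own, not part of transformers):

```python
import importlib.util


def pick_attn_implementation(prefer_flash: bool = True) -> str:
    # Fall back to the default "eager" path when flash-attn is not installed,
    # or when you need to rule Flash Attention 2 out, as in the bug above.
    if prefer_flash and importlib.util.find_spec("flash_attn") is not None:
        return "flash_attention_2"
    return "eager"


# Hypothetical usage (not executed here; requires downloading the weights):
# model = AutoModelForCausalLM.from_pretrained(
#     "google/gemma-2-2b",
#     attn_implementation=pick_attn_implementation(),
#     torch_dtype=torch.bfloat16,
# )
```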
-
### 🚀 The feature, motivation and pitch
[GPT-2 SDPA](https://github.com/huggingface/transformers/blob/main/src/transformers/models/gpt2/modeling_gpt2.py#L182-L220) pattern is not currently being ma…
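For context, the linked GPT-2 path dispatches to `torch.nn.functional.scaled_dot_product_attention`. A minimal sketch of the call shape involved, checked against a hand-rolled causal attention (tensor shapes follow GPT-2's `(batch, heads, seq, head_dim)` layout; the sizes are arbitrary):

```python
import math

import torch
import torch.nn.functional as F

# Tiny causal-attention check: the fused SDPA call should match
# softmax(QK^T / sqrt(d)) V with an upper-triangular mask.
q = torch.randn(1, 2, 4, 8)
k = torch.randn(1, 2, 4, 8)
v = torch.randn(1, 2, 4, 8)

fused = F.scaled_dot_product_attention(q, k, v, is_causal=True)

scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))
mask = torch.triu(torch.ones(4, 4, dtype=torch.bool), diagonal=1)
scores = scores.masked_fill(mask, float("-inf"))
manual = torch.softmax(scores, dim=-1) @ v
```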
-
Hello,
Thanks for creating this very helpful tool!
I am fine-tuning the **_model (GPT-J-6B)_** for question answering on private documents. I have 1000+ documents, and they are all in text f…
-
https://aclanthology.org/2024.eacl-long.105
-
# Problem
After converting the weights, the following problem appears during evaluation:
```shell
> number of parameters on (tensor, pipeline) model parallel rank (0, 0): 630167424
loading release checkpoint from /raid/LLM_train/Pai-Megatron-Patch/checkpoint…
```
-
I'm running Ollama on my Mac M1 and I'm trying to use the 7B models for processing batches of questions/answers.
I noticed that after a while Ollama just hangs and the process stays there forever.
…
-
Facing an issue while setting up the repo and installing during this step:
> pip3 install -e .
The error occurs while building the wheel for the xformers package. I am using a MacBook M1. Any leads wou…