-
### 🚀 The feature, motivation and pitch
torchchat currently uses the HF Hub, which has its own model cache; torchchat then copies the model into its own model directory, so you end up with two copies of the same mode…
byjlw updated 1 month ago
-
### Describe the bug
My issue is this error. I was working with the same 'HF_token' (which has write permission) and with Mistral Nemo 12B Instruct; the model was working well from last…
-
### Describe the bug
We fine-tune models and store them on the Hugging Face Hub. The model names are built from the dataset ID and experiment name, such as:
Company/client1_fielddata_[date]__favoritemodel_finetu…
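As a minimal sketch of the naming scheme described above (the organization, dataset ID, and experiment values are illustrative stand-ins):

```python
def build_repo_name(org: str, dataset_id: str, experiment: str) -> str:
    """Compose a Hub repo id as <org>/<dataset_id>__<experiment>."""
    return f"{org}/{dataset_id}__{experiment}"

# Illustrative values mirroring the pattern in the issue.
print(build_repo_name("Company", "client1_fielddata_2024-06-01", "favoritemodel_finetune"))
```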
-
How do I convert the PPO-trained model (.pt) into HF format?
I tried to use this file for the conversion, with the following command:
```shell
python scripts/convert_checkpoint_to_hf.py \
--…
```
-
Hi!
First of all, amazing work! I'm trying to load the model with the pretrained weights from HF, but I'm receiving an error while doing so.
My first attempt:
`model = AutoModelForSeq2SeqLM.from…
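For reference, the usual loading pattern for a seq2seq checkpoint looks like the sketch below; `"t5-small"` is a stand-in for the actual model id from the issue, which is not shown in the excerpt:

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# "t5-small" is a placeholder; substitute the repo id of the
# pretrained weights being loaded in the issue.
model_id = "t5-small"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)
print(type(model).__name__)
```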
-
Hi,
Niels here from the open-source team at Hugging Face. I discovered your work through the paper page: https://huggingface.co/papers/2405.19707 (feel free to claim the paper so that it appears un…
-
The iterative prompting process takes a lot of time; it might be better to implement streaming similar to the OpenAI API:
```python
for resp in chatbot.query(
    "Hello",
    stream=True,
):
    print(resp…
```
-
Hello, thank you for providing this excellent model and repository. I encountered an issue while conducting my experiments with your codebase, and I’d appreciate your insights.
In my experiments, I…
-
Hello,
Direct3D sounds very interesting; congratulations on the paper!
I wonder if you would consider adding Direct3D to the Hugging Face Hub. Doing so would increase the model's visibility and make it eas…
-
Hello OLMoE Authors:
I have read the updates on the Sparse Upcycling method in the README and tried to implement it. I want to reproduce the conclusions about Sparse Upcycling in your paper that load OLMo…