-
# Prerequisites
Please answer the following questions for yourself before submitting an issue.
- [X] I am running the latest code. Development is very rapid so there are no tagged versions as of…
-
### Describe the issue
I am facing errors with DataParallel.
-
# Description
Current challenges in using Neural Operators include irregular meshes, multiple inputs, multiple inputs on different meshes, and multi-scale problems. [1] The Attention mechanism is promi…
-
I tried to add a TelechatForSequenceClassification class to modeling_telechat in the TeleChat (Xingchen) open-source repository, modeled on the Qwen implementation and on TeleChat's own code respectively. The two attempts fail in different ways: one cannot load the model, and the other trains but the loss does not decrease. We would like the AI company's help in figuring out how to support the AutoModelForSequenceClassification task.
class TelechatForSequenceClassif…
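A common pitfall when porting a Qwen-style sequence-classification head is the pooling step: the classification logits are read from the hidden state of the last non-padding token, which is located via the pad token id. Below is a minimal standalone sketch of that index computation in plain Python; the names and the pad id are hypothetical and not taken from the Telechat code:

```python
# Sketch of the "last non-pad token" pooling used by Qwen-style
# *ForSequenceClassification heads. Standalone illustration only;
# the real head would apply a linear score layer at these indices.

PAD_TOKEN_ID = 0  # assumption for this sketch; Telechat's real pad id may differ

def last_non_pad_index(input_ids, pad_token_id=PAD_TOKEN_ID):
    """Return, per sequence, the index of the last non-pad token."""
    indices = []
    for seq in input_ids:
        idx = 0
        for i, tok in enumerate(seq):
            if tok != pad_token_id:
                idx = i  # keeps advancing until the last real token
        indices.append(idx)
    return indices

# Right-padded batch: logits should be read at positions 2 and 1.
batch = [[5, 7, 9, 0, 0],
         [3, 4, 0, 0, 0]]
print(last_non_pad_index(batch))  # → [2, 1]
```

If the pooling instead reads the literal last position of a right-padded batch (a pad token), the classifier only ever sees padding embeddings, which is one plausible way to get a training loss that never decreases; it may be worth checking this step in both ports.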
tcoln updated 2 weeks ago
-
I tried to fine-tune [TinyLlama](https://huggingface.co/TinyLlama/TinyLlama-1.1B-Chat-v1.0) with this crate. After training, the saved safetensors file contains only two tensors:
```
lora_llama.b0
lora_…
```
-
### Checklist
- [X] The issue has not been resolved by following the [troubleshooting guide](https://github.com/lllyasviel/Fooocus/blob/main/troubleshoot.md)
- [ ] The issue exists on a clean install…
-
### Model description
BASED is an attention model that combines sliding-window attention with global linear attention to capture dependencies similar to those of transformers at subquadratic cost.
It …
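The sliding-window component of such a model can be illustrated by the attention mask it induces: each query position attends only to itself and the previous `w - 1` positions, forming a causal band. A small standalone sketch of that mask, not taken from the BASED code:

```python
# Causal sliding-window attention mask: position i may attend to
# positions j with 0 <= i - j < window. Standalone illustration only;
# not the actual BASED implementation.

def sliding_window_mask(seq_len, window):
    """1 where attention is allowed, 0 where it is masked out."""
    return [[1 if 0 <= i - j < window else 0
             for j in range(seq_len)]
            for i in range(seq_len)]

for row in sliding_window_mask(5, 2):
    print(row)
# each row i allows j in {i-1, i}, e.g. row 3 → [0, 0, 1, 1, 0]
```

The global linear-attention half, roughly speaking, replaces softmax attention with a kernel feature map so that the context can be summarized in a fixed-size state, which is what keeps the overall cost subquadratic; the window handles precise local dependencies the linear part smooths over.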
-
Dear authors,
I encountered weight-explosion problems while integrating LoRA into torchtitan. I am running the train_configs/llama3_8b.toml config with run_llama_train.sh on 4 A10 24GB GPUs. PyT…
-
Hello, thank you for the amazing work. Is it possible to use QLoRA to fine-tune the 4-bit quantized models?
-
Hello,
congratulations on your awesome work!
When I try to use my own TIFF files, I get a black image as the result.
Am I missing something?
Thank you for your effort, and I look forwar…