-
First open LLM from [@SnowflakeDB](https://twitter.com/SnowflakeDB)! Arctic is a 480B Dense-MoE model that combines a 10B dense transformer with a 128x3.66B MoE MLP, designed specifically for enterprise AI. 🤔
T…
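
As a quick sanity check on the numbers quoted above, the 10B dense part plus 128 experts of 3.66B each comes to roughly the advertised 480B. The sketch below only redoes that arithmetic. The function name is made up for illustration, and it assumes the quoted sizes simply add, which the post does not spell out.

```python
# Sizes in billions of parameters, taken from the post above.
DENSE_PARAMS_B = 10.0    # dense transformer
NUM_EXPERTS = 128        # number of MoE experts
EXPERT_PARAMS_B = 3.66   # parameters per expert MLP


def total_params_billions(dense_b: float, num_experts: int, expert_b: float) -> float:
    """Hypothetical helper: total size if the dense and expert parts simply add."""
    return dense_b + num_experts * expert_b


total = total_params_billions(DENSE_PARAMS_B, NUM_EXPERTS, EXPERT_PARAMS_B)
print(f"Total parameters: about {total:.2f}B")
```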
-
In this branch: https://github.com/huggingface/safetensors/compare/julien-c/js I pushed a proof-of-concept of how, given the simplicity of the format, one can fetch metadata about the weights over sma…
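
To illustrate why the format makes this easy, here is a minimal Python sketch (not the JavaScript proof-of-concept from the branch). A safetensors file starts with an 8-byte little-endian unsigned integer giving the length of a JSON header, followed by the header itself, so tensor names, dtypes, and shapes can be read without touching the weight data. The helper names and the tiny in-memory example file are assumptions made for illustration. In a remote setting the same two slices could presumably be fetched with HTTP Range requests, which is also an assumption here.

```python
import json
import struct


def parse_safetensors_header(data: bytes) -> dict:
    """Read only the JSON header from the first bytes of a safetensors file."""
    if len(data) < 8:
        raise ValueError("need at least 8 bytes to read the header length")
    # First 8 bytes: header length as a little-endian unsigned 64-bit integer.
    (header_len,) = struct.unpack("<Q", data[:8])
    end = 8 + header_len
    if len(data) < end:
        raise ValueError("buffer does not contain the full header")
    return json.loads(data[8:end].decode("utf-8"))


def build_example_file() -> bytes:
    """Hypothetical helper: build a tiny in-memory file with one 2x2 F32 tensor."""
    header = {"weight": {"dtype": "F32", "shape": [2, 2], "data_offsets": [0, 16]}}
    header_bytes = json.dumps(header).encode("utf-8")
    # Layout: length prefix, JSON header, then the raw tensor bytes (zeros here).
    return struct.pack("<Q", len(header_bytes)) + header_bytes + bytes(16)


if __name__ == "__main__":
    metadata = parse_safetensors_header(build_example_file())
    for name, info in metadata.items():
        print(name, info["dtype"], info["shape"])
```

Only the length prefix and the header are needed for metadata, so the amount of data read stays small even for very large checkpoints.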
-
I tried to train for 10000 iterations, but the results are -1:
2021-07-02 11:05:55,223 fcos_core INFO: Using 2 GPUs
2021-07-02 11:05:55,223 fcos_core INFO: Namespace(config_file='configs/fcos/fcos_R_50_…
-
Hello!
I'm a senior high school student, and I would like to ask whether there is any way to convert a checkpoint from Megatron-LM into the Hugging Face format after training.
This is bec…
-
### Checklist
- [X] 1. I have searched related issues but cannot get the expected help.
- [X] 2. The bug has not been fixed in the latest version.
### Describe the bug
I have fine-tuned xcomp…
-
# bash command
```bash
output_dir='lora/OneKE'
mkdir -p ${output_dir}
CUDA_VISIBLE_DEVICES="3" python3 src/finetune.py \
--do_train --do_eval \
--overwrite_output_dir \
--model_name_or…
-
@raoyongming Thanks for providing the ImageNet weights. I was trying to load the weights you shared to do some analysis for benchmarking. However, when I load them for `gfnet-ti` or `gf…
-
We are interested in applying this model architecture to our MSD segmentation task, possibly with small modifications to the architecture itself. What do we have to change to make this…
-
### Your current environment
The output of `python collect_env.py`
```text
PyTorch version: 2.4.0+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A…
-
### System Info
### Running Xinference with Docker?
- [X] docker
- [ ] pip install
- [ ] installation from s…