-
### Reference code
- Llama-recipes code
[https://github.com/meta-llama/llama-recipes/tree/b7fd81c71239c67345d897c0eb6529eba076e8b8](https://github.com/meta-llama/llama-recipes/tree/b7fd81c71239c67345d897c0eb6529eba076e8b8)
-
- [ ] [WisdomShell/kieval: A Knowledge-grounded Interactive Evaluation Framework for Large Language Models](https://github.com/WisdomShell/kieval)
-
*Concise Description:*
I'd like to use JAX for distributed training of LLMs. The new release of Keras also supports JAX as a backend, in addition to TensorFlow.
*Describe the solution you'd like*
…
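As a concrete starting point, here is a minimal sketch (assuming Keras 3 with JAX installed) of selecting the JAX backend before building a model; the tiny model and random data are placeholders, not an LLM training setup:

```python
# Minimal sketch: selecting the JAX backend in Keras 3.
# The backend must be set before keras is imported.
import os
os.environ["KERAS_BACKEND"] = "jax"

import keras
import numpy as np

# Tiny toy model; a real LLM would be far larger and would use the
# keras.distribution APIs for multi-device sharding.
model = keras.Sequential([
    keras.layers.Dense(64, activation="relu"),
    keras.layers.Dense(10),
])
model.compile(
    optimizer="adam",
    loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
)

x = np.random.rand(32, 16).astype("float32")
y = np.random.randint(0, 10, size=(32,))
model.fit(x, y, epochs=1, verbose=0)  # runs on the JAX backend
```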
-
Thanks for your awesome work!
I was wondering whether you have any results on vision models such as ViT or Stable Diffusion?
-
@mukel, thank you for creating this project! I would like to discuss the following topics:
1. Please enable the Discussions tab for posts like this, which are not real "issues"
2. Do you plan on rele…
-
Here is the Google Colab link I used for fine-tuning:
[https://colab.research.google.com/drive/1kiALBR1UarPobiftZmiHfwFyk7hTCDnV?usp=sharing](https://colab.research.google.com/drive/1kiALBR1UarPobiftZmiHfwFyk7hTCDnV?usp=sharing)
When I fine-tune the LLM-embed for tool retriev…
-
Hi, I am trying to fine-tune LLaMA on commonsense_170k. However, I find that once the loss reaches about 0.6, it almost stops decreasing. Is this normal?
` WORLD_SIZE=2 CUDA_VISIBLE_DEVICES=1,2,3,4 …
-
The main limitation of LLMs is their huge model size; moreover, during training, the VRAM/RAM needed to store the model plus the backpropagation state (gradients and optimizer parameters) is much higher than during inference.
A…
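To make the gap concrete, here is a rough back-of-the-envelope sketch, assuming a 7B-parameter model, fp16 weights, and Adam with fp32 master weights and moment estimates (the common mixed-precision recipe); activations and framework overhead are ignored, so real usage is higher:

```python
# Rough estimate of training vs. inference memory for a 7B-parameter model.
params = 7e9

weights_fp16 = params * 2        # 2 bytes per parameter
grads_fp16   = params * 2        # gradients in fp16
adam_master  = params * 4        # fp32 master copy of the weights
adam_moments = params * 4 * 2    # fp32 first and second moments

inference_gb = weights_fp16 / 1e9
training_gb  = (weights_fp16 + grads_fp16 + adam_master + adam_moments) / 1e9

print(f"inference (weights only): ~{inference_gb:.0f} GB")            # ~14 GB
print(f"training (weights + grads + optimizer): ~{training_gb:.0f} GB")  # ~112 GB
```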
-
As the title says, the command is as follows:
python pretrain.py --pretrained_model_path models/llama-7b.bin --dataset_path datasets/ceshi --spm_model_path /u01/wangcheng/llm/llama/tokenizer.model --config_path models/llama/7b_config.js…
-
## DIN-SQL: Decomposed In-Context Learning of Text-to-SQL with Self-Correction
### Summary by Copilot
- **DIN-SQL** stands for **Decomposed In-Context Learning of Text-to-SQL with Self-Correctio…
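
For reference, a hedged Python sketch of how such a decomposed pipeline could be wired together; the four stages follow the paper's high-level description, while `call_llm` and the prompt strings are placeholders, not the paper's actual prompts:

```python
# Hedged sketch of a DIN-SQL-style decomposed text-to-SQL pipeline.
def call_llm(prompt: str) -> str:
    # Hypothetical helper standing in for whatever LLM API is used.
    raise NotImplementedError("plug in your LLM client here")

def din_sql(question: str, schema: str) -> str:
    # 1. Schema linking: identify tables/columns relevant to the question.
    links = call_llm(f"Schema:\n{schema}\n\nQuestion: {question}\n"
                     "List the tables and columns needed to answer it.")

    # 2. Classification & decomposition: estimate query complexity
    #    (easy / non-nested / nested) and break the question into steps.
    plan = call_llm(f"Question: {question}\nSchema links: {links}\n"
                    "Classify the query complexity and outline sub-questions.")

    # 3. SQL generation conditioned on the plan and schema links.
    sql = call_llm(f"Schema:\n{schema}\nLinks: {links}\nPlan: {plan}\n"
                   f"Write a SQL query answering: {question}")

    # 4. Self-correction: ask the model to review and fix its own SQL.
    return call_llm(f"Schema:\n{schema}\nQuestion: {question}\n"
                    f"Candidate SQL:\n{sql}\n"
                    "Check the query for errors and return a corrected version.")
```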