axolotl-ai-cloud / axolotl
Go ahead and axolotl questions
https://axolotl-ai-cloud.github.io/axolotl/
Apache License 2.0 · 6.98k stars · 766 forks
Issues
#1775 Training stops early without setting the early_stopping_patience parameter · leoozy · opened 1 day ago · 0 comments
#1774 Migrate multipack to refactored flash attention · casper-hansen · opened 2 days ago · 1 comment
#1773 swaps to use newer sample packing for mistral · winglian · opened 2 days ago · 0 comments
#1772 add support for simpo via cpo trainer · winglian · opened 2 days ago · 0 comments
#1771 Fix untrained tokens · winglian · closed 2 days ago · 1 comment
#1770 Data Gets Tokenized Before Special Tokens Are Added · hammoudhasan · opened 3 days ago · 2 comments
#1769 bump transformers and set roundup_power2_divisions for more VRAM improvements, low bit ao optimizers · winglian · closed 3 days ago · 0 comments
#1767 Unsloth rope · winglian · closed 3 days ago · 0 comments
#1766 Training Freeze after "Shuffle merged datasets" (and adding position ids) · e-p-armstrong · closed 3 days ago · 1 comment
#1765 re-enable PYTORCH_CUDA_ALLOC_CONF expandable_segments · winglian · closed 4 days ago · 0 comments
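Issue #1765 re-enables the expandable_segments option of PyTorch's caching allocator, which lets existing memory segments grow instead of reserving new fixed-size blocks and can reduce fragmentation. A minimal sketch of applying that setting (the variable name and value are PyTorch's documented allocator config; the setdefault pattern is just one way to wire it in):

```python
import os

# PYTORCH_CUDA_ALLOC_CONF must be set before torch initializes its CUDA
# caching allocator, so export it before any torch import (or in the shell).
# setdefault lets a value already exported in the environment win.
os.environ.setdefault("PYTORCH_CUDA_ALLOC_CONF", "expandable_segments:True")

print(os.environ["PYTORCH_CUDA_ALLOC_CONF"])
```

Equivalently, `PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True` can be exported in the launching shell before training starts.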
#1764 [DO NOT MERGE] bump accelerate and transformers to main · winglian · opened 4 days ago · 0 comments
#1763 add torch_compile_mode options · winglian · closed 4 days ago · 0 comments
#1762 set the number of dataset processes on the DPO Config rather than the trainer · winglian · closed 4 days ago · 0 comments
#1761 Possible Bug in Chat Template Preprocessing · hammoudhasan · opened 4 days ago · 2 comments
#1760 fix num gpu check · winglian · closed 4 days ago · 0 comments
#1759 fixes to accelerator so that iterable pretraining datasets work · winglian · closed 4 days ago · 0 comments
#1758 Enable Ascend NPU support · MengqingCao · opened 6 days ago · 1 comment
#1757 update modal package and don't cache pip install · winglian · closed 5 days ago · 0 comments
#1756 Add flexible configuration options for `chat_template` dataset training · Tostino · opened 6 days ago · 0 comments
#1755 torch compile and cuda alloc improvements · winglian · closed 5 days ago · 0 comments
#1754 support for llama multipack using updated code/patches · winglian · closed 5 days ago · 0 comments
#1753 TinyLlama pretrain fails, but SFT works -- CUDA error: an illegal memory access was encountered · chromecast56 · closed 4 days ago · 5 comments
#1752 add q-galore optimizer · winglian · opened 1 week ago · 1 comment
#1751 Support for Flash Attention 3 · creatorrr · closed 1 week ago · 2 comments
#1750 RuntimeError: Cannot re-initialize CUDA in forked subprocess. To use CUDA with multiprocessing, you must use the 'spawn' start method · RishabhMaheshwary · opened 1 week ago · 7 comments
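The traceback in #1750 is PyTorch's standard complaint when a process forks after CUDA has been initialized: a forked child inherits the parent's CUDA state and cannot re-initialize it. A minimal stdlib-only sketch of the 'spawn' start method the error asks for (no CUDA involved here; `square` is a stand-in for a real worker):

```python
import multiprocessing as mp

def square(x):
    # In a real training job this worker would touch CUDA; with the default
    # "fork" start method on Linux that fails once the parent has
    # initialized CUDA. "spawn" starts a fresh interpreter per worker.
    return x * x

if __name__ == "__main__":
    # get_context scopes the start method to this pool, avoiding the
    # global (and once-only) mp.set_start_method("spawn") call.
    ctx = mp.get_context("spawn")
    with ctx.Pool(2) as pool:
        print(pool.map(square, [1, 2, 3]))  # [1, 4, 9]
```

The `if __name__ == "__main__"` guard matters with spawn: each worker re-imports the main module, and unguarded top-level code would run again in every child.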
#1749 Potential max_input_len Issue/Inconsistency? · williambarberjr · opened 1 week ago · 3 comments
#1747 Error unfreezing intermediate layers · ayushsml · opened 1 week ago · 2 comments
#1746 update to pytorch 2.3.1 · winglian · closed 1 week ago · 0 comments
#1745 add torch 2.3.1 base image · winglian · closed 1 week ago · 0 comments
#1744 Changed URL for dataset docs · dameikle · closed 1 week ago · 0 comments
#1742 fixes to prevent vram spike when train starts · winglian · closed 1 week ago · 1 comment
#1741 Spawning multiple processes on card 0 when performing distributed training on a single machine with multiple cards? · MengqingCao · opened 1 week ago · 0 comments
#1740 bump xformers to 0.0.27 · akshaylive · closed 1 week ago · 2 comments
#1738 bump flash attention 2.5.8 -> 2.6.1 · winglian · closed 1 week ago · 2 comments
#1737 add tests so CI can catch updates where patches will break with unsloth · winglian · closed 1 week ago · 0 comments
#1736 New Alignment Algorithm: SPPO · kaykyr · opened 1 week ago · 0 comments
#1735 Implements SPPO Alignment Algorithm · kaykyr · opened 1 week ago · 2 comments
#1734 Fixes the urls after org move · mhenrichsen · closed 1 week ago · 0 comments
#1733 remove the bos token from dpo outputs · winglian · opened 1 week ago · 0 comments
#1732 Allow using tokenizer's default chat template with fallbacks · chiragjn · opened 1 week ago · 8 comments
#1731 Add option to raise error for long seqs + drop seqs with no outputs · chiragjn · opened 1 week ago · 0 comments
#1730 bump trl and accelerate for latest releases · winglian · closed 1 week ago · 1 comment
#1729 [Docs] Documentation Page 404 · tjtanaa · closed 1 week ago · 1 comment
#1728 Default to Chat Template in tokenizer_config · hammoudhasan · closed 2 weeks ago · 0 comments
#1727 add basic support for the optimi adamw optimizer · winglian · closed 1 week ago · 0 comments
#1726 full weights fsdp training seems broken with fsdp_cpu_ram_efficient_loading · winglian · closed 2 weeks ago · 0 comments
#1725 Add a `chat_template` prompt strategy for DPO · fozziethebeat · closed 21 hours ago · 3 comments
#1724 add support for .env files for env vars · winglian · closed 2 weeks ago · 0 comments
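#1724 adds .env file support for environment variables. The PR's actual implementation isn't shown in this listing; the following is a hypothetical stdlib-only sketch of the general idea (`load_env_file` is an illustrative name, not axolotl's API, and real loaders such as python-dotenv handle quoting and interpolation that this skips):

```python
import os

def load_env_file(path=".env"):
    """Minimal .env loader: KEY=VALUE lines, '#' comments, no quoting rules."""
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            # Skip blank lines, comments, and anything without a '='.
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            # Don't clobber variables already set in the real environment.
            os.environ.setdefault(key.strip(), value.strip())
```

Loading happens before training config is resolved, so values like `WANDB_PROJECT` or `HF_TOKEN` set in the file behave as if they were exported in the shell.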
#1722 Added eager_attention to AxolotlInputConfig · dameikle · closed 2 weeks ago · 1 comment
#1720 New Optimizer: Implement Adam-Mini optimizer · SicariusSicariiStuff · opened 3 weeks ago · 0 comments