ServiceNow
/
Fast-LLM
Accelerating your LLM training to full speed
https://servicenow.github.io/Fast-LLM/
38 stars, 5 forks
Issues (newest first)
#60  [feat] don't download if already downloaded (tscholak, closed 14 hours ago, 0 comments)
#59  [bug] Nans and/or desync for sequence-tensor-parallel. (jlamypoirier, opened 1 day ago, 0 comments)
#58  Check for nans in TP desync check (jlamypoirier, closed 1 day ago, 0 comments)
#57  [bug] Crash with sequence-data-parallel (jlamypoirier, opened 1 day ago, 0 comments)
#56  [bug] Sparse copy runs out of shared memory with many experts (sohamparikh, opened 1 day ago, 1 comment)
#55  llama3 rope (RaymondLi0, opened 1 day ago, 3 comments)
#54  [feat] Require linting to pass before merging (jlamypoirier, opened 1 day ago, 0 comments)
#53  fix z loss keyword arg (sohamparikh, closed 1 day ago, 0 comments)
#52  [feat] Support triton cross-entropy for larger vocabularies (jlamypoirier, opened 2 days ago, 0 comments)
#51  Quick-start guide feedback (tscholak, opened 2 days ago, 3 comments)
#50  Continuation from #49 (tscholak, opened 2 days ago, 0 comments)
#49  Improve quickstart guide (jlamypoirier, opened 3 days ago, 2 comments)
#48  clamping initialized weights (sohamparikh, closed 20 hours ago, 0 comments)
#47  Fix checkpoint backward compatibility, improve metadata error reporting (jlamypoirier, closed 3 days ago, 0 comments)
#46  Add distributed init method to prepare command (tscholak, closed 2 days ago, 1 comment)
#45  Intermittent Triton Kernel Compilation Failure in Fast-LLM Due to Stale File Handle (Errno 116) (tscholak, opened 6 days ago, 0 comments)
#44  Split dataset (jlamypoirier, closed 3 days ago, 0 comments)
#43  Llama 70B benchmarking (rafapi, closed 1 day ago, 1 comment)
#42  Fix editable installation (tscholak, closed 1 week ago, 0 comments)
#41  Dataset from modular configuration (jlamypoirier, opened 1 week ago, 0 comments)
#40  Samplable dataset (jlamypoirier, closed 1 week ago, 0 comments)
#39  [feat] Llama 3.x rope scaling support (tscholak, opened 1 week ago, 2 comments)
#38  Add prepare command (tscholak, closed 1 week ago, 2 comments)
#37  Dataset wrapper classes (jlamypoirier, closed 1 week ago, 0 comments)
#36  Fix rope scaling when loading llama configs (tscholak, closed 1 week ago, 0 comments)
#35  Refactor GPT data (jlamypoirier, closed 1 week ago, 0 comments)
#34  [Prototype] Flexible dataset configuration (jlamypoirier, opened 2 weeks ago, 0 comments)
#33  New long-term checkpoint format (jlamypoirier, closed 2 weeks ago, 0 comments)
#32  [build] cpp compile during setup (tscholak, closed 2 weeks ago, 2 comments)
#31  Checkpoint format (jlamypoirier, closed 3 weeks ago, 0 comments)
#30  Fix fast distributed checkpoint loading (jlamypoirier, closed 3 weeks ago, 0 comments)
#29  [enhancement] Integrate C++ compilation in setup.py to streamline installation (tscholak, closed 2 weeks ago, 1 comment)
#28  Checkpoint metadata (jlamypoirier, closed 3 weeks ago, 0 comments)
#27  Roadmap (jlamypoirier, opened 3 weeks ago, 1 comment)
#26  Speed up checkpoint serialization (jlamypoirier, opened 3 weeks ago, 3 comments)
#25  [feat] Integrate dataset re-weighting and preprocessing into Fast-LLM for streamlined data loading (tscholak, opened 3 weeks ago, 7 comments)
#24  Fix faster tests (jlamypoirier, closed 4 weeks ago, 0 comments)
#23  Faster tests (jlamypoirier, closed 4 weeks ago, 0 comments)
#22  Modular checkpointing (jlamypoirier, closed 3 weeks ago, 0 comments)
#21  Checkpoint submodule (jlamypoirier, closed 4 weeks ago, 0 comments)
#20  [WIP] GRPO (rafapi, opened 4 weeks ago, 0 comments)
#19  Fix export dtype (jlamypoirier, closed 4 weeks ago, 0 comments)
#18  Simplified checkpoint loading (jlamypoirier, closed 4 weeks ago, 0 comments)
#17  [docs] revamp documentation (tscholak, closed 3 days ago, 1 comment)
#16  Update CONTRIBUTING.md (tscholak, closed 1 month ago, 0 comments)
#15  Update SECURITY.md (hughesthe1st, opened 1 month ago, 3 comments)
#14  Update SECURITY.md (tscholak, closed 1 month ago, 0 comments)
#13  Add feature request template (tscholak, closed 1 month ago, 0 comments)
#12  [PROTOTYPE] Unified save/load (jlamypoirier, closed 4 weeks ago, 0 comments)
#11  Add bug template (tscholak, closed 1 month ago, 0 comments)