foundation-model-stack / fms-hf-tuning

🚀 Collection of tuning recipes with HuggingFace SFTTrainer and PyTorch FSDP.

Apache License 2.0

28 stars, 48 forks

issues

feat: Added outputs folder to gitignore

#396 Luka-D opened 3 days ago
1
feat: Added outputs folder to gitignore

#395 Luka-D closed 3 days ago
1
fix: bad name in generic tracker ADR

#394 dushyantbehl closed 4 days ago
1
fix: enable use of pretokenized dataset with padding_free

#393 HarikrishnanBalagopal closed 5 days ago
2
fix: support dataset directory created with save_to_disk

#392 HarikrishnanBalagopal closed 6 days ago
1
chore(release): merge set of changes for v2.1.2

#391 willmj closed 6 days ago
1
feat: [WIP] dataclass args for accelerated MoE tuning

#390 willmj opened 1 week ago
3
docs: Update supported models

#389 aluu317 closed 2 weeks ago
1
build: Jim and Alex dev image

#388 jbusche opened 2 weeks ago
1
build(deps): set lower limit for transformers to 4.45 for granite 3.0

#387 willmj closed 2 weeks ago
2
chore(release): merge set of changes for v2.1.1

#386 anhuong closed 2 weeks ago
2
build(deps): Update accelerate requirement from !=0.34,<1.1,>=0.20.3 to >=0.20.3,!=0.34,<1.2

#385 dependabot[bot] opened 3 weeks ago
1
build(deps): set transformers below 4.46, waiting on fixes

#384 anhuong closed 3 weeks ago
1
test: get highest checkpoint instead of hard coded path

#383 anhuong opened 3 weeks ago
13
docs: Update Supported Models List in README

#382 tharapalanivel closed 3 weeks ago
2
feat: DataProcessor v2

#381 dushyantbehl opened 4 weeks ago
1
feature request: custom templating logic doesn't support `{{ input }}` , only `{{input}}`, use Jinja2 instead https://jinja.palletsprojects.com/en/3.1.x/sandbox/

#380 HarikrishnanBalagopal opened 1 month ago
0
add scanner; output directly to stdout

#379 ChanderG closed 1 month ago
1
feat: Add timing tracker to fms-hf-tuning

#378 willmj opened 1 month ago
6
build(deps): Update torch requirement from <2.5,>=2.2.0 to >=2.2.0,<2.6

#377 dependabot[bot] opened 1 month ago
1
chore: merge set of changes for v2.1.0

#376 aluu317 closed 1 month ago
2
build(deps): torch<2.5 due to FA2 error with new version

#375 anhuong closed 1 month ago
1
docs: Data Preprocessor ADR

#374 dushyantbehl closed 4 days ago
2
Inconsistent GPU memory usage of QLoRA (vs LoRA) with different numbers of GPUs

#373 albertoperdomo2 opened 1 month ago
5
build(deps): Update accelerate requirement from !=0.34,<0.35,>=0.20.3 to >=0.20.3,!=0.34,<1.1

#372 dependabot[bot] closed 1 month ago
2
build(deps): Upgrade accelerate requirement to allow version 1.0.0

#371 willmj closed 1 month ago
2
build: Set triton environment variables

#370 willmj closed 1 month ago
2
build(deps): Update simpleeval requirement from <1.0,>=0.9.13 to >=0.9.13,<2.0

#369 dependabot[bot] opened 1 month ago
1
refactor: Removal of pad_token when padding_free is set

#368 Abhishek-TAMU closed 1 month ago
6
FMS-HF-Tuning QLoRA fine-tuning crashes because it cannot access `/.triton` in OpenShift

#367 kpouget closed 1 month ago
1
build(deps): Update accelerate requirement from <0.34,>=0.20.3 to >=0.20.3,<1.1

#366 dependabot[bot] closed 1 month ago
2
fix: crash when output directory doesn't exist

#365 HarikrishnanBalagopal closed 1 month ago
6
fix: crash when output directory doesn't exist

#364 HarikrishnanBalagopal closed 1 month ago
4
chore: update code owners

#363 anhuong closed 1 month ago
1
DO NOT MERGE: test updating transformers

#362 anhuong closed 1 month ago
2
ci: run unit tests, fmt, image build on release branch

#361 anhuong closed 1 month ago
1
fix: Revert changes from PR#326 with trl version change

#360 willmj closed 1 month ago
3
bug: crash when the `--output_dir` doesn't exist and multiple GPUs (processes) are used.

#359 HarikrishnanBalagopal closed 1 month ago
2
build(deps): unset hardcoded trl version to get latest updates

#358 anhuong closed 1 month ago
4
release: merge set of changes for v2.0.0

#357 Abhishek-TAMU closed 1 month ago
0
fix: check for wte.weight along with embed_tokens.weight

#356 willmj closed 1 month ago
1
build(deps): update transformers and accelerate deps

#355 anhuong closed 1 month ago
3
build(deps): Update peft requirement from <0.13,>=0.8.0 to >=0.8.0,<0.14

#354 dependabot[bot] closed 1 month ago
1
build(deps): Update transformers requirement from <4.45,>4.41 to >4.41,<4.46

#353 dependabot[bot] closed 1 month ago
2
fix: unable to find output_dir in multi-GPU during resume_from_checkpoint check

#352 Abhishek-TAMU closed 2 months ago
2
feat: Add post processing logic to accelerate launch

#351 willmj closed 2 months ago
1
build: install additional fms-acceleration plugins

#350 anhuong closed 2 months ago
1
fix: cap transformers at v4.44

#349 anhuong closed 2 months ago
3
refactor: move tokenizer_data_utils with the rest of utils, add further unit testing.

#348 willmj closed 1 month ago
2
Missing dependency in fms-hf-tuning image for `padding_free` and `multipack`

#347 sutaakar closed 1 month ago
2
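
Issue #380 above reports that the custom templating logic accepts `{{input}}` but not `{{ input }}`, and suggests Jinja2 as a replacement. As a rough illustration of the whitespace problem only, here is a minimal standard-library sketch of a placeholder substitution that tolerates spaces inside the braces. It is not the project's implementation and not Jinja2; the names `PLACEHOLDER` and `render` are hypothetical and chosen for this example.

```python
import re

# Hypothetical pattern: matches {{name}} and {{ name }} alike by allowing
# optional whitespace on either side of the placeholder name.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render(template: str, values: dict) -> str:
    """Replace each placeholder with its value; leave unknown names untouched."""

    def substitute(match: re.Match) -> str:
        key = match.group(1)
        # Fall back to the original placeholder text when no value is supplied.
        return str(values[key]) if key in values else match.group(0)

    return PLACEHOLDER.sub(substitute, template)


if __name__ == "__main__":
    example = {"input": "What is FSDP?", "output": "A sharding strategy."}
    print(render("### Input: {{ input }}\n### Response: {{output}}", example))
```

A regex like this only covers plain name substitution. A full template engine such as Jinja2, as the issue proposes, would also handle filters, conditionals, and sandboxing.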