huggingface/optimum-habana — Issues

Easy and lightning fast training of 🤗 Transformers on Habana Gaudi processor (HPU)
Apache License 2.0 · 153 stars · 202 forks
Open and recently closed issues, newest first:
| #     | Title                                                                                           | Author            | Status | Last activity | Comments |
|-------|-------------------------------------------------------------------------------------------------|-------------------|--------|---------------|----------|
| #1521 | Added custom mamba op and fix the mamba cache issue                                             | zzhang37          | opened | 2 hours ago   | 0 |
| #1520 | implement fused sdpa for wav2vec2 (#18)                                                         | astachowiczhabana | opened | 3 hours ago   | 1 |
| #1519 | Add support for optimized SDXL pipeline                                                         | sushildubey171    | opened | 5 hours ago   | 1 |
| #1518 | [SW-205385] Add DynamicMoE support for Mixtral (#10)                                            | astachowiczhabana | opened | 6 hours ago   | 1 |
| #1517 | [SW-209062] Disable default sdpa in Albert (#22)                                                | astachowiczhabana | opened | 6 hours ago   | 1 |
| #1516 | [SW-200913] Removed workaround for NaN bug causing graph break.                                 | astachowiczhabana | opened | 8 hours ago   | 1 |
| #1515 | [SW-198498] pass "lazy_mode" arg to GaudiLlamaModel GaudiTrainer                                | astachowiczhabana | opened | 8 hours ago   | 1 |
| #1514 | [SW-208980] Add option to use bf16 in PT sdp (#5)                                               | astachowiczhabana | opened | 9 hours ago   | 1 |
| #1513 | [SW-204998] Memory optimization for gpt_bitcode (#4)                                            | astachowiczhabana | opened | 9 hours ago   | 1 |
| #1512 | add Qwen2-VL static generation                                                                  | Spycsh            | opened | 14 hours ago  | 1 |
| #1511 | Add DynamicMoE support for Mixtral                                                              | kwisniewski98     | opened | 1 day ago     | 1 |
| #1510 | Fixed Gemma FP8 flash_attention lower throughput issue                                          | kplau1128         | opened | 1 day ago     | 6 |
| #1509 | enable dynamic compile for mpi                                                                  | chaojun-zhang     | opened | 1 day ago     | 2 |
| #1508 | [wav2vec2] Remove tensor.item and dynamic slicing operations in the loop that cause graph break | chaojun-zhang     | opened | 1 day ago     | 1 |
| #1507 | Migrate OH CLIP (roberta-clip) training to torch.compile                                        | chaojun-zhang     | opened | 1 day ago     | 1 |
| #1506 | Migrate OH T5-large training to torch.compile                                                   | chaojun-zhang     | opened | 1 day ago     | 1 |
| #1505 | Support FP8 model fallback KVCache to bfloat16                                                  | changwangss       | opened | 2 days ago    | 0 |
| #1504 | Pr1280 fix                                                                                      | Luca-Calabria     | opened | 2 days ago    | 5 |
| #1503 | fix lora padding loss problem                                                                   | ranzhejiang       | opened | 2 days ago    | 0 |
| #1502 | add fused kernel config support for run_clm.py                                                  | ranzhejiang       | opened | 2 days ago    | 0 |
| #1501 | Adding support for Context Parallelism using Deepseed's DistributedAt…                          | bhargaveede       | opened | 2 days ago    | 2 |
| #1500 | add check_neural_compressor_min_version for 4 bit behavior                                      | xin3he            | opened | 2 days ago    | 2 |
| #1499 | [WIP] Diffusers upgrade 0.31.0                                                                  | pi314ever         | closed | 2 days ago    | 6 |
| #1498 | Move fast tests to Gaudi2                                                                       | regisss           | closed | 3 days ago    | 1 |
| #1497 | Makes the with_stack of the profiler changeable                                                 | ranzhejiang       | opened | 4 days ago    | 3 |
| #1496 | Fixes in unify_measurements                                                                     | HolyFalafel       | opened | 5 days ago    | 0 |
| #1495 | Refactor resolve_beam to fix recursion depth issue                                              | Yanli2190         | closed | 6 days ago    | 1 |
| #1494 | Upgrade ViT README with torch.compile                                                           | astachowiczhabana | closed | 1 day ago     | 1 |
| #1493 | Fix trust_remote_code                                                                           | astachowiczhabana | closed | 1 day ago     | 1 |
| #1492 | Remove keep_input_mutations                                                                     | astachowiczhabana | closed | 1 day ago     | 1 |
| #1491 | Remove torch req from LM example                                                                | astachowiczhabana | closed | 1 day ago     | 1 |
| #1490 | enable tiiuae/falcon-11B-vlm in image_to_text example. fix the incorr…                          | sywangyi          | opened | 1 week ago    | 2 |
| #1489 | Add warmup time and compile time log for the eval/prediction.                                   | jiminha           | opened | 1 week ago    | 1 |
| #1488 | Optimum-Habana docs re-org                                                                      | dsocek            | opened | 1 week ago    | 4 |
| #1487 | support llava1.5 lora finetuning.                                                               | lkk12014402       | opened | 1 week ago    | 1 |
| #1486 | text-generation: improve output printing                                                        | mgonchar          | opened | 1 week ago    | 3 |
| #1485 | readme: replace tabs with spaces                                                                | mgonchar          | closed | 6 days ago    | 1 |
| #1484 | Intel Gaudi MPI job for multi-node,multi-card                                                   | sramakintel       | closed | 1 week ago    | 0 |
| #1483 | Create MPI job for multi-node multi-card gaudi workflow                                         | sramakintel       | closed | 1 week ago    | 0 |
| #1482 | FLUX Fine-Tuning for Gaudi                                                                      | dsocek            | opened | 1 week ago    | 1 |
| #1481 | Fix bridgetower example (#312)                                                                  | astachowiczhabana | opened | 1 week ago    | 4 |
| #1480 | Enable Falcon-mamba                                                                             | yuanwu2017        | opened | 1 week ago    | 5 |
| #1479 | Add support for Baichuan2                                                                       | xhaihao           | opened | 1 week ago    | 2 |
| #1478 | Add chatglm                                                                                     | mengker33         | opened | 1 week ago    | 1 |
| #1477 | Error when running chatglm3_6b: NotImplementedError: Unknown device for graph fuser             | BaideBear         | opened | 1 week ago    | 1 |
| #1476 | Support loading 4 bit Qwen2                                                                     | mengniwang95      | closed | 3 days ago    | 5 |
| #1475 | enable DeepSeek-V2                                                                              | yao-matrix        | opened | 2 weeks ago   | 3 |
| #1474 | Revert "fix bug when loading 4bit checkpoint quantized in INC"                                  | xin3he            | closed | 2 weeks ago   | 0 |
| #1473 | update lm_eval version                                                                          | alexey-belyakov   | opened | 2 weeks ago   | 1 |
| #1472 | Support beam search with reuse_cache and bucket_internal                                        | Wei-Lin-Intel     | opened | 2 weeks ago   | 5 |