issues
HabanaAI/Model-References
Reference models for Intel(R) Gaudi(R) AI Accelerator
155 stars, 81 forks
#56 Bump transformers from 4.19.2 to 4.38.0 in /PyTorch/examples/gpu_migration/generative_models/stable-diffusion (dependabot[bot], opened 2 weeks ago, 0 comments)
#55 Bump scikit-learn from 1.2.1 to 1.5.0 in /PyTorch/computer_vision/segmentation/Unet (dependabot[bot], opened 2 weeks ago, 0 comments)
#54 Using the method provided below, I am unable to download the datasets and models. (wenchao987, opened 3 weeks ago, 1 comment)
#53 Fix the NLTKSegmenter failure issue (YuningQiu, opened 1 month ago, 0 comments)
#52 YOLOX: Evaluation script is enabled for HPU (dsmertin, opened 2 months ago, 0 comments)
#51 [Feature Request] YOLOX inference optimized on Gaudi2 (intelyoungway, opened 2 months ago, 0 comments)
#50 AttributeError: 'HabanaParameterWrapper' object has no attribute 'partition_numel' when running the Llama2 70B training benchmark (Shivakumar-M, opened 3 months ago, 2 comments)
#49 Model-Reference content for 1.17.0 (Alberto-Villarreal, closed 3 months ago, 3 comments)
#48 Fix Readme (Jianhong-Zhang, closed 2 months ago, 1 comment)
#47 Update the multi-tenants example to support torch.compile and disable lazy mode (wangkl2, opened 4 months ago, 0 comments)
#46 Disable lazy mode and promote torch.compile in CV MNIST example (ZailiWang, opened 4 months ago, 0 comments)
#45 remove lazy mode and add torch.compile (ZhaoqiongZ, opened 4 months ago, 0 comments)
#44 [PyTorch/computer_vision] pass dtype to HPUModel in HPUJITModel init (ctao456, opened 4 months ago, 0 comments)
#43 tgi-gaudi fix (szutenberg, closed 3 months ago, 0 comments)
#42 Unet - inference issues (kkurzacz-intel, opened 5 months ago, 1 comment)
#41 Blocked with missing Bert FT steps (gbertulf, opened 6 months ago, 1 comment)
#40 "Training Data Packing" got error - RuntimeError: Maximum number of iterations reached. (jingkang99, opened 6 months ago, 3 comments)
#39 where is habana_perf_tool located (jingkang99, closed 8 months ago, 1 comment)
#38 download bert data from bertPrep.py failed (Fred-cell, opened 9 months ago, 0 comments)
#37 HL_NUM_NODES=1 HL_PP=2 HL_TP=4 HL_DP=1 scripts/run_llama13b.sh command is stuck (tileintel, closed 11 months ago, 1 comment)
#36 Is the command bloom fp8 inference on 8card wrong? (BaihuiJin, opened 12 months ago, 1 comment)
#35 stable-diffusion-v-2-1 txt2img example fails with RuntimeError: Graph compile failed. (ctodd, opened 1 year ago, 3 comments)
#34 [Megatron-DeepSpeed script] Fix memory usage did not get printed correctly (kefeiyao, opened 1 year ago, 0 comments)
#33 remove hmp support, user should use autocast for mixed precision support (huijuanzh, closed 1 year ago, 0 comments)
#32 MLPerf 3.0 multi-nodes supports (neonadia, opened 1 year ago, 1 comment)
#31 Tensorflow Bert training continues evaluation (PurvangL, closed 1 year ago, 1 comment)
#30 Multi-tenant test running ResNet50 (pradeepk-intel, opened 1 year ago, 0 comments)
#29 How to check if HPU exists? (tengerye, closed 1 year ago, 1 comment)
#28 docker: Error response from daemon (tengerye, closed 1 year ago, 1 comment)
#27 CVE-2007-4559 Patch (TrellixVulnTeam, closed 2 years ago, 1 comment)
#26 Habana Gaudi HPUs Training time improvement (purvang3, closed 1 year ago, 2 comments)
#25 How to execute pytorch on specific device? (vuiseng9, opened 2 years ago, 3 comments)
#24 GPT2 from Model-Reference Training Hanging Issue (yidinghabana, closed 2 years ago, 1 comment)
#23 Syntax Error in Line 61 : unet2d.py (hitesh-anand, closed 2 years ago, 2 comments)
#22 What is the license of the model data you provide? (jeremiah, closed 1 year ago, 3 comments)
#21 load_library issues with custom op (eladhoffer, closed 2 years ago, 2 comments)
#20 Missing run_lazy_mode option in the argparser in PyTorch's Unet example (JoeyTPChou, closed 2 years ago, 3 comments)
#19 tensor does not have a device (anti-machinee, opened 2 years ago, 2 comments)
#18 resource_tracker: There appear to be 45 leaked semaphore objects to clean up at shutdown (anti-machinee, opened 2 years ago, 11 comments)
#17 Got `urllib.error.HTTPError: HTTP Error 403: Forbidden` while downloading the pretrained model for the BERT Base and Huggingface DistilBERT model variant (JoeyTPChou, closed 2 years ago, 5 comments)
#16 Missing script and command typo in Model-References/PyTorch/nlp/GPT2/GettingTheDataset.md document (JoeyTPChou, closed 2 years ago, 5 comments)
#15 Memory of gaudi is occupied fully no mater how many batchsize is (anti-machinee, closed 2 years ago, 3 comments)
#14 training time is slow because of PReLU (anti-machinee, closed 2 years ago, 6 comments)
#13 What really memory of a single gaudi is (anti-machinee, closed 2 years ago, 3 comments)
#12 Select index of gaudi automatically and can not use the available ones (anti-machinee, closed 2 years ago, 5 comments)
#11 ArcFace layer works fine on CUDA but worse on Habana (anti-machinee, closed 2 years ago, 2 comments)
#10 Error in loss.backward() (anti-machinee, closed 2 years ago, 4 comments)
#9 test pr flow (iluxaimeerovich, opened 3 years ago, 32 comments)
#8 ResNet50 Keras: Export to saved_model as stated by the documentation (levzlotnik, opened 3 years ago, 0 comments)
#7 No runtime habana (Addyvan, closed 3 years ago, 1 comment)