HabanaAI Model-References issues

Reference models for Intel(R) Gaudi(R) AI Accelerator

155 stars, 81 forks

Bump transformers from 4.19.2 to 4.38.0 in /PyTorch/examples/gpu_migration/generative_models/stable-diffusion

#56 dependabot[bot] opened 2 weeks ago
0
Bump scikit-learn from 1.2.1 to 1.5.0 in /PyTorch/computer_vision/segmentation/Unet

#55 dependabot[bot] opened 2 weeks ago
0
Using the method provided below, I am unable to download the datasets and models.

#54 wenchao987 opened 3 weeks ago
1
Fix the NLTKSegmenter failure issue

#53 YuningQiu opened 1 month ago
0
YOLOX: Evaluation script is enabled for HPU

#52 dsmertin opened 2 months ago
0
[Feature Request] YOLOX inference optimized on Gaudi2

#51 intelyoungway opened 2 months ago
0
AttributeError: 'HabanaParameterWrapper' object has no attribute 'partition_numel' when running the Llama2 70B training benchmark

#50 Shivakumar-M opened 3 months ago
2
Model-Reference content for 1.17.0

#49 Alberto-Villarreal closed 3 months ago
3
Fix Readme

#48 Jianhong-Zhang closed 2 months ago
1
Update the multi-tenants example to support torch.compile and disable lazy mode

#47 wangkl2 opened 4 months ago
0
Disable lazy mode and promote torch.compile in CV MNIST example

#46 ZailiWang opened 4 months ago
0
remove lazy mode and add torch.compile

#45 ZhaoqiongZ opened 4 months ago
0
[PyTorch/computer_vision] pass dtype to HPUModel in HPUJITModel init

#44 ctao456 opened 4 months ago
0
tgi-gaudi fix

#43 szutenberg closed 3 months ago
0
Unet - inference issues

#42 kkurzacz-intel opened 5 months ago
1
Blocked with missing Bert FT steps

#41 gbertulf opened 6 months ago
1
"Training Data Packing" got error - RuntimeError: Maximum number of iterations reached.

#40 jingkang99 opened 6 months ago
3
where is habana_perf_tool located

#39 jingkang99 closed 8 months ago
1
download bert data from bertPrep.py failed

#38 Fred-cell opened 9 months ago
0
HL_NUM_NODES=1 HL_PP=2 HL_TP=4 HL_DP=1 scripts/run_llama13b.sh command is stuck

#37 tileintel closed 11 months ago
1
Is the command bloom fp8 inference on 8card wrong?

#36 BaihuiJin opened 12 months ago
1
stable-diffusion-v-2-1 txt2img example fails with RuntimeError: Graph compile failed.

#35 ctodd opened 1 year ago
3
[Megatron-DeepSpeed script] Fix memory usage did not get printed correctly

#34 kefeiyao opened 1 year ago
0
remove hmp support, user should use autocast for mixed precision support

#33 huijuanzh closed 1 year ago
0
MLPerf 3.0 multi-nodes supports

#32 neonadia opened 1 year ago
1
Tensorflow Bert training continues evaluation

#31 PurvangL closed 1 year ago
1
Multi-tenant test running ResNet50

#30 pradeepk-intel opened 1 year ago
0
How to check if HPU exists?

#29 tengerye closed 1 year ago
1
docker: Error response from daemon

#28 tengerye closed 1 year ago
1
CVE-2007-4559 Patch

#27 TrellixVulnTeam closed 2 years ago
1
Habana Gaudi HPUs Training time improvement

#26 purvang3 closed 1 year ago
2
How to execute pytorch on specific device?

#25 vuiseng9 opened 2 years ago
3
GPT2 from Model-Reference Training Hanging Issue

#24 yidinghabana closed 2 years ago
1
Syntax Error in Line 61 : unet2d.py

#23 hitesh-anand closed 2 years ago
2
What is the license of the model data you provide?

#22 jeremiah closed 1 year ago
3
load_library issues with custom op

#21 eladhoffer closed 2 years ago
2
Missing run_lazy_mode option in the argparser in PyTorch's Unet example

#20 JoeyTPChou closed 2 years ago
3
tensor does not have a device

#19 anti-machinee opened 2 years ago
2
resource_tracker: There appear to be 45 leaked semaphore objects to clean up at shutdown

#18 anti-machinee opened 2 years ago
11
Got `urllib.error.HTTPError: HTTP Error 403: Forbidden` while downloading the pretrained model for the BERT Base and Huggingface DistilBERT model variant

#17 JoeyTPChou closed 2 years ago
5
Missing script and command typo in Model-References/PyTorch/nlp/GPT2/GettingTheDataset.md document

#16 JoeyTPChou closed 2 years ago
5
Memory of gaudi is occupied fully no matter how many batchsize is

#15 anti-machinee closed 2 years ago
3
training time is slow because of PReLU

#14 anti-machinee closed 2 years ago
6
What really memory of a single gaudi is

#13 anti-machinee closed 2 years ago
3
Select index of gaudi automatically and can not use the available ones

#12 anti-machinee closed 2 years ago
5
ArcFace layer works fine on CUDA but worse on Habana

#11 anti-machinee closed 2 years ago
2
Error in loss.backward()

#10 anti-machinee closed 2 years ago
4
test pr flow

#9 iluxaimeerovich opened 3 years ago
32
ResNet50 Keras: Export to saved_model as stated by the documentation

#8 levzlotnik opened 3 years ago
0
No runtime habana

#7 Addyvan closed 3 years ago
1