-
## Description
Slower cleanup methods do not run to completion when restarting the kernel.
## Reproduce
1. Create a new IPython notebook
2. Create and execute a new cell wit…
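For reference, a cell of the kind step 2 presumably describes could register a deliberately slow cleanup hook; the `atexit`/`time.sleep` approach below is a hypothetical sketch, not the original reproducer:

```python
# Hypothetical reproduction cell: register a cleanup hook that takes several
# seconds and leaves a file behind, then restart the kernel and check whether
# the file was ever written.
import atexit
import time

def slow_cleanup():
    time.sleep(10)                            # simulate slow cleanup work
    with open("cleanup_done.txt", "w") as f:  # observable side effect
        f.write("cleanup finished\n")

atexit.register(slow_cleanup)
```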
-
With either torch.compile or Triton, the forward / backward operations accumulate too many activations, which are probably bottlenecking training.
For some reason, I got about a 30% speedup at 1B scale, but it does not seem …
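As a rough way to check whether activations really dominate, one could compare peak CUDA memory for a forward + backward pass in eager mode versus under torch.compile. The toy model below is purely illustrative and not the 1B-parameter setup:

```python
# Illustrative sketch: compare peak CUDA memory for a forward + backward pass
# in eager mode vs. under torch.compile (toy model, rough measurement only).
import torch

model = torch.nn.Sequential(
    torch.nn.Linear(4096, 4096), torch.nn.GELU(), torch.nn.Linear(4096, 4096)
).cuda()
x = torch.randn(8, 4096, device="cuda")

for label, m in [("eager", model), ("compiled", torch.compile(model))]:
    torch.cuda.reset_peak_memory_stats()
    m(x).sum().backward()
    torch.cuda.synchronize()
    print(f"{label}: {torch.cuda.max_memory_allocated() / 2**20:.1f} MiB peak")
```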
-
### 🚀 The feature, motivation and pitch
DTW is a crucial algorithm for measuring similarity between temporal sequences, but its computational complexity can be a bottleneck, particularly with large…
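For context on the quadratic cost, a textbook O(N·M) dynamic-programming DTW looks roughly like the plain NumPy sketch below; it is not a proposed implementation, just an illustration of why long sequences become expensive:

```python
# Naive O(N*M) DTW between two 1-D sequences, illustrating the quadratic
# time and memory cost that motivates a faster implementation.
import numpy as np

def dtw_distance(a: np.ndarray, b: np.ndarray) -> float:
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])                 # local distance
            cost[i, j] = d + min(cost[i - 1, j],         # insertion
                                 cost[i, j - 1],         # deletion
                                 cost[i - 1, j - 1])     # match
    return float(cost[n, m])

print(dtw_distance(np.array([1.0, 2.0, 3.0]), np.array([1.0, 3.0, 3.0, 4.0])))
```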
-
An OCR recognition model trained with ch_PP-OCRv4_rec_svtr_large.yml trains normally,
and evaluating it with python tools/eval.py -c configs/rec/PP-OCRv4/ch_PP-OCRv4_rec_svtr_large.yml also works correctly,
but using python tools/export_model.py -c configs/rec/PP-OCRv4/ch_PP-OCRv4…
-
Actually, the SYCL specification does not allow removing dead/unreachable kernels, at least because of free functions such as `get_kernel_bundle` (which could be called indirectly from another translat…
-
**Describe the bug**
This was discovered when porting shoc_energy_integrals to small kernels. I was getting large differences in the outputs of the view_reductions when num_threads>1. I suspect the p…
-
### 🐛 Describe the bug
After #134373, I started getting the error "RuntimeError: CUDA error: operation not supported" when trying to run PyTorch.
A fresh build from source succeeds before #134373 and f…
-
It would be great to allow the user to supply all needed temporary buffers for 2D and 3D transforms. Currently, the internal 1D transforms allocate their own temporary buffers. This is a problem when baki…
-
## 🐛 Bug
This is a lengthy issue/post detailing my observations with our distributed and bucketing performance. Some of these are actionable items and some are just observations to be aware of.
…
-
Pandas may sometimes choose to convert large integers (close to the 64-bit integer limit) to floats. This is very common with the register dataset on 64-bit architectures and is fixed by #41. A similar situation…
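A small illustration of the kind of silent upcast involved (hypothetical values, assuming pandas' default NumPy-backed dtypes): once a missing value forces an int64 column to float64, integers near the 64-bit limit can no longer be represented exactly.

```python
# Hypothetical illustration: a missing value forces the column to float64,
# and a large integer no longer round-trips exactly.
import pandas as pd

big = 2**63 - 1                  # close to the signed 64-bit integer limit
s = pd.Series([big, None])       # pandas upcasts to float64 to hold the NaN
print(s.dtype)                   # float64
print(int(s.iloc[0]), big)       # the two values differ after the float round-trip
```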