simt Search Results - Githubissues

495 results
for simt

Best match

NVIDIA/cutlass #1672

[FEA] Passing the copy type as a parameter to `copy`

Kind of a small change. So I was looking at https://github.com/NVIDIA/cutlass/issues/1231 and I was wondering if it made sense to refactor the code so that it will accept the type of copy they w…

ZelboK updated 2 months ago
5
ZaidQureshi/bam #38

build error on ubuntu 24.04 after following the build guide

**Describe the bug** A clear and concise description of what the bug is. it is build error **To Reproduce** Steps to reproduce the behavior: following the guide to make libnvm on ubuntu24.04 **E…

gaowayne updated 3 weeks ago
5
lcai2/LightTEA #1

Why are the results I get when running this inconsistent with the paper?

![Snipaste_2023-12-23_13-30-41](https://github.com/lcai2/LightTEA/assets/76097676/c577f04d-3b99-4731-b5b5-9ca88e3d02db) for i in range(ranks.shape[0]): if sims[i,ranks[i,0]] > 0.8: …

w-oo updated 10 months ago
2
yukarinoki/reseach #16

AStitch: enabling a new multi-dimensional optimization space…

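Similar to #15: it describes techniques for a machine-learning optimizing JIT compiler, and is perhaps more advanced than #15. There is a dilemma between performing costly fusion and skipping fusion and emitting a large number of kernels (because of the just-in-time constraint). They built an optimizing compiler called AStitch, which is apparently used from TensorFlow. Stitch systematically abstracts four operator-stitching schemes, …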

yukarinoki updated 1 year ago
2
facebookresearch/xformers #604

Unable to Build from latest

# 🐛 Bug ## Command ```sh cd xformers git pull git submobule update --recursive --remote pip install -e . ``` ## To Reproduce Steps to reproduce the behavior: 1. pull latest…

JonnoFTW updated 1 year ago
3
FindDefinition/cumm #3

ImportError: cannot import name 'TensorOpParams' from 'cumm.…

Hi, I built **cumm** from source on Nvidia Jetson nano board, when I import **cumm** inside python, no errors appear. But when import **TensorOpParams** as following: `from cumm.gemm.algospec.core im…

SmBito updated 2 years ago
2
taichi-dev/taichi #4635

[lang] Use Allocas for TLS instead of a memory pointer

Thread local memory pointer is really only a thing on CUDA iirc. Hardware wise all thread local memory are in the registers, as the first level of cache will be the shared memory. From the code it see…

bobcao3 updated 2 years ago
1
sandzja/TreninuGramata #26

In the training plan display, use straight-shaped interval b…

This is marked as done in Basecamp. I don't see the changes on the page.

sandzja updated 12 years ago
2
jimpelton/clpp #6

What's the difference between "default scan" and "GPU scan"?

``` I am new to clpp. Just want to know what is the difference between these two scans? [I did not find a better place to put this question] ``` Original issue reported on code.google.com by `rong…

GoogleCodeExporter updated 8 years ago
2
NVIDIA/cutlass #1840

[QST] wgmma.mma_async instructions are serialized due to ins…

When I build the latest cutlass library for 90a, I see a lot of warnings like: ``` ptxas info : (C7511) Potential Performance Loss: wgmma.mma_async instructions are serialized due to insufficient r…

vmarkovtsev updated 4 weeks ago
4
