smem Search Results - Githubissues

1000+ results
for smem


sablier-labs/docs #131

Enable Vercel Analytics

PaulRBerg updated 9 months ago
8
alpaka-group/alpaka #1648

Querying the size of the dynamic shared memory

At least for debugging purposes it would be useful to be able to query the size of the dynamic shared memory from within the device code. In CUDA this can be done with some inline PTX (see http…

fwyzard updated 2 years ago
2
NVIDIA/cccl #874

[RFE] Use cudaLaunchKernel instead of <<<>>>

The main reason for this request is to improve error handling. When using <<<>>>, CUB currently has to call [cudaPeekAtLastError](https://github.com/NVIDIA/cub/blob/866c576c118ae036fb5c2759ba1e5997967e817c/…

benbarsdell updated 1 year ago
4
Dao-AILab/flash-attention #1107

[QST] flash_attn2: why tOrVt is no swizzle ?

In the code at [this link](https://github.com/Dao-AILab/flash-attention/blob/main/csrc/flash_attn/src/flash_fwd_kernel.h#L180), the line reads: ``` Tensor tOrVt = thr_mma.partition_fragment_B(sVtNoS…

itsliupeng updated 2 weeks ago
3
Dao-AILab/flash-attention #1068

flash-attn3 supported L20?

L20 is modified from the H100 architecture and also has FP8 capability. Does flash-attn3 support it?

Xiaoyiyong555 updated 4 months ago
10
bwa-mem2/bwa-mem2 #167

Re-allocating SMEM data structures due to enc_qdb - Segmenta…

Hello, I try to map 10K sequences of size 10K bases each [(input)](https://drive.google.com/file/d/1m_uoL-0ICD2b8uDWjsIZhHoG9f3hFcBZ/view?usp=sharing) I execute like this: ./bwa-mem2 mem -t 1 pre…

ChristosMatzoros updated 1 year ago
8
oppiliappan/curie #1

Size

Is it font big enough? Maybe we should maintain a version that's about as big as GohuFont 14? Preview: https://0x0.st/sMEm.png

oppiliappan updated 6 years ago
9
facebookresearch/faiss #3207

GpuIndexIVFScalarQuantizer with quantizers that require shar…

# Summary GpuIndexIVFScalarQuantizer with scalar quantizers that require shared memory on the GPU don't seem to work for k >= 1024 in Faiss 1.7.4. See the small reproduction script at the bottom. …

gabuzi updated 5 months ago
3
NVIDIA/trt-samples-for-hackathon-cn #91

Cuda runtime error in trt-llm gemm

Environment If applicable, please include the following: CPU architecture: x86_64 GPU properties GPU name: NVIDIA A10 Clock frequencies used: None Libraries TensorRT branch: 9.0.0 TensorRT LLM: 0.1.3…

yuanjiechen updated 10 months ago
1
arthur-zhang/morning-up-up #7

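1104 sharing: VMA, PageFault, VSS, RSS, PSS, USS

- Each segment of a process's virtual address space is a VMA - Several possible causes of a page fault - How much memory a process uses and how to count it sensibly: Vss, Rss, Pss, Uss, and the smem tool - GDB basics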

arthur-zhang updated 4 years ago
6
