smem Search Results - Githubissues

state-spaces/mamba #601

Query regarding indexing of smem_delta_a

From what I understand, `smem_delta_a` is used to initialize the value of `delta_a` in shared memory in these lines: https://github.com/state-spaces/mamba/blob/bc84fb1172e6dea04a7dc402118ed19985349e9…

SudhanshuBokade updated 1 month ago

Dao-AILab/flash-attention #1318

Why is barrier_O Necessary, and What is it Waiting For?

Why do we need this barrier_O? What exactly is it waiting for? For each read of V/K, we have pipeline and producer_commit to control that flow. Also, the placement of barrier_O seems strange—it appear…

ziyuhuang123 updated 3 weeks ago

samuelcolvin/jinjahtml-vscode #158

C++ Formatting + Macro Parsing Error

I've got a file that I'm editing with the `jinja-cpp` formatter. I import a macro from another file with: ``` {%- from 'macros.jinja' import declare_smem_arrays with context %} ``` The mac…

asglover updated 1 month ago

NVIDIA/cutlass #1953

[QST] make_tiled_copy_B generates incompatible layouts

**What is your question?** Hello! I am writing an int8 GEMM layer using cute. I use `MMA_Atom` as my atom MMA, and define my tiled MMA as: ``` using TiledMma = TiledMMA< MMA_Atom_Arch, …

phantaurus updated 1 day ago

llvm/circt #7834

[FIRRTL] smem with read address from port does not work

Consider the following FIRRTL: ```firrtl FIRRTL version 4.0.0 circuit Top : public module Top : input clock : Clock input raddr : UInt input waddr : UInt input wdata : UInt…

jackkoenig updated 1 week ago

NVIDIA/cutlass #1882

[QST] Why is there bank conflict in this simple layout?

I modified the `tiled_copy.cu` example in cute/tutorial to use the following layout ``` auto tensor_shape = cute::Shape{}; auto block_shape = cute::Shape{}; ... Tensor tensor_S = make_tensor(m…

seanxwzhang updated 1 week ago

Dao-AILab/flash-attention #1266

Where in the code demonstrate inter-warp policy?

``` template CUTLASS_DEVICE void mma(Params const& mainloop_params, MainloopPipeline pipeline_k, MainloopPipeline pipeline_v, PipelineState& smem_pipe_read_k…

ziyuhuang123 updated 1 month ago

NVIDIA/cutlass #1866

[QST] Incorrect matrix multiplication result with CuTe libra…

**Description:** I encountered an issue when using the CuTe library for matrix multiplication. The output result does not match the expected values, and there are unexpected odd numbers like 27 and 3…

kimiwu0 updated 2 weeks ago

spcl/dace #1284

Broadcasts to Shared Memory on GPU Runs in Serial

**Describe the bug** When running code on a GPU, if you have a block of shared memory and you broadcast a variable to it, the generated CUDA assigns in serial. In my reproducer, the performance and b…

computablee updated 4 weeks ago

JaehyeokHan/Smartphone-Backup-Data-Extractor #3

.smem files?

Hi, I saw that Samsung phone messages from 2017 onward use the .smem format (surely a derivative of the previous ones). Is it possible to support this format as well? Best regards

7zxkv updated 1 year ago

1000+ results for smem
