NVIDIA
/
Fuser
A Fusion Code Generator for NVIDIA GPUs (commonly known as "nvFuser")
Other
271
stars
53
forks
issues
LoadStoreOp can have scalar inputs, so type check is required
#3359
naoyam
closed
2 weeks ago
2
Fix typecast error in SelectOpRecord
#3358
rdspring1
closed
2 weeks ago
1
Set `fusion_id` and `device_id` of `KernelExecutor` in constructor for user scheduling.
#3357
rdspring1
closed
1 week ago
2
Fix tests with DEBUG_SERDE=disable
#3356
csarofeen
closed
2 weeks ago
1
Cleanup warp specialization kernel
#3355
zasdfgbnm
closed
2 weeks ago
0
Prototype warp specialization
#3354
zasdfgbnm
opened
2 weeks ago
0
Fix serialization errors in executor dispatch
#3353
rdspring1
closed
2 weeks ago
1
Move allocOutputSpace to multidevice/executor.cpp
#3352
wujingyue
closed
2 weeks ago
1
[DO NOT REVIEW] Executor dispatch with 3349
#3351
naoyam
closed
2 weeks ago
0
Modify the `compile` parameter in baseline benchmarks to `executor`
#3350
Priya2698
closed
2 weeks ago
2
Rename FusionExecutor to KernelExecutor, fe to ke, fec to executor_cache
#3349
naoyam
closed
2 weeks ago
6
Remove use of RECORD_FUNCTION.
#3348
wujingyue
closed
2 weeks ago
1
replaceSymbolicSizes needs to process the first appearing ID in the same way as the rest of the IDs
#3347
naoyam
closed
2 weeks ago
3
replaceSymbolicSizes
#3346
naoyam
closed
2 weeks ago
4
Ensure expression simplification is enabled in proveLinearAndGetStride
#3345
jacobhinkle
closed
2 weeks ago
2
Prefer simpler vals in replace sizes
#3344
naoyam
closed
2 weeks ago
3
Avoid replacing a Val with a dependent Val
#3343
naoyam
closed
2 weeks ago
1
Permute the inputs/outputs of runFusionWithInputs.
#3342
wujingyue
opened
2 weeks ago
0
Rename transformOutputFromAllocationToLogical
#3341
wujingyue
closed
2 weeks ago
1
Make prepareInputs private.
#3340
wujingyue
closed
2 weeks ago
1
Fix autotune_pointwise.py script
#3339
rdspring1
closed
2 weeks ago
1
Sequence Parallel Forward Transformer
#3338
cowanmeg
closed
2 days ago
4
Remove MatmulParams::rotate_ldmatrix_out_of_main_loop
#3337
jacobhinkle
closed
2 weeks ago
3
revise 2d inner reduction heuristics
#3336
liqiangxl
closed
6 days ago
0
Translate segments to python definition
#3335
rdspring1
closed
6 days ago
5
Build Segments for User Schedule Segmentation
#3334
rdspring1
closed
1 week ago
2
refactor 2d inner reduction heuristics
#3333
liqiangxl
closed
1 week ago
5
clean 2d inner reduction heuristics
#3332
liqiangxl
closed
2 weeks ago
0
clean 2dInnerReductionHeuristic
#3331
liqiangxl
closed
1 week ago
5
split inner reduction heuristics into 2d and 3d heuristics
#3330
liqiangxl
closed
1 week ago
3
use static bdimx & bdimy in inner reduction
#3329
liqiangxl
closed
6 days ago
0
Print all outputs in `SdpaFwdOp`
#3328
Priya2698
closed
2 weeks ago
2
Use IterDomain::split and IterDomain::merge
#3327
naoyam
closed
2 weeks ago
6
Clean uses of unique_ptr.
#3326
wujingyue
closed
2 weeks ago
2
add knobs to control inner dim unroll and outer dim unroll in pointwise scheduler (redo pr-3275 to check code changes)
#3325
liqiangxl
closed
2 weeks ago
3
Add cuda arch guard to skip ampere matmul tests on Hopper GPUs
#3324
rdspring1
closed
2 weeks ago
1
Make it more explicit about the rfactor flag propagation
#3323
naoyam
closed
3 weeks ago
3
Remove MmaOpDetails::input_layout and getInputLayout
#3322
jacobhinkle
closed
3 weeks ago
1
Adding resize(PadOp) vectorization analysis
#3321
jjsjann123
closed
1 week ago
13
Load mma operands to shared memory with TMA
#3320
rdspring1
closed
1 week ago
7
Fix `is_clonable` in `test_issue_3292`
#3319
rdspring1
closed
3 weeks ago
1
Perf mm 20241030
#3318
zasdfgbnm
opened
3 weeks ago
0
[WIP] Use the AlmostExact map when traversing across multiple TV ops
#3317
naoyam
closed
3 weeks ago
1
Fix IterDomain::merge with expanded inner input
#3316
naoyam
closed
3 weeks ago
1
Update !build and !test triggers for CI pipelines; add a CI hello message for pull requests
#3315
xwang233
closed
3 weeks ago
1
Use a single elect sync ite for all transactions
#3314
zasdfgbnm
closed
3 weeks ago
1
Handle empty tensors during definition of cat
#3313
jacobhinkle
opened
3 weeks ago
3
Add script to check for non-determinism
#3312
jacobhinkle
closed
3 weeks ago
2
[DO NOT MERGE] Test codediff in CI
#3311
jacobhinkle
opened
3 weeks ago
5
Add TT, TN, NT, NN tests for HopperMultipleMatmulScheduler
#3310
rdspring1
closed
2 weeks ago
2