NVIDIA / apex
A PyTorch Extension: Tools for easy mixed precision and distributed training in Pytorch
BSD 3-Clause "New" or "Revised" License · 8.16k stars · 1.35k forks
Issues · sorted by: Newest
#1766 · Increase tolerance to workaround unit test failures on A100 · nWEIdia · closed · 5 months ago · 0 comments
#1765 · 64-bit indexing Adam · eqy · closed · 6 months ago · 0 comments
#1764 · apex installation failures · momo1986 · opened · 6 months ago · 1 comment
#1763 · Installation instructions don't build/install the C modules · zxti · opened · 6 months ago · 2 comments
#1762 · Apex installation fails · yang606 · opened · 6 months ago · 1 comment
#1761 · Cannot install apex on the machine of CUDA 12.2 · momo1986 · opened · 6 months ago · 6 comments
#1760 · Make fused normalization functions backward-compatible · timmoon10 · closed · 6 months ago · 2 comments
#1759 · [contrib] Improve FusedAdamSWA interface and add unit tests · lirundong · closed · 6 months ago · 1 comment
#1758 · add async copy for openfold swa triton kernel · azazhu · closed · 6 months ago · 0 comments
#1757 · No module named 'amp_C' error for py3.9 pytorch2.1.0 cuda12.1 · rocke2020 · closed · 7 months ago · 1 comment
#1756 · Fused RoPE for `thd` format · yaox12 · closed · 5 months ago · 1 comment
#1755 · ModuleNotFoundError: No module named 'fused_layer_norm_cuda', ubuntu 22.04, Successfully installed apex-0.1 · dhamaraiselvi · opened · 7 months ago · 3 comments
#1754 · Use recommended PyTorch methods to silence warnings · deepakn94 · closed · 7 months ago · 0 comments
#1753 · why a kernel like CUDAFunctor_add appears when testing MixedFusedRMSNorm? · HangJie720 · opened · 7 months ago · 0 comments
#1752 · [FusedRoPE] Fuse type conversion and cos/sin · yaox12 · closed · 7 months ago · 1 comment
#1751 · Avoid `.contiguous()` in fused RoPE · yaox12 · closed · 7 months ago · 0 comments
#1750 · [Bug] Fix a bug in fused rope · yaox12 · closed · 7 months ago · 0 comments
#1749 · Distributed optimizer support for contiguous param buffer with FP8 params · timmoon10 · closed · 7 months ago · 1 comment
#1748 · Whether to support Cuda 12.1 · yangzhipeng1108 · opened · 7 months ago · 4 comments
#1747 · Misc Changes · nWEIdia · closed · 7 months ago · 1 comment
#1746 · A fused `apply_rotary_pos_emb` implementation for Megatron-Core · yaox12 · closed · 7 months ago · 0 comments
#1745 · More Precision Combinations For GroupNorm · alpha0422 · closed · 8 months ago · 0 comments
#1744 · GPU memory leak with Flair and APEX · astropic · opened · 8 months ago · 0 comments
#1743 · Fix `rtol` in `assert_close` cleanup · eqy · closed · 8 months ago · 0 comments
#1742 · Cleanup usage of `self.assertTrue(torch.allclose(...` · eqy · closed · 8 months ago · 0 comments
#1741 · ninja: error: '/app/csrc/amp_C_frontend.cpp', needed by '/app/build/temp.linux-x86_64-cpython-310/csrc/amp_C_frontend.o', missing and no known rule to make it · Tolga-Karahan · closed · 8 months ago · 0 comments
#1740 · Loop through all available engines for cuDNN heuristics search · minitu · closed · 8 months ago · 1 comment
#1739 · add test for openfold triton mha kernel · azazhu · closed · 8 months ago · 0 comments
#1738 · error: command '/usr/local/cuda-11.3/bin/nvcc' failed with exit code 1 · Brion112233 · opened · 9 months ago · 1 comment
#1737 · When doing pip install -v --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./, shows ModuleNotFoundError: No module named 'packaging' · lainmn · opened · 9 months ago · 8 comments
#1736 · fused_layer_norm_cuda.rms_forward_affine gives runtime error when run on cuda:1 · Kushdesh · opened · 9 months ago · 0 comments
#1735 · Installation fails (due to recent change?) · hector-gr · opened · 9 months ago · 12 comments
#1734 · Add openfold triton code · ar-nowaczynski · closed · 9 months ago · 0 comments
#1733 · Add hysteresis support for AMP gradient scale update · minitu · closed · 9 months ago · 1 comment
#1732 · Rui/dev fast ln · RuiWang1998 · closed · 9 months ago · 0 comments
#1731 · Use master weights for bfloat16 FusedAdam when master_weights=True · cbcase · opened · 9 months ago · 2 comments
#1730 · is it possible to update `conda-forge/nvidia-apex` to a recent tag? · stas00 · closed · 9 months ago · 2 comments
#1729 · torch1.13.1 cuda11.6 python3.8 TypeError: unsupported operand type(s) for +: 'NoneType' and 'str' · yuhuai4554 · opened · 9 months ago · 2 comments
#1728 · FusedAdam doesn't allocate master weights for bfloat16 · cbcase · opened · 9 months ago · 2 comments
#1727 · Add multi_tensor_unscale_l2norm_cuda · minitu · closed · 9 months ago · 1 comment
#1726 · Rui/dev fast ln · RuiWang1998 · closed · 9 months ago · 0 comments
#1725 · Option to only build `amp_C` module · ezhang887 · opened · 10 months ago · 0 comments
#1724 · torch2.0.1 No module named 'torch._six · darrenwang00 · opened · 10 months ago · 11 comments
#1723 · Distributed optimizer infrastructure for FP8 parameters · timmoon10 · closed · 9 months ago · 0 comments
#1722 · Apex is not correctly built for pytorch 2.1.0 · acphile · opened · 10 months ago · 2 comments
#1721 · Distributed optimizer support for multiple dtypes · timmoon10 · closed · 10 months ago · 0 comments
#1720 · [contrib.xentropy] bfloat16 support · crcrpar · closed · 10 months ago · 0 comments
#1719 · Return distributed optimizer checkpoint on all ranks · timmoon10 · closed · 10 months ago · 0 comments
#1718 · Adjusting test for ONNX opset 18 (now default) · borisfom · closed · 10 months ago · 0 comments
#1717 · ModuleNotFoundError: No module named 'fast_multihead_attn' · ICENacl · opened · 10 months ago · 4 comments