-
I trained with SwiGLU, TP=4, PP=2, then used deepspeed_to_deepspeed.py to convert the checkpoint to a TP=1, PP=1 one. When evaluating the converted checkpoint, I found that the accuracy is inconsi…
-
### 🐛 Describe the bug
When I load a model with AutoLigerKernelForCausalLM, I get ValueError: Pointer argument (at 0) cannot be accessed from Triton (cpu tensor?)
when load model Apply Model…
-
- [ ] Different types of activation functions
-
Getting this error after installing on Windows 11:
```
| WARNING | xformers | WARNING[XFORMERS]: xFormers can't load C++/CUDA extensions. xFormers was built for:
PyTorch 2.1.0+cu121 with CUDA…
```
-
When run via Jupyter:
WARNING[XFORMERS]: xFormers can't load C++/CUDA extensions. xFormers was built for:
PyTorch 2.0.1+cu118 with CUDA 1108 (you have 2.0.1+cpu)
Python 3.10.11 (you hav…
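
The warning above compares the PyTorch build xFormers was compiled for (`2.0.1+cu118`) with the one installed (`2.0.1+cpu`). As a minimal sketch of that comparison, the hypothetical helpers below (`parse_torch_build` and `builds_match` are illustrative names, not part of xFormers or PyTorch) split a version string on its `+` local tag and flag CPU-only builds. It assumes the `+cpu` / `+cuNNN` tag convention seen in the log; wheels without a local tag are reported as unknown rather than guessed.

```python
import re


def parse_torch_build(version: str) -> dict:
    """Split a PyTorch-style version string such as '2.0.1+cu118'.

    Hypothetical helper for illustration only.
    """
    # Everything after '+' is the local build tag (e.g. 'cpu', 'cu118').
    base, _, local = version.partition("+")
    cuda_match = re.fullmatch(r"cu(\d+)", local)
    return {
        "base": base,                      # e.g. '2.0.1'
        "local": local or None,            # None when there is no '+tag'
        "cuda": cuda_match.group(1) if cuda_match else None,
        "cpu_only": local == "cpu",        # True only for an explicit '+cpu' tag
    }


def builds_match(built_for: str, installed: str) -> bool:
    """Return True when both version strings describe the same build."""
    return parse_torch_build(built_for) == parse_torch_build(installed)


if __name__ == "__main__":
    # Values taken from the warning text above.
    print(parse_torch_build("2.0.1+cpu"))
    print(builds_match("2.0.1+cu118", "2.0.1+cpu"))
```

In a live environment the installed string would come from `torch.__version__`; an explicit `+cpu` tag there indicates a CPU-only wheel, which cannot load CUDA extensions.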
-
I can launch it, but the generation always stops early. This might be the cause:
```
Launching Web UI with arguments: --enable-insecure-extension-access --share --disable-safe-unpickle --theme dar…
```
-
# ❓ Questions and Help
I keep trying to install this thing to get rid of the errors when I run Automatic1111, but this stupid thing keeps installing an outdated torch and uninstalling my up-to-date tor…
-
https://github.com/linkedin/Liger-Kernel/blob/58fd2bc85073fdb010164426c9b159cd8a0e9542/src/liger_kernel/ops/swiglu.py#L59-L60
Let's take a custom autograd function:
```python
class Exponential(tor…
```
-
# 🐛 Bug
## To Reproduce
Install torch (directly from https://pytorch.org/get-started/locally/)
pip3 install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/…
-
https://note.com/retrieva/n/n715bea2c2cd1