-
**Description**
```
triton_python_backend_stub /tensorrt/triton-repos/trtibf-Trendyol-LLM-7b-chat-v1.0/preprocessing/1/model.py triton_python_bac…
```
-
Currently, `apex_ex` has a special implementation for `cross_entropy` and `fused_rms_norm` (with a registered lookaside) from apex.
If we find that the `cross_entropy` from apex is faster than curre…
-
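## Paper link
https://arxiv.org/abs/1907.11692
## Publication date (yyyy/mm/dd)
2019/07/26
## Summary
The authors examine BERT pretraining from several angles through experiments, find that the original BERT was undertrained, and show that with optimized training it is on par with models proposed after BERT, such as XLNet…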
-
# 🌟 New model addition
## Model description
This is a version of EleutherAI's GPT-J with 6 billion parameters that is modified so you can generate and fine-tune the model in Colab or equivalent …
-
### 🐛 Describe the bug
## Description
Before https://github.com/pytorch/pytorch/pull/123732, when running with AOTI, the SDPA pattern can be hit and `torch.ops.aten._scaled_dot_product_flash_atten…
-
# Plans for distilgpt2-medium and distilgpt2-large
## Motivation
While distilgpt2 is useful, I was wondering if there are any plans to create a distilgpt2-medium and distilgpt2-large. I'm also won…
-
This is a ticket to track a wishlist of items you wish LiteLLM had.
# **COMMENT BELOW 👇**
### With your request 🔥 - if we have any questions, we'll follow up in comments / via DMs
Respond …
-
### This issue is a centralized place to list and track work on adding support for new ops in the MPS backend.
[**PyTorch MPS Ops Project**](https://github.com/users/kulinseth/projects/1/vi…
-
While our [draft charter](https://www.w3.org/2023/03/proposed-webmachinelearning-charter.html) says that the group:
> priority on building blocks required by well-known model architectures such as re…