-
# 🚀 Feature
We need an `AttentionBias` similar to `BlockDiagonalCausalMask`, but with optional padding.
## Motivation
When training an LLM, the training data may be packed. It may look like
…
-
### 🐛 Describe the bug
In my application, I need to take the nth-order mixed derivative of a function. However, I found that the computation time of `torch.autograd.grad` increases exponentially as n incr…
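
For readers unfamiliar with the setup, a mixed derivative can be taken by calling `torch.autograd.grad` repeatedly with `create_graph=True`. The sketch below is a minimal illustration, not the reporter's code: the helper name `mixed_derivative` and the test function are assumptions made for this example. Each pass differentiates through the graph recorded by the previous pass, so the work grows with the order.

```python
import torch


def mixed_derivative(f, inputs):
    """Differentiate f once with respect to each input, in order.

    Hypothetical helper for illustration; assumes f returns a scalar tensor.
    """
    # Fresh leaf tensors so autograd tracks every input.
    inputs = [t.detach().clone().requires_grad_(True) for t in inputs]
    out = f(*inputs)
    for var in inputs:
        # create_graph=True keeps the result differentiable for the next pass.
        (out,) = torch.autograd.grad(out, var, create_graph=True)
    return out


# f(x, y, z) = x^2 * y^3 * z, so d^3 f / (dx dy dz) = 6 * x * y^2.
point = [torch.tensor(1.0), torch.tensor(2.0), torch.tensor(3.0)]
result = mixed_derivative(lambda x, y, z: x**2 * y**3 * z, point)
print(result.item())  # 24.0
```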
-
## Resources
- The Incredible PyTorch: a curated list of tutorials, papers, projects, communities and more relating to PyTorch. [[github]](https://github.com/ritchieng/the-incredible-pytorch)
-
Berlin conference hackathon follow-up
-
Hi,
I am running into a problem involving a table with columns and rows. The number of columns is fixed, but the number of rows is dynamic. I need to find the value of a certain cell if it is…
-
# Background
The COVID Tracking Project was founded in the early days of the COVID pandemic's arrival in the US, and has provided an API from day one. This API receives millions of requests per day, and…
-
TF provides the `TensorArray` to make automatic iteration and stacking efficient in `scan` or `while_loop`.
The naive variant with gathering and concatenating or dynamic updates would be inefficien…
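
As a minimal sketch of the pattern described above (the function `cumulative_sums` and its inputs are assumptions made for this example, not code from the issue), a `tf.TensorArray` can collect one value per iteration of `tf.while_loop` and stack them once at the end:

```python
import tensorflow as tf


def cumulative_sums(xs):
    """Running sum over the first axis, collected in a TensorArray."""
    n = tf.shape(xs)[0]
    ta = tf.TensorArray(dtype=xs.dtype, size=n, element_shape=xs.shape[1:])

    def cond(i, acc, ta):
        return i < n

    def body(i, acc, ta):
        acc = acc + xs[i]
        # write() returns the updated TensorArray, which must be carried on.
        return i + 1, acc, ta.write(i, acc)

    init = (tf.constant(0), tf.zeros_like(xs[0]), ta)
    _, _, ta = tf.while_loop(cond, body, init)
    # A single stack at the end replaces per-step concatenation.
    return ta.stack()


result = cumulative_sums(tf.constant([1.0, 2.0, 3.0]))
print(result.numpy().tolist())  # [1.0, 3.0, 6.0]
```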
-
### 🐛 Describe the bug
The keys of a `ModuleDict` cannot have the same name as existing `ModuleDict` class attributes:
```python
import torch
torch.nn.ModuleDict({'type': torch.nn.Module()})…
```
-
I see that APIs like this have been proposed before, but all I want is a method to _try_ to unsplit a `Bytes`, as I think that should be possible in many cases, and if it fails the user can choose to perform…
-
### Description
Hi,
I am running a scan over several thousand steps with a 2D array as the carry, where each step consists of (among other operations):
- inserting elements into a row at a given …