pytorch / pytorch

Tensors and Dynamic neural networks in Python with strong GPU acceleration
https://pytorch.org

named flatten segfaults when dims is () #61137

Closed Gurkenglas closed 3 years ago

Gurkenglas commented 3 years ago
gurkenglas@Gurkenglas-PC ~/mutual-information (main) [SIGSEGV]> python3.9
Python 3.9.5 (default, May 19 2021, 11:32:47) 
[GCC 9.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> torch.empty(2,3).flatten((),"asd")
fish: 'python3.9' terminated by signal SIGSEGV (Address boundary error)

Environment

gurkenglas@Gurkenglas-PC ~/mutual-information (main) [2]> python3.9 collect_env.py
Collecting environment information...
PyTorch version: 1.10.0.dev20210630+cpu
Is debug build: False
CUDA used to build PyTorch: None
ROCM used to build PyTorch: N/A

OS: Ubuntu 20.04.2 LTS (x86_64)
GCC version: (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
Clang version: Could not collect
CMake version: Could not collect
Libc version: glibc-2.31

Python version: 3.9.5 (default, May 19 2021, 11:32:47) [GCC 9.3.0] (64-bit runtime)
Python platform: Linux-5.4.72-microsoft-standard-WSL2-x86_64-with-glibc2.31
Is CUDA available: False
CUDA runtime version: No CUDA
GPU models and configuration: No CUDA
Nvidia driver version: No CUDA
cuDNN version: No CUDA
HIP runtime version: N/A
MIOpen runtime version: N/A

Versions of relevant libraries:
[pip3] numpy==1.21.0
[pip3] torch==1.10.0.dev20210630+cpu
[pip3] torch-utils==0.0.1
[pip3] torchaudio==0.10.0.dev20210630
[pip3] torchvision==0.11.0.dev20210630+cpu
[conda] Could not collect

Imagine the API working like this:

from lenses import bind
import torch

def nflatten(self, **kwargs):
    # Each keyword maps a new dim name to the old dims to merge; Recur(str)
    # collects the old names from an arbitrarily nested structure.
    for name, olds in kwargs.items():
        olds = tuple(bind(olds).Recur(str).collect())
        if olds:
            # Move the old dims to the end so they are consecutive, then flatten.
            self = self.align_to(..., *olds).flatten(olds, name)
        else:
            # Empty tuple: append a new size-1 dimension with the given name.
            self = self.rename(None).unsqueeze(-1).rename(*self.names, name)
    return self

def nunflatten(self, **kwargs):
    # Each keyword maps an existing dim name to the sizes to split it into.
    for name, news in kwargs.items():
        news = tuple(bind(news).Each().collect())
        if news:
            self = self.unflatten(name, news)
        else:
            # Empty tuple: drop the size-1 dimension instead.
            self = self.squeeze(name)
    return self

torch.Tensor.nflatten = nflatten
torch.Tensor.nunflatten = nunflatten
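
To illustrate, here is how the proposed methods might behave (a sketch; the tensor, its names, and the sizes are made up for the example):

t = torch.empty(2, 3, 4, names=("a", "b", "c"))

# Merge the non-consecutive dims "a" and "c"; align_to moves them
# together before flattening.
t.nflatten(ac=("a", "c")).names    # ('b', 'ac')

# An empty tuple appends a new size-1 named dim instead of segfaulting.
t.nflatten(d=()).names             # ('a', 'b', 'c', 'd')

# nunflatten inverts both cases, taking (name, size) pairs as the named
# overload of unflatten expects.
t.nflatten(ac=("a", "c")).nunflatten(ac=(("a", 2), ("c", 4))).names    # ('b', 'a', 'c')
t.nflatten(d=()).nunflatten(d=()).names                                # ('a', 'b', 'c')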

This also fixes #61117. And why would flattened dimensions need to be consecutive in the first place?

There should be a higher-order function that easily extends named-dimension coverage to any given function, and perhaps an automatically generated module that applies it to all the existing operators; a sketch of such a wrapper follows.
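
A minimal sketch of what that higher-order function might look like (the with_names helper is hypothetical, and it assumes the wrapped function preserves the dimension layout of its first argument, which is not true of every operator):

import functools
import torch

def with_names(fn):
    # Hypothetical wrapper: strip names before calling an operator that
    # lacks named-tensor support, then reattach them to the result.
    @functools.wraps(fn)
    def wrapper(tensor, *args, **kwargs):
        names = tensor.names
        out = fn(tensor.rename(None), *args, **kwargs)
        return out.rename(*names)
    return wrapper

# Hypothetical usage: wrap an operator that rejects named tensors so
# that names pass through unchanged.
named_fft = with_names(torch.fft.fft)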

cc @ezyang @gchanan @zou3519 @bdhirsh @jbschlosser @anjali411

soulitzer commented 3 years ago

@ngimel we should take a more systematic approach to this

mattip commented 3 years ago

The segfault is coming from this line, where positions.size() == 0 but the code accesses positions[1].
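
Until the kernel gets a bounds check, a stop-gap guard could hypothetically be installed at the Python layer (a sketch via monkey-patching, not a PyTorch API):

import torch

_orig_flatten = torch.Tensor.flatten

def _guarded_flatten(self, *args, **kwargs):
    # Hypothetical stop-gap: reject an empty dims tuple before it
    # reaches the C++ kernel, where positions[1] is read without
    # checking that positions is non-empty.
    if args and isinstance(args[0], (tuple, list)) and not args[0]:
        raise RuntimeError("flatten: dims must name at least one dimension")
    return _orig_flatten(self, *args, **kwargs)

torch.Tensor.flatten = _guarded_flatten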

I think the solution to #61117 (in the case of renaming dimensions) is that the operators should return a view, or refuse to rename if a view is impossible. The table listing operator behaviour with named dimensions says "See documentation" under "Named inference rule", but the documentation does not mention named dimensions.

While the call for a systematic approach is appropriate, this use of dimnames_to_positions() seems to be an exceptional case.