aidos-lab / pytorch-topological

A topological machine learning framework based on PyTorch
https://pytorch-topological.readthedocs.io/
BSD 3-Clause "New" or "Revised" License
157 stars 21 forks source link

CubicalComplex of 1d feature vs 2d image #10

Closed dgm2 closed 2 years ago

dgm2 commented 2 years ago

Thanks for the useful framework. I have two questions about processing images.

Is it advised to pass an image through a linear layer (convolve) and then get the persistence, or is it better to to in the reverse order?

if I have a [28,28] tensor image X, I can do directly

cc = CubicalComplex()
pers = cc(X)

if I have a representation Z, e.g. a tensor of dimension 128 outputted from a resnet is this a correct way to proceed, or would you advise a better solution ?

cc = CubicalComplex()
pers = cc(Z.view(1, -1))

Many thanks

Pseudomanifold commented 2 years ago

Thanks for reaching out! The situation you are describing leads indeed to slightly different semantics: in the first case (X being passed directly to the cubical complex calculation) you get topological features belonging directly to the image. In the second case (X being a representation obtained from another neural net), you would 'coerce' the tensor to be treated as an image. Without knowing the exact semantics of your use case, I'd say that in the second case, you probably want to try out a VietorisRipsComplex rather than a CubicalComplex. Hope that helps; feel free to reopen for further discussions!

(PS: You can also pass batched tensors of images to CubicalComplex; this is unrelated to your question, but it might speed up things in practice)

dgm2 commented 2 years ago

Thanks for the reply,

I am trying to model the following scenario:

I would like to

however for VR I am getting empty arrays when processing a single image. see the example. Any ideas if there is a way to fix it, or if this does not make sense? Many thanks!

import numpy as np
from torchvision.datasets import MNIST

from torch_topological.nn import CubicalComplex, VietorisRipsComplex
from torch_topological.nn import WassersteinDistance

import torch
from torchvision.transforms import Compose, ToTensor, Normalize

import torch.nn as nn
import torch.nn.functional as F

from torch.utils.data import Dataset, DataLoader, RandomSampler
torch.random.manual_seed(0)

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(1, 32, 3, 1)
        self.conv2 = nn.Conv2d(32, 64, 3, 1)
        self.dropout1 = nn.Dropout(0.25)
        self.dropout2 = nn.Dropout(0.5)
        self.fc1 = nn.Linear(9216, 128)
        self.fc2 = nn.Linear(128, 128)

    def forward(self, x):
        x = self.conv1(x)
        x = F.relu(x)
        x = self.conv2(x)
        x = F.relu(x)
        x = F.max_pool2d(x, 2)
        x = self.dropout1(x)
        x = torch.flatten(x, 1)
        x = self.fc1(x)
        x = F.relu(x)
        x = self.dropout2(x)
        x = self.fc2(x)
        output = F.log_softmax(x, dim=1)
        return output

if __name__ == '__main__':
    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
    print('Device', device)

    np.random.seed(23)

    data_set = MNIST(
        '../../data/',
        download=True,
        train=False,
        transform=Compose(
            [
                ToTensor(),
                Normalize([0.5], [0.5])
            ]
        ),
    )
    batch_size = 1
    loader = DataLoader(
        data_set,
        batch_size=batch_size,
        shuffle=False,
        drop_last=True,
        sampler=RandomSampler(
            data_set,
            replacement=True,
            num_samples=100
        )
    )
    cc = CubicalComplex()
    vr = VietorisRipsComplex(dim=0)
    net = Net()
    x1 = loader.__iter__().next()[0]
    x2 = loader.__iter__().next()[0]
    z1 = net(x1)
    z2 = net(x2)
    pers_vr1 = vr(z1)
    pers_vr2 = vr(z2)
    pers_cc1 = cc(z1)
    pers_cc2 = cc(z2)

    ls = WassersteinDistance(q=2)
    loss1 = ls(pers_vr1[0], pers_vr2[0])  # empty arrays - distance is zero
    loss2 = ls(pers_cc1[0], pers_cc2[0])
    print("vr", loss1)
    print("cc", loss2)
Pseudomanifold commented 2 years ago

I think I understand some of the issues here: with a batch size of 1, the Vietoris–Rips complex is not a good choice because it calculates the topology of a single point (which is very uninformative). I updated the script to use a higher batch size and now I get non-zero values for both representations:

import numpy as np
from torchvision.datasets import MNIST

from torch_topological.nn import CubicalComplex, VietorisRipsComplex
from torch_topological.nn import WassersteinDistance

import torch
from torchvision.transforms import Compose, ToTensor, Normalize

import torch.nn as nn
import torch.nn.functional as F

from torch.utils.data import Dataset, DataLoader, RandomSampler
torch.random.manual_seed(0)

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(1, 32, 3, 1)
        self.conv2 = nn.Conv2d(32, 64, 3, 1)
        self.dropout1 = nn.Dropout(0.25)
        self.dropout2 = nn.Dropout(0.5)
        self.fc1 = nn.Linear(9216, 128)
        self.fc2 = nn.Linear(128, 128)

    def forward(self, x):
        x = self.conv1(x)
        x = F.relu(x)
        x = self.conv2(x)
        x = F.relu(x)
        x = F.max_pool2d(x, 2)
        x = self.dropout1(x)
        x = torch.flatten(x, 1)
        x = self.fc1(x)
        x = F.relu(x)
        x = self.dropout2(x)
        x = self.fc2(x)
        output = F.log_softmax(x, dim=1)
        return output

if __name__ == '__main__':
    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
    print('Device', device)

    np.random.seed(23)

    data_set = MNIST(
        '../../data/',
        download=True,
        train=False,
        transform=Compose(
            [
                ToTensor(),
                Normalize([0.5], [0.5])
            ]
        ),
    )
    batch_size = 1
    loader = DataLoader(
        data_set,
        batch_size=batch_size,
        shuffle=False,
        drop_last=True,
        sampler=RandomSampler(
            data_set,
            replacement=True,
            num_samples=100
        )
    )
    cc = CubicalComplex()
    vr = VietorisRipsComplex(dim=0)
    net = Net()
    x1 = loader.__iter__().next()[0]
    x2 = loader.__iter__().next()[0]
    z1 = net(x1)
    z2 = net(x2)
    pers_vr1 = vr(z1)
    pers_vr2 = vr(z2)
    pers_cc1 = cc(z1)
    pers_cc2 = cc(z2)

    ls = WassersteinDistance(q=2)
    loss1 = ls(pers_vr1[0], pers_vr2[0])  # empty arrays - distance is zero
    loss2 = ls(pers_cc1[0], pers_cc2[0])
    print("vr", loss1)
    print("cc", loss2)

From what you describe, I'd definitely use a Vietoris–Rips complex here. However, there is a problem with what you want to do: this representation only gives you a single set of persistence diagrams, so in some sense, there is only one Wasserstein distance per batch. From what you write, I don't fully get whether you want to work on individual images or on high-dimensional representations. If you want to work on individual images, the cubical complex is a good choice: here, you can have batches full of persistence diagrams.

The main issue appears to be a somewhat confusing notation: for VR, we are only talking about 'batches' if you have at least a 3D tensor. A 2D tensor will just be considered one point cloud; in your previous example, for instance, you provided a point cloud of shape (1, 128), i.e. a single point of that dimensionality.

Hope that makes sense so far!

dgm2 commented 2 years ago

the batch setup would be as follows. n = batch_size X = img_1 ... img_n dim = n x 28 x 28
Z = Net(X), z_1 .. z_n dim= n x 128 P = VR(Z) or CC(Z), p_1 ... p_n ( n persistence diagrams - one for each image)

then, I form some pairs of single image diagrams (e.g. (p1, p2), (p3,p4), and I want to compute Wasserstein distance for each pair. The corresponding operation is having 2 vectors and taking their difference (distance), this is still a vector of the same size.

However, there is a problem with what you want to do: this representation only gives you a single set of persistence diagrams, so in some sense, there is only one Wasserstein distance per batch

I guess this answers (negatively) my related question: given 2 sets of n persistence diagrams, can I compute n Wasserstein distances (e.g. one per pair), instead of returning a single item, as currently? this is why I am processing images one by one (e.g. batch size = 1 ).

Pseudomanifold commented 2 years ago

Yes, that does not work as anticipated, but it's a semantic problem—persistent homology typically answers questions on the level of a data set/point cloud. If you want one diagram for each image, but your output is only one vector for each image, you cannot create a non-empty persistence diagram for that. If you really want to do that, you could calculate a cubical complex for your reshaped output (as you originally mentioned), but I am not sure what this will get you, i.e. I don't understand what such features would mean.

I'll look into your other question—I agree that it should be possible to run distance computations on the level of batches (and I thought I had implemented it that way already). For instance, in another project, I used the following construction to obtain multiple distances for each batch:

distances = torch.stack([
        distance(pred_batch, true_batch)
        for pred_batch, true_batch in zip(pers_info_pred, pers_info_true)
    ])

Hope that helps!