FenTechSolutions / CausalDiscoveryToolbox

Package for causal inference in graphs and in the pairwise setting. Tools for graph structure recovery and dependency analysis are included.
https://fentechsolutions.github.io/CausalDiscoveryToolbox/html/index.html
MIT License

[SAM] RuntimeError: The size of tensor a (10) must match the size of tensor b (314) at non-singleton dimension 1 #110

Open insookim43 opened 3 years ago

insookim43 commented 3 years ago
# data: pandas.DataFrame of shape (1000, 313); skeleton_g: a priori adjacency matrix
obj = SAM_handson(num_hidden_generator=200, num_hidden_discriminator=200, train_epochs=100, test_epochs=30, batchsize=10, dagloss=True, verbose=True, nruns=1)
output = obj.predict(data, graph=skeleton_g)
#output = obj.predict(toy_dummy)

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-30-ebf0bd99ce25> in <module>()
      1 obj = SAM_handson(num_hidden_generator=200, num_hidden_discriminator=200, train_epochs=100, test_epochs=30, batchsize=10, dagloss=True, verbose=True, nruns=1)
----> 2 output = obj.predict(data, graph=skeleton_g)
      3 #output = obj.predict(toy_dummy)

6 frames
<ipython-input-21-0098e2a71c5b> in predict(self, data, graph, return_list_results)
    106                                losstype=self.losstype,
    107                                hidden_layers=self.hidden_layers,
--> 108                                ) for i in range(self.nruns)]
    109 
    110         list_out = [i for i in results if not np.isnan(i).any()]

<ipython-input-21-0098e2a71c5b> in <listcomp>(.0)
    106                                losstype=self.losstype,
    107                                hidden_layers=self.hidden_layers,
--> 108                                ) for i in range(self.nruns)]
    109 
    110         list_out = [i for i in results if not np.isnan(i).any()]

<ipython-input-20-43884eca4be5> in run_SAM(input_data, skeleton, device, train, test, batch_size, lr_gen, lr_disc, lambda1, lambda2, num_hidden_generator, num_hidden_discriminator, tqdm, losstype, dagstart, dagloss, dagpenalization, dagpenalization_increase, hidden_layers, idx)
     77             generated_variables = sam(batch, noise,
     78                                       torch.cat([drawn_graph, noise_row], 0),
---> 79                                       drawn_neurons)
     80 
     81             disc_vars_d = discriminator(generated_variables.detach(), batch)

/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    539             result = self._slow_forward(*input, **kwargs)
    540         else:
--> 541             result = self.forward(*input, **kwargs)
    542         for hook in self._forward_hooks.values():
    543             hook_result = hook(self, input, result)

<ipython-input-18-fae8b7226719> in forward(self, data, noise, adj_matrix, drawn_neurons)
     38         """
     39 
---> 40         output = self.output_layer(self.layers(self.input_layer(data, noise, adj_matrix * self.skeleton)), drawn_neurons)
     41         return output.squeeze(2)
     42 

/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    539             result = self._slow_forward(*input, **kwargs)
    540         else:
--> 541             result = self.forward(*input, **kwargs)
    542         for hook in self._forward_hooks.values():
    543             hook_result = hook(self, input, result)

/usr/local/lib/python3.7/dist-packages/cdt/utils/torch.py in forward(self, input, adj_matrix, permutation_matrix)
    372 
    373         if adj_matrix is not None and permutation_matrix is not None:
--> 374             input_.append((input_[-1].transpose(0, 1) @ (adj_matrix.t().unsqueeze(2) * permutation_matrix)).transpose(0, 1))
    375         elif adj_matrix is not None:
    376             input_.append(input_[-1] * adj_matrix.t().unsqueeze(0))

RuntimeError: The size of tensor a (10) must match the size of tensor b (314) at non-singleton dimension 1
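
For context, this RuntimeError is PyTorch's broadcasting failure: two tensors disagree at a non-singleton dimension. A minimal standalone reproduction of the same class of error, with hypothetical shapes chosen only to mirror the sizes in the message (batch = 10, num_vars + 1 = 314):

import torch

# Hypothetical shapes mirroring the report: 10 = batch size, 314 = 313 variables
# plus one appended noise row. Any two tensors that disagree at a non-singleton
# dimension raise this error when combined elementwise.
a = torch.randn(313, 10, 1)    # size 10 at dimension 1
b = torch.randn(313, 314, 1)   # size 314 at dimension 1
a * b  # RuntimeError: The size of tensor a (10) must match the size of
       # tensor b (314) at non-singleton dimension 1
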
class SAM_handson(GraphModel):
    """SAM Algorithm.
    Args:
        lr (float): Learning rate of the generators
        dlr (float): Learning rate of the discriminator
        lambda1 (float): L0 penalization coefficient on the causal filters
        lambda2 (float): L0 penalization coefficient on the hidden units of the neural network
        num_hidden_generator (int): Number of hidden units in the generators' hidden layers(regularized with lambda2)
        num_hidden_discriminator (int): Number of hidden units in the discriminator's hidden layer
        train_epochs (int): Number of training epochs
        test_epochs (int): Number of test epochs (saving and averaging the causal filters)
        batchsize (int): Size of the batches to be fed to the SAM model. Defaults to full-batch.
        losstype (str): Type of the loss to be used (either 'fgan' (default), 'gan' or 'mse').
        hidden_layers (int): Number of hidden layers in the discriminator.
        dagloss (bool): Activate the DAG with No-TEARS constraint.
        dagstart (float): Controls when the DAG constraint is to be introduced
           in the training (float ranging from 0 to 1, 0 denotes the start of
           the training and 1 the end).
        dagpenalization (float): Initial value of the DAG constraint coefficient.
        dagpenalization_increase (float): Amount added to the DAG constraint
           coefficient at each epoch.
        nruns (int): Number of runs to be made for causal estimation.
               Recommended: >=8 for optimal performance.
        njobs (int): Number of jobs to be run in parallel.
               Recommended: 1 if no GPU is available, else 2 * number of GPUs.
        gpus (int): Number of available GPUs for the algorithm.
        verbose (bool): verbose mode
    """

    def __init__(self, lr=0.01, dlr=0.01, lambda1=0.01, lambda2=0.00001, num_hidden_generator=200, num_hidden_discriminator=200,
                 train_epochs=10000, test_epochs=1000, batchsize=-1,
                 losstype="fgan", dagstart=0.5, dagloss=True, dagpenalization=0,
                 dagpenalization_increase=0.001, hidden_layers=2,
                 njobs=None, gpus=None, verbose=None, nruns=1):

        """Init and parametrize the SAM model."""
        super(SAM_handson, self).__init__()
        self.lr = lr
        self.dlr = dlr
        self.lambda1 = lambda1
        self.lambda2 = lambda2
        self.num_hidden_generator = num_hidden_generator
        self.num_hidden_discriminator = num_hidden_discriminator
        self.train = train_epochs
        self.test = test_epochs
        self.batchsize = batchsize
        self.losstype = losstype
        self.dagstart = dagstart
        self.dagloss = dagloss
        self.dagpenalization = dagpenalization
        self.dagpenalization_increase = dagpenalization_increase
        self.hidden_layers = hidden_layers
        self.njobs = SETTINGS.get_default(njobs=njobs)
        self.gpus = SETTINGS.get_default(gpu=gpus)
        self.verbose = SETTINGS.get_default(verbose=verbose)
        self.nruns = nruns

    def predict(self, data, graph=None,
                return_list_results=False):
        """Execute SAM on a dataset given a skeleton or not.
        Args:
            data (pandas.DataFrame): Observational data for estimation of causal relationships by SAM
            graph (numpy.ndarray): A priori knowledge about the causal relationships, as an adjacency matrix.
                      Either directed or undirected links can be provided.
        Returns:
            networkx.DiGraph: Graph estimated by SAM, where A[i,j] is the term
            of the ith variable for the jth generator.
        """
        if graph is not None:
            # graph is expected to be an adjacency matrix (numpy.ndarray)
            skeleton = torch.Tensor(graph)
        else:
            skeleton = None

        assert self.nruns > 0

        results = [run_SAM(data, skeleton=skeleton,
                               lr_gen=self.lr,
                               lr_disc=self.dlr,
                               tqdm=self.verbose,
                               lambda1=self.lambda1, lambda2=self.lambda2,
                               num_hidden_generator=self.num_hidden_generator, 
                               num_hidden_discriminator=self.num_hidden_discriminator,
                               train=self.train,
                               test=self.test, batch_size=self.batchsize,
                               dagstart=self.dagstart,
                               dagloss=self.dagloss,
                               dagpenalization=self.dagpenalization,
                               dagpenalization_increase=self.dagpenalization_increase,
                               losstype=self.losstype,
                               hidden_layers=self.hidden_layers,
                               ) for i in range(self.nruns)]

        list_out = [i for i in results if not np.isnan(i).any()]

        try:
            assert len(list_out) > 0
        except AssertionError as e:
            print("All solutions contain NaNs")
            raise(e)

        W = sum(list_out)/len(list_out)

        return nx.relabel_nodes(nx.DiGraph(W),
                                {idx: i for idx,
                                 i in enumerate(data.columns)})

    def orient_directed_graph(self, *args, **kwargs):
        """Orient a (partially directed) graph."""
        return self.predict(*args, **kwargs)
        # undirected? check parent/child

    def orient_undirected_graph(self, *args, **kwargs):
        """Orient a undirected graph."""
        return self.predict(*args, **kwargs)

    def create_graph_from_data(self, *args, **kwargs):
        """Estimate a causal graph out of observational data."""
        return self.predict(*args, **kwargs)

class SAM_generators(Module):
    """
    Args:
        data_shape (tuple): Shape of the true data
        num_hidden (int): Initial number of hidden units in the hidden layers
    """
    def __init__(self, data_shape, num_hidden, skeleton=None):
        super(SAM_generators, self).__init__()

        num_vars = data_shape[1]
        self.num_vars = num_vars
        # Build fully connected skeleton
        if skeleton is None:
            skeleton = 1 - torch.eye(num_vars + 1, num_vars)
        else:
            # Append a row of ones for the noise variable (concatenate along dim 0)
            skeleton = torch.cat([torch.Tensor(skeleton), torch.ones(1, num_vars)], 0)
        self.register_buffer('skeleton', skeleton)
        # Build layers
        print(num_vars, num_hidden)
        self.input_layer = Linear3D(num_vars, num_vars+1, num_hidden)

        layers = []
        layers.append(ChannelBatchNorm1d(num_vars, num_hidden))
        layers.append(torch.nn.Tanh())
        self.layers = torch.nn.Sequential(*layers)

        self.output_layer = Linear3D(num_vars, num_hidden, 1)

    def forward(self, data, noise, adj_matrix, drawn_neurons=None):
        """
        Args:
            data: True data
            noise: Samples of noise variables
            adj_matrix: Sampled adjacency matrix
            drawn_neurons: Sampled matrix of active neurons
        Returns:
            torch.Tensor: Batch of generated data
        """

        output = self.output_layer(self.layers(self.input_layer(data, noise, adj_matrix * self.skeleton)), drawn_neurons)
        return output.squeeze(2)

    def reset_parameters(self):
        self.output_layer.reset_parameters()
        for layer in self.layers:
            if hasattr(layer, 'reset_parameters'):
                layer.reset_parameters()
        self.input_layer.reset_parameters()


Hi, I get an error while using the SAM algorithm. My original data has shape (1000 rows x 313 columns), the batch size is 10, and I am trying to infer the causal structure with SAM. The error is shown above.

The code worked a few months ago, so I am confused about why it now complains that the batch size (10) must match the intermediate dimension (314, i.e. 313 + 1).
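
For reference, the 314 seems to come from the skeleton built in SAM_generators above, which appends one noise row to the 313 variables; a quick shape check under that assumption:

import torch

num_vars = 313
# Fully connected skeleton as built in SAM_generators.__init__:
skeleton = 1 - torch.eye(num_vars + 1, num_vars)
print(skeleton.shape)  # torch.Size([314, 313]) -- the 314 in the error message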

I tried 1) transposing the data itself before predicting, which still failed with the same error but different dimension numbers, and 2) changing the batch size to 314 (I know this shouldn't be the fix).

Can someone help? (Sorry if I'm asking something very basic.)

diviyank commented 2 years ago

Hello,

This is strange, but the implementation of SAM should change soon to match the new version of the paper. I'll keep you updated! In the meantime, do you have sample data on which the error occurs?
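
If the real data can't be shared, a synthetic stand-in with the same shape would already help; for example (random values and hypothetical column names, no causal structure, for shape debugging only):

import numpy as np
import pandas as pd

# Random stand-in matching the reported shape (1000 rows x 313 columns).
rng = np.random.default_rng(0)
data = pd.DataFrame(rng.standard_normal((1000, 313)),
                    columns=[f"V{i}" for i in range(313)])
# Fully connected a priori skeleton (no self-loops), as an adjacency matrix:
skeleton_g = np.ones((313, 313)) - np.eye(313)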

insookim43 commented 2 years ago

Thanks for the reply. I think the SAM algorithm itself is okay; the problem seems to come from one of the dependency libraries. I might be wrong, but I believe the PyTorch version I used caused the issue. Downgrading it to an older version worked!
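
(The exact versions involved aren't recorded here; as an illustration, one can make the environment assumption explicit before a long run. The "<1.5" bound below is a guess, not a tested cutoff:)

import torch
from packaging import version

# The thread does not record which PyTorch release worked; "<1.5" is an
# illustrative bound only. Fail fast instead of deep inside run_SAM.
assert version.parse(torch.__version__) < version.parse("1.5"), (
    f"SAM_handson reportedly breaks on torch {torch.__version__}; "
    "try pinning an older release, e.g. pip install 'torch<1.5'")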

diviyank commented 2 years ago

Okay, noted, thanks. We should test against newer PyTorch versions so that it keeps working.
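
A minimal smoke test along those lines (hypothetical, assuming the notebook's SAM_handson definition is in scope; this is not part of the repository's test suite):

import numpy as np
import pandas as pd

def test_sam_handson_smoke():
    # Tiny random dataset: enough to exercise the tensor shapes without a
    # long training run. Column names are arbitrary.
    rng = np.random.default_rng(0)
    data = pd.DataFrame(rng.standard_normal((50, 5)), columns=list("ABCDE"))
    obj = SAM_handson(train_epochs=2, test_epochs=1, batchsize=10,
                      dagloss=True, nruns=1)
    graph = obj.predict(data)
    assert graph.number_of_nodes() == 5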