jwcalder / GraphLearning

Python package for graph-based clustering and semi-supervised learning
MIT License
85 stars, 26 forks

RuntimeError: The expanded size of the tensor (1) must match the existing size (10) at non-singleton dimension 1. Target sizes: [500, 1]. Tensor sizes: [500, 10] #4

Closed teodorf-bit closed 1 year ago

teodorf-bit commented 1 year ago

When I install my own dataset I get the following error:

RuntimeError: The expanded size of the tensor (1) must match the existing size (10) at non-singleton dimension 1. Target sizes: [500, 1]. Tensor sizes: [500, 10]

jwcalder commented 1 year ago

Can you provide a simple example of code that reproduces this?

teodorf-bit commented 1 year ago

Okay, so I tried different things, but I got this error:

Downloading https://github.com/jwcalder/GraphLearning/raw/master/Data/MNIST_labels.npz to /graphlearning/fashionmnist/data/mnist_labels.npz...
Downloading https://github.com/jwcalder/GraphLearning/raw/master/kNNData/MNIST_raw.npz to /graphlearning/fashionmnist/knn_data/mnist_raw.npz...
/home/ubuntu/.local/lib/python3.10/site-packages/graphlearning/ssl.py:205: RuntimeWarning: overflow encountered in divide
  self.weights = self.weights/self.weights[0]
/home/ubuntu/.local/lib/python3.10/site-packages/graphlearning/ssl.py:262: RuntimeWarning: invalid value encountered in multiply
  pred_labels = np.argmax(scores*w,axis=1)
Traceback (most recent call last):
  File "/graphlearning/fashionmnist/fashionmnist.py", line 44, in
    pred_labels = model.fit_predict(train_ind,train_labels,all_labels=labels)
  File "/home/ubuntu/.local/lib/python3.10/site-packages/graphlearning/ssl.py", line 289, in fit_predict
    self.fit(train_ind, train_labels, all_labels=all_labels)
  File "/home/ubuntu/.local/lib/python3.10/site-packages/graphlearning/ssl.py", line 479, in fit
    self.volume_label_projection()
  File "/home/ubuntu/.local/lib/python3.10/site-packages/graphlearning/ssl.py", line 202, in volume_label_projection
    grad = class_size - self.class_priors
ValueError: operands could not be broadcast together with shapes (3,) (10,)

when I ran

import graphlearning as gl
import pandas as pd
import csv
import numpy as np
import time as t

total_labels = 5400
algorithms = ["amle",
              "centered_kernel",
              #"dynamic_label_propagation",  # This cannot be used on large datasets
              #"modularity_mbo",  # Test this one later.
              #"peikonal",
              #"poisson_mbo",
              #"randomwalk",
              #"sparse_label_propagation",
              #"graph_nearest_neighbor",
              #"laplace",
              #"plaplace",
              #"laplace_wnll",
              #"laplace_poisson",
              #"poisson",
              #"volume_mbo"
             ]

ratios = [0.1, 0.25, 0.5, 0.75, 0.9]
datasets = ["mnist"]

for dataset in datasets:
    labels = gl.datasets.load(datasets[0], labels_only=True)
    W = gl.weightmatrix.knn(datasets[0], 10, metric='raw')
    D = gl.weightmatrix.knn(datasets[0], 10, metric='raw', kernel='distance')
    for algorithm in algorithms:
        for seed in [0,1,2,3,4,5,6,7,8,9]:
            for ratio in ratios:
                num_train_per_class = ratio*total_labels
                train_ind = gl.trainsets.generate(labels, rate=round(num_train_per_class))
                train_labels = labels[train_ind]
                class_priors = gl.utils.class_priors(labels)
                if algorithm == "amle":
                    start = t.time()
                    model = gl.ssl.amle(W, class_priors=class_priors)
                    pred_labels = model.fit_predict(train_ind,train_labels,all_labels=labels)
                    accuracy = gl.ssl.ssl_accuracy(labels,pred_labels,len(train_ind))
                    time = t.time() - start
                elif algorithm == "centered_kernel":
                    start = t.time()
                    model = gl.ssl.centered_kernel(W, class_priors=class_priors)
                    pred_labels = model.fit_predict(train_ind,train_labels,all_labels=labels)
                    accuracy = gl.ssl.ssl_accuracy(labels,pred_labels,len(train_ind))
                    time = t.time() - start
                elif algorithm == "dynamic_label_propagation":
                    start = t.time()
                    model = gl.ssl.dynamic_label_propagation(W, class_priors=class_priors)
                    pred_labels = model.fit_predict(train_ind,train_labels,all_labels=labels)
                    accuracy = gl.ssl.ssl_accuracy(labels,pred_labels,len(train_ind))
                    time = t.time() - start
                elif algorithm == "modularity_mbo":
                    start = t.time()
                    model = gl.ssl.modularity_mbo(W, class_priors=class_priors)
                    pred_labels = model.fit_predict(train_ind,train_labels,all_labels=labels)
                    accuracy = gl.ssl.ssl_accuracy(labels,pred_labels,len(train_ind))
                    time = t.time() - start
                elif algorithm == "peikonal":
                    start = t.time()
                    model = gl.ssl.peikonal(W, class_priors=class_priors)
                    pred_labels = model.fit_predict(train_ind,train_labels,all_labels=labels)
                    accuracy = gl.ssl.ssl_accuracy(labels,pred_labels,len(train_ind))
                    time = t.time() - start
                elif algorithm == "poisson_mbo":
                    start = t.time()
                    gl.ssl.poisson_mbo(W, class_priors=class_priors, use_cuda=True)
                    pred_labels = model.fit_predict(train_ind,train_labels,all_labels=labels)
                    accuracy = gl.ssl.ssl_accuracy(labels,pred_labels,len(train_ind))
                    time = t.time() - start
                elif algorithm == "randomwalk":
                    start = t.time()
                    model = gl.ssl.randomwalk(W, class_priors=class_priors)
                    pred_labels = model.fit_predict(train_ind,train_labels,all_labels=labels)
                    accuracy = gl.ssl.ssl_accuracy(labels,pred_labels,len(train_ind))
                    time = t.time() - start
                elif algorithm == "sparse_label_propagation":
                    start = t.time()
                    model = gl.ssl.sparse_label_propagation(W, class_priors=class_priors)
                    pred_labels = model.fit_predict(train_ind,train_labels,all_labels=labels)
                    accuracy = gl.ssl.ssl_accuracy(labels,pred_labels,len(train_ind))
                    time = t.time() - start
                elif algorithm == "graph_nearest_neighbor":
                    start = t.time()
                    model = gl.ssl.graph_nearest_neighbor(D, class_priors=class_priors)
                    pred_labels = model.fit_predict(train_ind,train_labels,all_labels=labels)
                    accuracy = gl.ssl.ssl_accuracy(labels,pred_labels,len(train_ind))
                    time = t.time() - start
                elif algorithm == "laplace":
                    start = t.time()
                    model = gl.ssl.laplace(W, class_priors=class_priors)
                    pred_labels = model.fit_predict(train_ind,train_labels,all_labels=labels)
                    accuracy = gl.ssl.ssl_accuracy(labels,pred_labels,len(train_ind))
                    time = t.time() - start
                elif algorithm == "plaplace":
                    start = t.time()
                    model = gl.ssl.plaplace(W, class_priors=class_priors)
                    pred_labels = model.fit_predict(train_ind,train_labels,all_labels=labels)
                    accuracy = gl.ssl.ssl_accuracy(labels,pred_labels,len(train_ind))
                    time = t.time() - start
                elif algorithm == "laplace_wnll":
                    start = t.time()
                    model = gl.ssl.laplace(W, reweighting='wnll', class_priors=class_priors)
                    pred_labels = model.fit_predict(train_ind,train_labels,all_labels=labels)
                    accuracy = gl.ssl.ssl_accuracy(labels,pred_labels,len(train_ind))
                    time = t.time() - start
                elif algorithm == "laplace_poisson":
                    start = t.time()
                    model = gl.ssl.laplace(W, reweighting='poisson', class_priors=class_priors)
                    pred_labels = model.fit_predict(train_ind,train_labels,all_labels=labels)
                    accuracy = gl.ssl.ssl_accuracy(labels,pred_labels,len(train_ind))
                    time = t.time() - start
                elif algorithm == "poisson":
                    start = t.time()
                    model = gl.ssl.poisson(W, solver='gradient_descent', class_priors=class_priors)
                    pred_labels = model.fit_predict(train_ind,train_labels,all_labels=labels)
                    accuracy = gl.ssl.ssl_accuracy(labels,pred_labels,len(train_ind))
                    time = t.time() - start
                elif algorithm == "volume_mbo":
                    start = t.time()
                    model = gl.ssl.volume_mbo(W, class_priors=class_priors)
                    pred_labels = model.fit_predict(train_ind,train_labels,all_labels=labels)
                    accuracy = gl.ssl.ssl_accuracy(labels,pred_labels,len(train_ind))
                    time = t.time() - start
                else:
                    print(algorithm)
                    print("No Algorithm with that name")

jwcalder commented 1 year ago

Your code is very long. If you can reduce it to just a few lines that generate the same error, then I'd be happy to take a look. The error message is strange, since there are not that many places where torch tensors are used in the package. It could be the use_cuda=True in poisson_mbo. Try setting use_cuda=False and see if you still get the error.
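A reduced script along those lines might look like the sketch below. It is only an illustration, not a confirmed reproduction: it reuses the calls from the script posted above for a single algorithm (poisson_mbo) and a single training rate, and wraps them in a hypothetical helper named reproduce so that use_cuda can be toggled. The helper name, the fixed rate of 540 labels per class (0.1 × 5400 from the original loop), and passing the module in as an argument are assumptions made for this example.

```python
def reproduce(gl, use_cuda=False, rate=540):
    """Run a single poisson_mbo fit on MNIST and return the accuracy.

    Hypothetical helper (not part of graphlearning). `gl` is the imported
    graphlearning module; every gl.* call below is copied from the script
    posted earlier in this thread.
    """
    # Load labels and build the k-NN weight matrix exactly as in the post.
    labels = gl.datasets.load("mnist", labels_only=True)
    W = gl.weightmatrix.knn("mnist", 10, metric="raw")

    # One training set instead of looping over seeds and ratios.
    train_ind = gl.trainsets.generate(labels, rate=rate)
    train_labels = labels[train_ind]
    class_priors = gl.utils.class_priors(labels)

    # The only setting that changes between runs is use_cuda.
    model = gl.ssl.poisson_mbo(W, class_priors=class_priors, use_cuda=use_cuda)
    pred_labels = model.fit_predict(train_ind, train_labels, all_labels=labels)
    return gl.ssl.ssl_accuracy(labels, pred_labels, len(train_ind))
```

After `import graphlearning as gl`, calling `reproduce(gl)` runs with use_cuda=False and `reproduce(gl, use_cuda=True)` restores the original setting, so the two runs can be compared directly.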

jwcalder commented 1 year ago

If you are still interested in resolving this, please send a short bit of code that reproduces the error. Otherwise I'll close.

jwcalder commented 1 year ago

I'm closing this, but feel free to open another issue with a more concise description of the issue.
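For anyone who hits the same ValueError from volume_label_projection: the shapes (3,) and (10,) in the traceback above suggest that the per-class sizes had 3 entries while class_priors had 10, which could happen if the priors were computed from a label set with a different number of classes than the training labels. That reading is an inference from the traceback, not something confirmed in this thread. A minimal sketch of a pre-fit check, using a hypothetical helper check_class_counts that is not part of graphlearning:

```python
def check_class_counts(train_labels, class_priors):
    """Raise ValueError if labels and priors disagree on the number of classes.

    Hypothetical helper for illustration only; it is not part of graphlearning.
    Accepts any iterable of integer-like labels (lists or 1-D NumPy arrays).
    """
    classes = sorted({int(c) for c in train_labels})
    num_priors = len(class_priors)
    if len(classes) != num_priors:
        raise ValueError(
            f"train_labels contain {len(classes)} classes {classes}, "
            f"but class_priors has {num_priors} entries"
        )
    return len(classes)


# Example mirroring the shapes in the traceback: 3 classes against 10 priors.
try:
    check_class_counts([0, 1, 2, 1, 0], [0.1] * 10)
except ValueError as err:
    print(err)
```

Running a check like this on train_labels and class_priors just before fit_predict would surface a mismatch with a readable message instead of a broadcasting error deep inside the package.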