epfl-lts2 / pygsp

Graph Signal Processing in Python
https://pygsp.rtfd.io
BSD 3-Clause "New" or "Revised" License

Problem With Interpolate #61

Closed codenameAggie closed 5 years ago

codenameAggie commented 5 years ago

Howdy!

I am trying to use the reduction interpolate function, and I encounter the following:

    171     
    172     L_reg = G.L + reg_eps * sparse.eye(G.N)
--> 173     K_reg = getattr(G.mr, 'K_reg', kron_reduction(L_reg, keep_inds))
    174     green_kernel = getattr(G.mr, 'green_kernel',
    175                            filters.Filter(G, lambda x: 1. / (reg_eps + x)))

AttributeError: 'Graph' object has no attribute 'mr'

This is how I am calling it. My graph is fully connected; oneSignal is one full signal on the graph and indeces holds the indices, both as NumPy arrays.

pygsp.reduction.interpolate(graph, oneSignal, indeces)

Any suggestions?
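
For readers hitting the same AttributeError: the traceback points at getattr(G.mr, 'K_reg', ...). The default passed to getattr only covers the last attribute (K_reg), so the lookup of G.mr itself still raises when the graph has no mr attribute. A minimal sketch in plain Python, where the Graph class is a hypothetical stand-in and not pygsp.graphs.Graph:

```python
class Graph:
    """Hypothetical stand-in for a graph object that has no `mr` attribute."""


G = Graph()

# The default (None) only applies to 'K_reg'. Evaluating `G.mr` happens first
# and raises before getattr is even called.
try:
    getattr(G.mr, 'K_reg', None)
    raised = False
    message = ''
except AttributeError as err:
    raised = True
    message = str(err)

# Guarding both levels avoids the exception (an illustrative pattern only,
# not a patch to the library).
mr = getattr(G, 'mr', None)
k_reg = getattr(mr, 'K_reg', None)
```

Here raised ends up True and k_reg is None, which mirrors the first traceback: the failure is in the attribute access on the graph, not in the Kron reduction itself.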

nperraud commented 5 years ago

Hey, could you put together a small example that we could use to reproduce the error? That would help us understand the problem.

Best

codenameAggie commented 5 years ago

Howdy!

A little background:

I have cleaned a traffic dataset and reduced the raw data to a graph, represented as an adjacency matrix (NxN), plus a traffic signal stored as a DxN matrix (D >> N). Each graph signal is a 1xN array whose entry at each index is the signal value on the corresponding vertex.

For a given time step, say half of the data points (graph signal values) are missing and I want to impute them. How can I use this method to achieve this? How would I call this function? Is there an example that you can provide?

Here's how I am doing it right now:

from pygsp import reduction, graphs, filters
import pandas as pd
import numpy as np
import random

import networkx as nx
import matplotlib.pyplot as plt
random.seed(0)

# Read the data in: adj is NxN and signal is DxN, where each row holds the
# graph signal values for every vertex. The graph is connected and undirected.

adj = pd.read_csv('Oct2019_10min_450_Jan_Feb_W.csv', header=None)
signal = pd.read_csv('Oct2019_10min_450_Jan_Feb_V.csv')

# Define a mask for train/test purposes. It is not needed for this method,
# but ideally I want to get a sense of what the RMSE would be on the train set.

mask = np.random.rand(signal.shape[0], signal.shape[1]) > 0.5

signal_test = ~mask * signal
signal_train = mask * signal

# I define my graph
graph = graphs.Graph(adj.values)

# one signal (at just one timestep): an ndarray of shape (N,)
oneSignal = np.array(signal.values[4, :])

# getting the same timestep's mask - returns the indices where the samples exist. The rest is to be imputed (right?)
maskIndex = np.where(mask[0, :])[0]

graph.compute_fourier_basis()
graph.set_coordinates()
reduction.graph_multiresolution(graph, 1)
results = reduction.interpolate(graph, oneSignal, maskIndex, order=1000)

upon running this, I get the following:

"""
----> 1 results = reduction.interpolate(graph, oneSignal, maskIndex, order=1000)

c:\users\arash\anaconda3\envs\tf-gpu\lib\site-packages\pygsp\reduction.py in interpolate(G, f_subsampled, keep_inds, order, reg_eps, **kwargs)
    175                            filters.Filter(G, lambda x: 1. / (reg_eps + x)))
    176 
--> 177     alpha = K_reg.dot(f_subsampled)
    178 
    179     try:

c:\users\arash\anaconda3\envs\tf-gpu\lib\site-packages\scipy\sparse\base.py in dot(self, other)
    362 
    363         """
--> 364         return self * other
    365 
    366     def power(self, n, dtype=None):

c:\users\arash\anaconda3\envs\tf-gpu\lib\site-packages\scipy\sparse\base.py in __mul__(self, other)
    498             # dense row or column vector
    499             if other.shape != (N,) and other.shape != (N, 1):
--> 500                 raise ValueError('dimension mismatch')
    501 
    502             result = self._mul_vector(np.ravel(other))

ValueError: dimension mismatch

"""

The mask yields 233 indices, whereas the total vertex count N is 450. What do I do?

Is there an example usage?

Thank you, Arash
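
A note on the shapes in this traceback: the failing line is alpha = K_reg.dot(f_subsampled), and K_reg comes from a Kron reduction onto keep_inds, so it should be square with one row per kept vertex (233 here). The parameter name f_subsampled suggests the signal is expected on the kept vertices only, while the call above passes all 450 values. The sketch below reproduces that mismatch with SciPy alone; the identity matrix is only a stand-in for K_reg, and the indices and signal are random placeholders rather than the traffic data.

```python
import numpy as np
from scipy import sparse

n_vertices, n_kept = 450, 233

rng = np.random.default_rng(0)
# Placeholder for maskIndex: 233 distinct vertex indices out of 450.
keep_inds = np.sort(rng.choice(n_vertices, size=n_kept, replace=False))
# Placeholder for oneSignal: one value per vertex.
full_signal = rng.standard_normal(n_vertices)

# Stand-in for K_reg (assumption: a square matrix over the kept vertices).
K_reg = sparse.identity(n_kept, format='csr')

# Passing the full 450-entry signal does not match the 233 columns.
try:
    K_reg.dot(full_signal)
    mismatch = False
except ValueError:
    mismatch = True

# Subsampling the signal to the kept vertices makes the shapes agree.
alpha = K_reg.dot(full_signal[keep_inds])
```

Under that reading, passing oneSignal[maskIndex] instead of oneSignal would at least resolve the dimension mismatch; whether the rest of the call then succeeds depends on the installed PyGSP version.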

nperraud commented 5 years ago

Hi Arash,

I think the best option for you would be to use a basic interpolator based on a smoothness assumption.

We have a function that does just that (it is in master and will be part of the next release). Here is an example:

from pygsp import graphs, learning, filters
import matplotlib.pyplot as plt
import numpy as np

G = graphs.Sensor(seed=42)
G.estimate_lmax()

# Create a ground truth signal:
g = filters.Heat(G, 10)
signal = g.filter(np.random.randn(G.n_vertices))

# Construct a measurement signal from a binary mask:
rs = np.random.RandomState(42)
mask = rs.uniform(0, 1, G.n_vertices) > 0.5
measures = signal.copy()
measures[~mask] = np.nan

# Solve the regression problem by reconstructing the signal:
recovery = learning.regression_tikhonov(
    G, measures, mask, tau=0)

# Plot the results.
fig, ax = plt.subplots(1, 3, sharey=True, figsize=(10, 3))
G.plot_signal(signal, ax=ax[0], title='Ground truth')
G.plot_signal(measures, ax=ax[1], title='Measurements')
G.plot_signal(recovery, ax=ax[2], title='Recovered signal')
fig.tight_layout()

taken from https://github.com/epfl-lts2/pygsp/blob/bf4e4374fe925c6d5944b1d6016b812b8f4d3915/pygsp/learning.py#L254

Here is what you should get: [image]

Good luck
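
To make the smoothness assumption concrete, here is a NumPy-only sketch of the idea: keep the measured values fixed and choose the missing ones so that x^T L x is as small as possible, where L is the combinatorial graph Laplacian. This is an illustration of the principle, not PyGSP's implementation; the function name interpolate_smooth and the tiny path graph are made up for the example. The RMSE is computed on the held-out vertices only, which is how a masked train/test split could be scored.

```python
import numpy as np


def interpolate_smooth(W, y, mask):
    """Fill in y[~mask] by minimizing x^T L x subject to x[mask] == y[mask].

    W is a dense symmetric adjacency matrix and L = D - W its combinatorial
    Laplacian. Every unmeasured vertex must be connected (possibly through
    other unmeasured vertices) to a measured one, otherwise the system is
    singular.
    """
    L = np.diag(W.sum(axis=1)) - W
    unknown = ~mask
    x = y.astype(float).copy()
    # Setting the gradient with respect to the unknown entries to zero gives
    # L_uu x_u = -L_uk y_k.
    L_uu = L[np.ix_(unknown, unknown)]
    L_uk = L[np.ix_(unknown, mask)]
    x[unknown] = np.linalg.solve(L_uu, -L_uk @ y[mask])
    return x


# Toy data: a path graph on 6 vertices carrying a linear signal.
n = 6
W = np.zeros((n, n))
idx = np.arange(n - 1)
W[idx, idx + 1] = 1.0
W[idx + 1, idx] = 1.0

truth = np.linspace(0.0, 5.0, n)
mask = np.array([True, False, False, True, False, True])  # measured vertices
measures = np.where(mask, truth, np.nan)

recovered = interpolate_smooth(W, measures, mask)

# Score only the vertices that were hidden from the interpolator.
rmse = np.sqrt(np.mean((recovered[~mask] - truth[~mask]) ** 2))
```

On this toy graph the minimizer is plain linear interpolation between the measured vertices, so the recovered signal matches the ground truth. For a DxN signal matrix, the same call could be repeated per time step with that step's own mask row.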

nperraud commented 5 years ago

Sorry, the previous version that I posted was wrong. It should be correct now...

codenameAggie commented 5 years ago

Howdy!

Thank you so much; this is exactly what I needed! I was also using the wrong version of PyGSP: I had 0.5.1, installed with pip install. I switched to the GitHub version and have things figured out now.

Thanks again, Arash

codenameAggie commented 5 years ago

This issue is resolved now.

Thank you, Arash