tvayer / PSCN

A Python implementation of the Patchy-San Convolutional Network for graphs

reshape error #3

Closed Wangzhen-kris closed 5 years ago

Wangzhen-kris commented 5 years ago

Hi, thanks for your great work! I set PSCN(w=25,k=3,epochs=10,batch_size=32,verbose=2,attr_dim=100), but I got the following error:

--> 157             train.append(np.array(result).reshape(self.k*self.w,self.attr_dim))
    158         X_preprocessed=np.array(train)
    159         end=time.time()

ValueError: cannot reshape array of size 7400 into shape (75,100)
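The numbers in the message already narrow things down: with w=25, k=3 and attr_dim=100 the reshape expects 75 * 100 = 7500 values, and 7400 is exactly 100 short. A minimal sketch that reproduces the same message with plain NumPy (the zero-filled array is made up; only the sizes come from the report above):

```python
import numpy as np

w, k, attr_dim = 25, 3, 100
expected = k * w * attr_dim  # number of values the reshape needs

# Stand-in for the flattened `result`; its size is taken from the error message.
flat = np.zeros(7400)

try:
    flat.reshape(k * w, attr_dim)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

missing = expected - flat.size  # how many values are absent
print(error_message)
print("missing values:", missing)
```

A shortfall equal to one attribute vector is consistent with either a missing row or attribute vectors of uneven length, so it does not identify the cause on its own.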

I tried to solve this problem, but without success. Could you help me?

tvayer commented 5 years ago

Hello, can I have more details: which dataset are you using? How do you load the data?

Wangzhen-kris commented 5 years ago

Thanks for your reply, and sorry for not replying sooner! I use my own dataset, which includes the adjacency matrix of each graph and the feature matrix of its nodes. I then use your code to convert the data into a suitable format. But my data has the following drawback: not all nodes of a graph are connected to each other. For example, a graph has 30 nodes, 25 of which are connected to each other, while the remaining 5 are not connected to those 25. So I wonder whether the problem is related to this?

tvayer commented 5 years ago

OK, I don't think this is the problem, because I have already used the code on such graphs (e.g. the protein/enzymes datasets). I will investigate further and come back to you.

tvayer commented 5 years ago

I think I found the error: the dimension of the attribute vectors of your graphs is not fixed; some are 100-dimensional, others are not. For PSCN to work, all your attributes must have the same dimension. I pushed a new version in which an error is raised when this problem occurs, so if you get a BadAttriDimError, then some of your attributes are not 100-dimensional.
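One way to check this before handing the graphs to PSCN is to scan the attributes yourself. A rough sketch, assuming the node attributes of one graph are available as a dict from node id to vector; `find_bad_attr_dims` and the sample data are hypothetical and not part of the PSCN code:

```python
import numpy as np

def find_bad_attr_dims(node_attrs, attr_dim):
    """Return (node, actual_length) for every attribute whose length is not attr_dim."""
    bad = []
    for node, vec in node_attrs.items():
        if len(vec) != attr_dim:
            bad.append((node, len(vec)))
    return bad

# Made-up example: node 2 carries a 99-dimensional vector by mistake.
node_attrs = {0: np.zeros(100), 1: np.zeros(100), 2: np.zeros(99)}
bad_nodes = find_bad_attr_dims(node_attrs, attr_dim=100)
print(bad_nodes)
```

An empty list means every attribute has the expected length; anything else points at the nodes to fix.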

Moreover, you should also define a "dummy_value" that is also a 100-dimensional vector (in the original paper this is a trick for when the receptive field size is larger than the size of some graph; maybe this is the case for you), for example by setting the option dummy_value=np.repeat(0,100) in PSCN. I added an example with the BZR dataset in train_example.

Tell me if this fixes your error.
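To illustrate why the dummy value needs the same dimension as the real attributes, here is a small sketch of the padding idea. It is not the PSCN implementation; `pad_receptive_field` is a hypothetical helper, and the data is made up:

```python
import numpy as np

def pad_receptive_field(vectors, size, dummy_value):
    """Stack up to `size` attribute vectors, filling missing rows with dummy_value."""
    dummy = np.asarray(dummy_value, dtype=float)
    rows = [np.asarray(v, dtype=float) for v in vectors[:size]]
    rows += [dummy] * (size - len(rows))  # pad when fewer than `size` nodes exist
    # np.stack raises ValueError if the rows do not all share one shape.
    return np.stack(rows)

attr_dim = 100
vectors = [np.ones(attr_dim), np.ones(attr_dim)]  # only 2 nodes available
padded = pad_receptive_field(vectors, size=3, dummy_value=np.repeat(0, attr_dim))
print(padded.shape)
```

With a dummy value of any other length, the rows can no longer be stacked into a (size, attr_dim) array, which is the same kind of shape failure as in the reshape above.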

Wangzhen-kris commented 5 years ago

Thank you for your help. Sorry... I've set self.dummy_value=[-1] * attr_dim in the file pscn.py, but the problem remains. I also added a list variable named result in def process_data_test(self,X,y=None):

..........
..............
.................
self.times_process_details['labeling_procedure'].append(np.sum(rfMaker.all_times['labeling_procedure']))
self.times_process_details['first_labeling_procedure'].append(np.sum(rfMaker.all_times['first_labeling_procedure']))
result = []
for x in forcnn:
    for y in x:
        for z in y:
            result.append(z)

The lengths of x, y and z should be 25, 3 and 100 respectively. But sometimes the length of x is 2. This is the cause of the problem, but I don't know how to modify the code.
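Rather than inspecting lengths by hand, the nested structure can be walked once and every level that deviates recorded. A sketch, assuming forcnn is a nested list with the three levels described above; `find_length_mismatches` is a hypothetical helper, and the tiny sizes are only for the demo:

```python
def find_length_mismatches(forcnn, w, k, attr_dim):
    """Return (index_path, actual_length, expected_length) for each level that is off."""
    problems = []
    for i, x in enumerate(forcnn):
        if len(x) != w:
            problems.append(((i,), len(x), w))
        for j, y in enumerate(x):
            if len(y) != k:
                problems.append(((i, j), len(y), k))
            for m, z in enumerate(y):
                if len(z) != attr_dim:
                    problems.append(((i, j, m), len(z), attr_dim))
    return problems

# Made-up data with w=2, k=2, attr_dim=3; the second entry has one field too few.
good = [[[0, 0, 0], [0, 0, 0]], [[0, 0, 0], [0, 0, 0]]]
short = [[[0, 0, 0], [0, 0, 0]]]
problems = find_length_mismatches([good, short], w=2, k=2, attr_dim=3)
print(problems)
```

The index path shows which graph, receptive field, or node is affected. Note also that the loop variable y in the snippet above reuses the name of the y parameter of process_data_test, so a different name is safer there.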

tvayer commented 5 years ago

Hello

Did you check that all your node features are 100-dimensional? Did you try running the code with the latest version from yesterday? What error do you get?

Also, dummy_value should be set when instantiating PSCN, not inside pscn.py (see the example in the notebook).

Wangzhen-kris commented 5 years ago

Yes, I'm sure all my node features are 100 dimensional. Then I will run your newly released code and come back to you. Thank you very much!

Wangzhen-kris commented 5 years ago

You're right. I found my mistake by running your latest code. I apologize for drawing conclusions without running the code. And thank you again for your help!