muhanzhang / pytorch_DGCNN

PyTorch implementation of DGCNN
MIT License

Some errors when processing DGCNN #33

Closed AllenWu18 closed 3 years ago

AllenWu18 commented 4 years ago

Hi Muhan, could you help me with two errors? When I followed the README and ran `make -j4` under the "lib" directory, it only printed: Nothing to be done for `all'.

And when I ran the command `./run_DGCNN.sh`, it reported the following error:

```
Traceback (most recent call last):
  File "main.py", line 14, in <module>
    from DGCNN_embedding import DGCNN
  File "/Users/jishilun/Desktop/DGCNN_official/DGCNN_embedding.py", line 17, in <module>
    from gnn_lib import GNNLIB
  File "/Users/jishilun/Desktop/DGCNN_official/lib/gnn_lib.py", line 87, in <module>
    GNNLIB = _gnn_lib(sys.argv)
  File "/Users/jishilun/Desktop/DGCNN_official/lib/gnn_lib.py", line 12, in __init__
    self.lib = ctypes.CDLL('%s/build/dll/libgnn.so' % dir_path)
  File "/anaconda3/lib/python3.7/ctypes/__init__.py", line 356, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: dlopen(/Users/jishilun/Desktop/DGCNN_official/lib/build/dll/libgnn.so, 6): no suitable image found. Did find:
    /Users/jishilun/Desktop/DGCNN_official/lib/build/dll/libgnn.so: unknown file type, first eight bytes: 0x7F 0x45 0x4C 0x46 0x02 0x01 0x01 0x00
    /Users/jishilun/Desktop/DGCNN_official/lib/build/dll/libgnn.so: unknown file type, first eight bytes: 0x7F 0x45 0x4C 0x46 0x02 0x01 0x01 0x00
```

Those are the errors, and I cannot resolve them myself. Can you tell me where the problem is? Thank you very much!

muhanzhang commented 4 years ago

It seems that you are installing on Windows. Please see https://github.com/muhanzhang/pytorch_DGCNN/issues/27.

AllenWu18 commented 4 years ago

Thank you, but I am running the code on macOS. :(

muhanzhang commented 4 years ago

I see. Can you try "make clean" and then "make -j4" again? Thanks.
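(For context: the first eight bytes in the error above, 0x7F 0x45 0x4C 0x46, are the ELF magic number, i.e. the shipped libgnn.so was compiled for Linux, which macOS's dlopen cannot load; `make clean` followed by `make -j4` replaces it with a native Mach-O build. Below is a minimal diagnostic sketch, not part of the repo, assuming the lib/build/dll/libgnn.so path from the traceback:)

```python
# which_binary.py -- a minimal diagnostic sketch (hypothetical helper): check
# whether the compiled libgnn.so is a Linux ELF or a macOS Mach-O binary.
LIB = "lib/build/dll/libgnn.so"  # path taken from the traceback above

with open(LIB, "rb") as f:
    head = f.read(4)

if head == b"\x7fELF":  # 0x7F 0x45 0x4C 0x46 -- exactly the bytes in the error
    print("ELF (Linux) binary; rebuild on this machine with make clean && make -j4")
elif head in (b"\xcf\xfa\xed\xfe", b"\xce\xfa\xed\xfe"):  # 64-/32-bit Mach-O magic
    print("Mach-O (macOS) binary; ctypes.CDLL should be able to load it")
else:
    print("unrecognized format:", head.hex())
```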

AllenWu18 commented 4 years ago

It works! Thank you :) May I ask another question? Since I run the code on my Mac, which has no GPU to speed up training, I want to move to another server that does have a GPU. Can I just copy the already-built code to that server, or do I need to rebuild it?

muhanzhang commented 4 years ago

Yes, another "make clean" and "make -j4" should rebuild it on another machine.

AllenWu18 commented 4 years ago

Thank you very much! It works :)

AllenWu18 commented 4 years ago

OK, I have another question. For certain reasons I have to run main.py directly, so I copied some hyper-parameters from run_DGCNN.sh into util.py and set them as follows:

```python
import argparse

cmd_opt = argparse.ArgumentParser(description='Argparser for graph_classification')
cmd_opt.add_argument('-mode', default='cpu', help='cpu/gpu')
cmd_opt.add_argument('-gm', default='DGCNN', help='gnn model to use')
cmd_opt.add_argument('-data', default='MUTAG', help='data folder name')
cmd_opt.add_argument('-batch_size', type=int, default=50, help='minibatch size')
cmd_opt.add_argument('-seed', type=int, default=1, help='seed')
cmd_opt.add_argument('-feat_dim', type=int, default=0, help='dimension of discrete node feature (maximum node tag)')
cmd_opt.add_argument('-edge_feat_dim', type=int, default=0, help='dimension of edge features')
cmd_opt.add_argument('-num_class', type=int, default=0, help='#classes')
cmd_opt.add_argument('-fold', type=int, default=1, help='fold (1..10)')
cmd_opt.add_argument('-test_number', type=int, default=0, help='if specified, will overwrite -fold and use the last -test_number graphs as testing data')
cmd_opt.add_argument('-num_epochs', type=int, default=300, help='number of epochs')
cmd_opt.add_argument('-latent_dim', type=str, default='32-32-32-1', help='dimension(s) of latent layers')
cmd_opt.add_argument('-sortpooling_k', type=float, default=0.6, help='number of nodes kept after SortPooling')
cmd_opt.add_argument('-conv1d_activation', type=str, default='ReLU', help='which nn activation layer to use')
cmd_opt.add_argument('-out_dim', type=int, default=1024, help='graph embedding output size')
cmd_opt.add_argument('-hidden', type=int, default=128, help='dimension of mlp hidden layer')
cmd_opt.add_argument('-max_lv', type=int, default=4, help='max rounds of message passing')
cmd_opt.add_argument('-learning_rate', type=float, default=0.0001, help='init learning_rate')
cmd_opt.add_argument('-dropout', type=bool, default=True, help='whether add dropout after dense layer')
cmd_opt.add_argument('-printAUC', type=bool, default=False, help='whether to print AUC (for binary classification only)')
cmd_opt.add_argument('-extract_features', type=bool, default=False, help='whether to extract final graph features')
```

Then I ran main.py. What surprised me is that it finished in only about 35 seconds. I would expect deep learning code to run much longer, so did something go wrong?
I copied part of the output log as follows:

```
loss: 0.15370 acc: 0.98000: 100%|██████████| 3/3 [00:00<00:00, 32.33batch/s]
loss: 0.81523 acc: 0.72222: 100%|██████████| 1/1 [00:00<00:00, 135.93batch/s]
average training of epoch 292: loss 0.13054 acc 0.96000 auc 0.00000
average test of epoch 292: loss 0.81523 acc 0.72222 auc 0.00000
average training of epoch 293: loss 0.14049 acc 0.95333 auc 0.00000
average test of epoch 293: loss 0.83672 acc 0.72222 auc 0.00000
average training of epoch 294: loss 0.12150 acc 0.97333 auc 0.00000
average test of epoch 294: loss 0.79635 acc 0.72222 auc 0.00000
average training of epoch 295: loss 0.14037 acc 0.94667 auc 0.00000
average test of epoch 295: loss 0.78103 acc 0.72222 auc 0.00000
average training of epoch 296: loss 0.12729 acc 0.96000 auc 0.00000
average test of epoch 296: loss 0.79782 acc 0.77778 auc 0.00000
average training of epoch 297: loss 0.13894 acc 0.95333 auc 0.00000
average test of epoch 297: loss 0.83252 acc 0.77778 auc 0.00000
average training of epoch 298: loss 0.12762 acc 0.95333 auc 0.00000
average test of epoch 298: loss 0.77952 acc 0.77778 auc 0.00000
average training of epoch 299: loss 0.13316 acc 0.95333 auc 0.00000
average test of epoch 299: loss 0.78236 acc 0.77778 auc 0.00000
The total cost is 34.45206904411316
```

I only ran the experiments on the MUTAG dataset. Thanks!
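(As an aside, these settings need not be hard-coded: argparse reads sys.argv, so the same flags can be passed on the command line instead of editing util.py. A minimal usage sketch, continuing from the cmd_opt parser quoted above; the override values here are purely illustrative:)

```python
# Flags given on the command line override the defaults in util.py, e.g.
#   python main.py -data MUTAG -fold 2 -num_epochs 100
# The same overrides can be exercised programmatically for a quick check:
args = cmd_opt.parse_args(["-data", "MUTAG", "-fold", "2", "-num_epochs", "100"])
print(args.data, args.fold, args.num_epochs)  # -> MUTAG 2 100
```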

AllenWu18 commented 4 years ago

Oh, I forgot to mention that I run the code on a MacBook Pro with only a CPU available. That is what puzzles me.

muhanzhang commented 4 years ago

That's normal. MUTAG is a rather small dataset with only ~170 graphs. And you are only running with fold 1. You should do cross-validation over all 10 folds to get the average accuracy.
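(On the timing question, a quick sanity check using only numbers from the quoted log, 3 training batches and 1 test batch per epoch at roughly 32 and 135 batch/s over 300 epochs, lands near the reported total of 34.45 s:)

```python
# Back-of-the-envelope check that ~35 s is plausible for MUTAG on a CPU.
train_batches, test_batches = 3, 1    # per epoch, from the tqdm counters above
train_rate, test_rate = 32.0, 135.0   # batches per second, from the tqdm output
epochs = 300
print(epochs * (train_batches / train_rate + test_batches / test_rate))  # ~30.3 s
```

(And a minimal sketch of the 10-fold loop, as a hypothetical driver script; run_DGCNN.sh can automate the same idea, and collecting each fold's printed test accuracy is left to the reader:)

```python
# cross_val.py -- hypothetical helper: run main.py once per predefined fold.
import subprocess

for fold in range(1, 11):
    print(f"=== fold {fold} ===")
    subprocess.run(
        ["python", "main.py", "-data", "MUTAG", "-fold", str(fold)],
        check=True,
    )
```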

AllenWu18 commented 4 years ago

OK, I got it. Thx! :)