Problem when running gcn_reddit_sample.cfg

xuziweiwh commented 1 month ago

Hello, I have a question about the two example configurations mentioned in the README, gcn_reddit_sample.cfg and gcn_cora_sample.cfg. The latter runs normally, but the former fails with a path error. Should I download the dataset myself from the internet? Will running these examples produce any graphs? I apologize for the many questions; I am new to this. Looking forward to your reply.🤓🤓🤓 The error is as follows: nts:/home/xxx/desktop/Sample-based-GNN-main/dep/gemini/filesystem.hpp:34: long int file_size(std::string): Assertion failed.stat(filename.c_str(),&st) == 0.

AiX-im commented 1 month ago

Thank you for your interest in our work. The code repository contains only the Cora dataset.

NeutronOrch needs 4 dataset files to run:

your_dataset.edge, a binary edge list file that stores the graph structure.
your_dataset.feat, which contains the features of each node; the first number on each line is the node number, followed by that node's features.
your_dataset.label, which contains the label of each node; the first number on each line is the node number, followed by that node's class number.
your_dataset.mask, which contains the mask of each node; the first number on each line is the node number, followed by that node's mask (train, val, test).
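The three text formats described above can be illustrated with a minimal Python sketch. It is not the repository's data/generate_nts_dataset.py; the function name `write_text_dataset` and the single-space separator are assumptions made for illustration, and the binary .edge file is left out because its exact layout is not described in this thread.

```python
def write_text_dataset(prefix, features, labels, masks):
    """Write <prefix>.feat, <prefix>.label and <prefix>.mask as plain text.

    Every line starts with the node number, followed by that node's values.
    Assumption: values are separated by single spaces.
    """
    n = len(features)
    if len(labels) != n or len(masks) != n:
        raise ValueError("features, labels and masks need one entry per node")

    # Node number followed by the node's feature values.
    with open(f"{prefix}.feat", "w") as f:
        for node, row in enumerate(features):
            f.write(" ".join([str(node)] + [str(x) for x in row]) + "\n")

    # Node number followed by the node's class number.
    with open(f"{prefix}.label", "w") as f:
        for node, label in enumerate(labels):
            f.write(f"{node} {label}\n")

    # Node number followed by the node's mask name (train, val, test).
    with open(f"{prefix}.mask", "w") as f:
        for node, mask in enumerate(masks):
            f.write(f"{node} {mask}\n")
```

For example, `write_text_dataset("toy", [[0.5, 1.0], [0.0, 2.0]], [3, 1], ["train", "test"])` would create toy.feat, toy.label and toy.mask in the current directory.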

We provide a Python script to convert some commonly used datasets; please refer to data/generate_nts_dataset.py for details.

If you have any other questions, please let us know.
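The assertion in the original report fails when stat() cannot find the file it is given, so checking that all four files exist before launching can save a run. The sketch below is an illustration only: the helper name `missing_dataset_files` is hypothetical, and it assumes the four files share one path prefix, as the your_dataset.* names above suggest.

```python
import os

# The four file suffixes listed in the comment above.
REQUIRED_SUFFIXES = (".edge", ".feat", ".label", ".mask")


def missing_dataset_files(prefix):
    """Return the required dataset files that do not exist for this prefix."""
    return [prefix + s for s in REQUIRED_SUFFIXES if not os.path.isfile(prefix + s)]


if __name__ == "__main__":
    # "data/reddit" is a placeholder prefix, not a path taken from the repository.
    for path in missing_dataset_files("data/reddit"):
        print("missing:", path)
```

An empty result means every file is present; any path printed is one that a stat() call on it would fail to find.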

xuziweiwh commented 1 month ago

Thank you for your help. I successfully generated several datasets using your method. However, when I checked their contents, I found what look like errors. For example, after regenerating the Cora dataset, its mask file contains entries marked "unknown", which I would not expect in a mask file. I regenerated the dataset several times and the entries are still there. Could this be a bug in generate_nts_dataset.py? The entries are as follows: 640 unknown 641 unknown......1707 unknown. I look forward to your reply!😊😊😊

Sanzo00 commented 1 month ago

Hi, this is normal behavior and does not affect the program's execution. We use DGL and OGB to download GNN datasets and convert them into the format required by NeutronOrch using generate_nts_dataset.py.

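Not all vertices in the Cora dataset are assigned to the train, validation, or test split. Please refer to the dataset description here: CoraGraphDataset (Train: 140, Valid: 500, Test: 1000). Cora has 2,708 vertices, so the remaining 1,068 (vertices 640 through 1707 in your output) belong to none of the three splits.

For these vertices, we manually mark the mask as "unknown". You can check the specific code here: code link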
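To illustrate how "unknown" entries can arise, here is a minimal sketch of the marking logic. It is not the code from generate_nts_dataset.py; the function name `mask_names` is hypothetical, and the Cora index ranges in the example are assumptions chosen to match the split sizes and the 640 to 1707 range quoted in this thread.

```python
def mask_names(train_mask, val_mask, test_mask):
    """Map three boolean masks to one name per vertex.

    A vertex that is in none of the three splits is marked "unknown".
    """
    names = []
    for is_train, is_val, is_test in zip(train_mask, val_mask, test_mask):
        if is_train:
            names.append("train")
        elif is_val:
            names.append("val")
        elif is_test:
            names.append("test")
        else:
            names.append("unknown")
    return names


# Cora-sized example: 2708 vertices with 140 train, 500 val and 1000 test.
# The index ranges below are assumptions consistent with the thread's output.
n = 2708
train = [i < 140 for i in range(n)]
val = [140 <= i < 640 for i in range(n)]
test = [i >= 1708 for i in range(n)]
names = mask_names(train, val, test)
print(names.count("unknown"))  # 2708 - 140 - 500 - 1000 = 1068
```

Under these assumed ranges, the vertices marked "unknown" are exactly 640 through 1707, which is what the generated mask file shows.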

If you have any further questions, feel free to ask!

xuziweiwh commented 4 weeks ago

Hello, I would like to ask: does the output below mean the code has run successfully? Will a comparison chart be produced when the code runs? Looking forward to your reply. [Screenshot 2024-10-30 16-47-32(1)]

AiX-im commented 3 weeks ago

Yes, NTO is working fine. More detailed output is available in the log folder. You can adjust the hot-vertex computation (i.e., "CACHE_RATE") to reduce CPU computation time and improve performance.

xuziweiwh commented 3 weeks ago

Okay, thank you very much for your help in successfully running multiple datasets. Wishing you a happy life and greater academic achievements!🥳🥳🥳

AiX-im / Sample-based-GNN

Problem when running gcn_reddit_sample.cfg #4