dmlc / dgl

Python package built to ease deep learning on graphs, on top of existing DL frameworks.
http://dgl.ai
Apache License 2.0
13.55k stars, 3.02k forks

[DistGB] `dispatch_data.py` with `--use-graphbolt` crashed #7816

Open Rhett-Ying opened 1 month ago

Rhett-Ying commented 1 month ago

🐛 Bug

To Reproduce

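Steps to reproduce the behavior: run `dispatch_data.py` with `--use-graphbolt` (here on the dataset at `/data/ml-100k`). The run produces the following output: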

Warning: Permanently added '[127.0.0.1]:2222' (ED25519) to the list of known hosts.
[5ea156f179a8 INFO 2024-10-03 23:29:47,836 PID:2471] [Rank: 0] Done with process group initialization...
[5ea156f179a8 INFO 2024-10-03 23:29:47,837 PID:2471] [Rank: 0] Starting distributed data processing pipeline...
[5ea156f179a8 INFO 2024-10-03 23:29:47,841 PID:2471] [Rank: 0] Initialized metis partitions and node_types map...
[5ea156f179a8 INFO 2024-10-03 23:29:47,862 PID:2471] [Rank: 0] Done reading dataset /data/ml-100k
[5ea156f179a8 INFO 2024-10-03 23:29:47,862 PID:2471] [Rank: 0] Done augmenting file input data with auxilary columns
[5ea156f179a8 INFO 2024-10-03 23:29:47,895 PID:2471] [Rank: 0] Total time for feature exchange: 0:00:00.032553
[5ea156f179a8 INFO 2024-10-03 23:29:47,895 PID:2471] [Rank: 0] Total time for feature exchange: 0:00:00.000001
[5ea156f179a8 INFO 2024-10-03 23:29:47,931 PID:2471] [Rank: 0] Time to send/rcv edge data: 0:00:00.032726
[5ea156f179a8 INFO 2024-10-03 23:29:48,497 PID:2471] There are 48071 edges in partition 0
[Rank: 0 Edge data is already sorted !!!
[rank0]: Traceback (most recent call last):
[rank0]:   File "/root/dgl/tools/distpartitioning/data_proc_pipeline.py", line 134, in <module>
[rank0]:     multi_machine_run(params)
[rank0]:   File "/root/dgl/tools/distpartitioning/data_shuffle.py", line 1499, in multi_machine_run
[rank0]:     gen_dist_partitions(rank, params.world_size, params)
[rank0]:   File "/root/dgl/tools/distpartitioning/data_shuffle.py", line 1327, in gen_dist_partitions
[rank0]:     ) = create_graph_object(
[rank0]:   File "/root/dgl/tools/distpartitioning/convert_partition.py", line 714, in create_graph_object
[rank0]:     indptr, indices, csc_edge_ids = _process_partition_gb(
[rank0]:   File "/root/dgl/tools/distpartitioning/convert_partition.py", line 355, in _process_partition_gb
[rank0]:     return indptr, indices[sorted_idx], edge_ids[sorted_idx]
[rank0]: UnboundLocalError: local variable 'sorted_idx' referenced before assignment
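
The log line "Edge data is already sorted" appears immediately before the `UnboundLocalError`. That ordering is consistent with `sorted_idx` being assigned only on the branch that actually sorts, although this has not been confirmed against `convert_partition.py`. The following is a minimal, self-contained sketch of that failure pattern and one possible guard. The function names `is_sorted`, `reorder_buggy`, and `reorder_fixed` are hypothetical and are not taken from the DGL source.

```python
def is_sorted(keys):
    """Return True if keys are in non-decreasing order."""
    return all(a <= b for a, b in zip(keys, keys[1:]))


def reorder_buggy(keys, values):
    # Hypothetical stand-in for the failing code path (assumption, not DGL code):
    # sorted_idx is assigned only when the input needs sorting.
    if not is_sorted(keys):
        sorted_idx = sorted(range(len(keys)), key=keys.__getitem__)
    # When keys are already sorted, sorted_idx was never bound,
    # so this line raises UnboundLocalError.
    return [values[i] for i in sorted_idx]


def reorder_fixed(keys, values):
    # One possible guard: fall back to the identity permutation
    # when the input is already sorted.
    if is_sorted(keys):
        sorted_idx = list(range(len(keys)))
    else:
        sorted_idx = sorted(range(len(keys)), key=keys.__getitem__)
    return [values[i] for i in sorted_idx]


if __name__ == "__main__":
    keys, values = [0, 1, 2], ["a", "b", "c"]  # already sorted input
    try:
        reorder_buggy(keys, values)
    except UnboundLocalError as exc:
        print("buggy path failed:", type(exc).__name__)
    print("fixed path:", reorder_fixed(keys, values))
```

An alternative guard would be to return the arrays unindexed when no sort is needed, which avoids building an identity permutation. Which option fits `_process_partition_gb` depends on code not shown in this report.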

Expected behavior

Environment

Additional context

github-actions[bot] commented 2 weeks ago

This issue has been automatically marked as stale due to lack of activity. It will be closed if no further activity occurs. Thank you.