manhtuando97 / KDD-20-Hypergraph

Code, datasets & supplementary for "Structural Patterns and Generative Models of Real-world Hypergraphs"

Examples.sh doesn't run #2

Open LilithHafner opened 2 years ago

LilithHafner commented 2 years ago

I suspect that the output folders are not checked into GitHub and the examples fail to create them.

Reproduce this error by cloning the repository and following the instructions in user_guide.pdf:

x@X Downloads % git clone https://github.com/manhtuando97/KDD-20-Hypergraph
Cloning into 'KDD-20-Hypergraph'...
remote: Enumerating objects: 214, done.
remote: Counting objects: 100% (52/52), done.
remote: Compressing objects: 100% (50/50), done.
remote: Total 214 (delta 22), reused 0 (delta 0), pack-reused 162
Receiving objects: 100% (214/214), 25.87 MiB | 5.41 MiB/s, done.
Resolving deltas: 100% (63/63), done.
x@X Downloads % cd KDD-20-Hypergraph
x@X KDD-20-Hypergraph % cd Code/Generator 
x@X Generator % ls
SS.py                   hyper_preferential_attachment.py    simplex per node
examples.sh             preferential_attachment.py      size distribution
x@X Generator % ./examples.sh
zsh: permission denied: ./examples.sh
x@X Generator % chmod +x ./examples.sh 
x@X Generator % ./examples.sh         
Traceback (most recent call last):
  File "hyper_preferential_attachment.py", line 441, in <module>
    main()
  File "hyper_preferential_attachment.py", line 434, in main
    generator.generate()
  File "hyper_preferential_attachment.py", line 235, in generate
    f = open(self.output_directory + "/" + file_name + ".txt", "w")
IOError: [Errno 2] No such file or directory: 'output_directory\r/DAWN.txt'
: command not found 2: 
Traceback (most recent call last):
  File "preferential_attachment.py", line 245, in <module>
    main()
  File "preferential_attachment.py", line 238, in main
    generator.generate()
  File "preferential_attachment.py", line 133, in generate
    f = open(self.output_directory + "/" + file_name + ".txt", "w")
IOError: [Errno 2] No such file or directory: 'output_directory\r/DAWN.txt'
: command not found 4: 
Traceback (most recent call last):
  File "SS.py", line 295, in <module>
    main()
  File "SS.py", line 290, in main
    generator.generate()
  File "SS.py", line 72, in generate
    f = open(CRU_file, "w")
IOError: [Errno 2] No such file or directory: 'output_directory/DAWN.txt'
x@X Generator % python hyper_preferential_attachment.py --name=DAWN --file_name=DAWN --num_nodes=3029 --simplex_per_node_directory='simplex per node' --size_distribution_directory='size distribution' --output_directory=output_directory
Traceback (most recent call last):
  File "hyper_preferential_attachment.py", line 441, in <module>
    main()
  File "hyper_preferential_attachment.py", line 434, in main
    generator.generate()
  File "hyper_preferential_attachment.py", line 235, in generate
    f = open(self.output_directory + "/" + file_name + ".txt", "w")
IOError: [Errno 2] No such file or directory: 'output_directory/DAWN.txt'
x@X Generator % 
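
A note on the log above: the `\r` in `'output_directory\r/DAWN.txt'` and the `: command not found` lines are typical of a shell script saved with Windows-style (CRLF) line endings. That is an inference from the output, not something confirmed in this thread. A minimal sketch of stripping those carriage returns, assuming the hypothetical helper names `crlf_to_lf` and `normalize_file` (neither is part of the repository):

```python
def crlf_to_lf(data):
    """Return data (bytes) with every CRLF pair replaced by a bare LF."""
    return data.replace(b"\r\n", b"\n")


def normalize_file(path):
    """Rewrite the file at path with LF line endings.

    Returns True if the file was changed, False if it was already clean.
    The helper name and behaviour are illustrative assumptions.
    """
    with open(path, "rb") as f:
        original = f.read()
    converted = crlf_to_lf(original)
    if converted == original:
        return False
    with open(path, "wb") as f:
        f.write(converted)
    return True
```

Calling `normalize_file("examples.sh")` from the Generator directory would be the intended use. Note that this addresses only the stray `\r`; the missing output directory would still need to be created.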
LilithHafner commented 2 years ago

I'm on macOS 11.4.

manhtuando97 commented 2 years ago

Please create your desired output directory and pass it with the --output_directory argument. 'output_directory' is only the default name for the location of the output files; users need to create their own output directory and specify it on the command line.
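
For readers who would rather have a script create the directory itself, here is a minimal sketch. It assumes Python 3.2 or later (for `exist_ok`), and `open_output_file` is a hypothetical helper; it is not the repository's code, nor necessarily what any PR in this thread does.

```python
import os


def open_output_file(output_directory, file_name):
    """Create output_directory if it is missing, then open <file_name>.txt for writing.

    Hypothetical helper for illustration only.
    """
    # exist_ok=True makes this a no-op when the directory already exists.
    os.makedirs(output_directory, exist_ok=True)
    path = os.path.join(output_directory, file_name + ".txt")
    return open(path, "w")
```

With this in place, `open_output_file("output_directory", "DAWN")` would succeed on a fresh clone instead of raising the "No such file or directory" error.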

LilithHafner commented 2 years ago

Thanks for the speedy reply! I did that and ran into another error:

Traceback (most recent call last):
  File "hyper_preferential_attachment.py", line 442, in <module>
    main()
  File "hyper_preferential_attachment.py", line 435, in main
    generator.generate()
  File "hyper_preferential_attachment.py", line 267, in generate
    sampled_number = np.random.choice(a = maximum_number + 1, size = 1, replace = False, p = distribution)
  File "mtrand.pyx", line 1023, in mtrand.RandomState.choice (numpy/random/mtrand/mtrand.c:7507)
ValueError: probabilities do not sum to 1

I then changed python to python3, which fixed that error (on my machine, python points to Python 2 while python3 points to Python 3, to preserve backward compatibility). Now it runs, but it's been thinking hard for quite a while without any terminal output. Any idea how long it's supposed to take to run python3 hyper_preferential_attachment.py --name=DAWN --file_name=DAWN --num_nodes=3029 --simplex_per_node_directory='simplex per node' --size_distribution_directory='size distribution' --output_directory=output_directory?
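
The thread does not say why Python 2 raised "probabilities do not sum to 1". One common cause of such a Python 2 versus Python 3 difference is that `/` between two integers is floor division in Python 2, so dividing integer counts by an integer total yields zeros. Whether that is what happened here is an assumption. A sketch of a normalization that behaves the same on both versions, with `normalize` as a hypothetical helper name:

```python
def normalize(counts):
    """Turn non-negative counts into probabilities that sum to 1.

    Dividing by float(total) forces true division on Python 2 as well,
    where int / int would otherwise be floored to 0.
    """
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must have a positive sum")
    return [c / float(total) for c in counts]
```

A list produced this way can be passed as the `p` argument of `np.random.choice`, which requires the probabilities to sum to 1 within a small tolerance.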

Perhaps the need to manually create an output directory should be in the user guide?

LilithHafner commented 2 years ago

Do you happen to have the theoretical asymptotic runtime of your algorithm worked out anywhere? Glancing over it, it looks like it could be O(sum(2^len(edge) for edge in edges)) ~ O(edge_count*2^edge_size), dominated by line 15 in the paper, which is implemented at line 210 of the code.
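
To make the conjectured bound concrete, the sketch below evaluates the sum of 2^len(edge) for a list of hyperedges and compares it with the looser edge_count * 2^max_edge_size form. It only computes the quantity named in the question; it does not reproduce the generator, and both function names are hypothetical.

```python
def subset_enumeration_cost(edges):
    """Sum of 2**len(edge) over all edges: the conjectured dominant cost term."""
    return sum(2 ** len(edge) for edge in edges)


def loose_upper_bound(edges):
    """edge_count * 2**max_edge_size, the simpler bound from the same conjecture."""
    if not edges:
        return 0
    return len(edges) * 2 ** max(len(edge) for edge in edges)


# Three toy hyperedges of sizes 2, 3 and 1.
toy_edges = [{1, 2}, {1, 2, 3}, {4}]
print(subset_enumeration_cost(toy_edges))  # 4 + 8 + 2 = 14
print(loose_upper_bound(toy_edges))        # 3 * 8 = 24
```

The exact sum is never larger than the loose bound, and the gap grows when a few edges are much larger than the rest.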

LilithHafner commented 2 years ago

By periodically checking the size of the output file, I estimate that the runtime is about 1 microsecond times the square of the number of edges generated. Perhaps that is the asymptotic runtime for small hyperedge size and large edge count?
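
Taken at face value, that rough fit can be turned into a quick estimator. The constant below is the 1 microsecond figure from the observation above, measured on a single machine; the function names and the example edge count are hypothetical.

```python
import math

# Empirical constant from the observation above (one machine, one dataset).
SECONDS_PER_EDGE_SQUARED = 1e-6


def estimated_seconds(num_edges):
    """Runtime predicted by the quadratic fit: c * n**2."""
    return SECONDS_PER_EDGE_SQUARED * num_edges ** 2


def edges_within_budget(seconds):
    """Inverse of the fit: how many edges fit in a given time budget."""
    return math.sqrt(seconds / SECONDS_PER_EDGE_SQUARED)


# A hypothetical run generating 100000 edges.
print(estimated_seconds(100000) / 3600.0)  # predicted hours
```

Under this model a hypothetical 100000-edge run would take about 10,000 seconds, a little under three hours, and doubling the edge count quadruples the runtime.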

LilithHafner commented 2 years ago

Any idea how long it's supposed to take to run python3 hyper_preferential_attachment.py --name=DAWN --file_name=DAWN --num_nodes=3029 --simplex_per_node_directory='simplex per node' --size_distribution_directory='size distribution' --output_directory=output_directory?

Update: it ran successfully in 7 hours (7:14:08) on my 2019 MacBook Air. If this is expected behavior, and executing examples.sh without manually creating an output_directory is supposed to error, then aside from some documentation issues (letting folks know about the runtime and output_directory), there don't seem to be any strict bugs in this issue thread.

If you'd like to help folks avoid a file-not-found error, I made a fairly non-invasive PR #5 that should help.