Closed f-hafner closed 3 weeks ago
moreover, to stay compatible with the deepwalk tool, it's better to write each "epoch" of walks to a separate file. so: iterate sequentially over the starting nodes, create one walk per node, write the results to a parquet file as they arrive, and create the additional walks for edge types. repeat this loop N_WALKS times.
to avoid memory issues, stream the results from the walk generator and write them to file as they arrive, instead of collecting all walks in memory first. the goal should be to create up to 100 walks per node.
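a rough sketch of the loop described above, in case it helps. it is not the actual implementation: `generate_walks`, `batched`, `write_epoch`, `run`, the adjacency-dict input, the file naming, and the values of `WALK_LENGTH` and `BATCH_SIZE` are all assumptions made for illustration. it uses plain uniform random walks and leaves out the additional walks for edge types. the parquet part assumes `pyarrow` is installed.

```python
import random
from itertools import islice
from pathlib import Path

N_WALKS = 100        # target number of walks per node (one per epoch)
WALK_LENGTH = 40     # hypothetical value, not from the issue
BATCH_SIZE = 10_000  # hypothetical number of walks buffered per write


def generate_walks(adjacency, start_nodes, walk_length, rng):
    """Lazily yield one uniform random walk per start node."""
    for start in start_nodes:
        walk = [start]
        while len(walk) < walk_length:
            neighbors = adjacency.get(walk[-1], [])
            if not neighbors:  # dead end: stop this walk early
                break
            walk.append(rng.choice(neighbors))
        yield walk


def batched(iterable, size):
    """Yield lists of at most `size` items without materializing the input."""
    it = iter(iterable)
    while batch := list(islice(it, size)):
        yield batch


def write_epoch(path, walks, batch_size=BATCH_SIZE):
    """Stream walks into a single parquet file, one batch at a time."""
    import pyarrow as pa
    import pyarrow.parquet as pq

    # assumes integer node ids; one row per walk
    schema = pa.schema([("walk", pa.list_(pa.int64()))])
    with pq.ParquetWriter(str(path), schema) as writer:
        for batch in batched(walks, batch_size):
            writer.write_table(pa.table({"walk": batch}, schema=schema))


def run(adjacency, out_dir, n_walks=N_WALKS, walk_length=WALK_LENGTH, seed=0):
    """Write one parquet file per epoch, each with one walk per node."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    rng = random.Random(seed)
    start_nodes = sorted(adjacency)  # sequential pass over the starting nodes
    for epoch in range(n_walks):
        walks = generate_walks(adjacency, start_nodes, walk_length, rng)
        write_epoch(out_dir / f"walks_epoch_{epoch:03d}.parquet", walks)
```

with this layout, memory use is bounded by `BATCH_SIZE` walks rather than by the number of nodes, and each epoch file can be handed to downstream tooling on its own.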