Closed rachithaiyappa closed 1 year ago
There is no way to get around this error. Since the network is extremely sparse, it is impossible to remove the specified fraction of edges while keeping the network connected.
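To make the sparsity argument concrete: a connected network with n nodes needs at least n - 1 edges (a spanning tree), so at most m - (n - 1) of its m edges can be removed without disconnecting it. Here is a minimal sketch of that bound. The function names are made up for illustration and are not taken from LinkPredictionDataset.py, and the numbers in the usage lines are toy values, not measurements from our networks.

```python
def max_removable_fraction(n_nodes: int, n_edges: int) -> float:
    """Upper bound on the fraction of edges that can be removed from a
    connected, simple network while it stays connected.

    A connected network on n_nodes nodes must keep at least n_nodes - 1
    edges (a spanning tree), so at most n_edges - (n_nodes - 1) can go.
    """
    if n_nodes < 1 or n_edges < n_nodes - 1:
        raise ValueError("a connected network needs at least n_nodes - 1 edges")
    if n_edges == 0:
        return 0.0  # single isolated node: nothing to remove
    return (n_edges - (n_nodes - 1)) / n_edges


def is_fraction_feasible(n_nodes: int, n_edges: int, fraction: float) -> bool:
    """True if the requested fraction does not exceed the spanning-tree bound."""
    return fraction <= max_removable_fraction(n_nodes, n_edges)


# Toy example: a sparse network cannot lose half of its edges,
# while a denser one with the same number of nodes can.
print(is_fraction_feasible(100, 120, 0.5))
print(is_fraction_feasible(100, 400, 0.5))
```

This is only a necessary condition on the edge count. Whether a particular removal procedure actually reaches the bound depends on which edges it picks.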
Did you see the same error for other networks? You can keep Snakemake going even if errors arise by setting the -k option. Let's run the script and find out how many networks raise the error.
I didn't check for many networks. I'm currently running it with the -k option. We'll see how prominent this is once it is done.
Snakemake done. It ran for 93 out of the 98 networks (generated stuff inside the datasets, embedding, link-prediction, and results/auc-roc directories). However, result_auc_roc.csv was not updated inside results/auc-roc. Not sure why. Need to check.
I guess snakemake probably didn't execute the plot_aucroc rule because it is separate from the rest, and because the rule all failed for 5 out of the 98 networks due to the fraction parameter issue.
So quick! Thank you so much! I don't know why, but we can force it to be updated by removing result_auc_roc.csv
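A minimal sketch of the removal step, assuming the file sits at results/auc-roc/result_auc_roc.csv relative to the workflow directory (the helper name is hypothetical). Snakemake schedules a rule again when its output file is missing and something requests it, so deleting the CSV before the next run is one way to get it regenerated.

```python
from pathlib import Path


def remove_stale_output(path) -> bool:
    """Delete a stale output file so the workflow has to rebuild it.

    Returns True if a file was removed, False if there was nothing to remove.
    """
    target = Path(path)
    if target.is_file():
        target.unlink()
        return True
    return False


# Usage (path is an assumption about where the workflow is run from):
# remove_stale_output("results/auc-roc/result_auc_roc.csv")
```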
Oh, yeah. That's right, since not all rules were successfully executed. I think 93/98 networks is fairly good. Let's remove the 5 networks and go with a fraction of 0.5.
Okay. The figures are ready.
They are under derived/results in the shared data directory. Just skimmed through them, and it seems like preferential attachment link prediction with biased sampling always underperforms uniform sampling :)
I will work on summarising them later. I have some ideas for that but do feel free to suggest what kind of summary plot you'd like to see.
Cool! I'd like to take a look at it. It seems you need to chmod the files to grant access?
Sorry. You should be able to see it now
Closing this issue. We chose to ignore the few networks where the fraction parameter issue arises.
Having added a bunch of new networks in #3, I was testing the rest of the pipeline.
When processing the file which currently resides under data/derived/networks/raw/power/edge_table.csv, the rule generate_link_prediction_dataset throws an error.
I haven't looked into LinkPredictionDataset.py much, but any idea? Decreasing the fraction parameter is possible, but is that what we want to do?