Closed fcc357 closed 3 years ago
Hi,
We downsample the dataset linearly. Basically, for each fact, there’s a probability p of dropping it from the KB.
Please let me know if you have any questions.
Thanks, Haitian
On May 3, 2020, at 12:02 AM, fcc357 notifications@github.com wrote:
Hello, Dr. Sun. I would like to ask how the dataset is downsampled. I want to sample the KB tuples down to 10%, 30%, 50%, 70%, and 90% to simulate an incomplete KB, but I don't know how to do the downsampling. We look forward to hearing from you.
Thank you for your answer. Could you please send me the downsampling script? Thanks
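Sorry, we don’t have it right now.
You can do:
    import random

    with open(in_filename) as f_in, open(out_filename, "w") as f_out:
        for line in f_in:
            # As written, each fact is kept with probability p (dropped with probability 1 - p).
            if random.random() < p:
                f_out.write(line)
Hope this helps.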
Thanks, Haitian
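For the 10%, 30%, 50%, 70%, and 90% settings asked about above, the snippet could be wrapped roughly as follows. This is only a sketch, not the script used for the paper: the function names, the `seed` argument, and the output file naming are all hypothetical, and it assumes the KB file holds one fact per line.

```python
import random


def downsample_kb(in_filename, out_filename, keep_prob, seed=0):
    """Copy each line (fact) to out_filename with probability keep_prob.

    Returns (kept, total) so the realized rate can be checked.
    """
    rng = random.Random(seed)  # private RNG so a run can be reproduced
    kept = total = 0
    with open(in_filename, encoding="utf-8") as f_in, \
         open(out_filename, "w", encoding="utf-8") as f_out:
        for line in f_in:
            total += 1
            # One random draw per fact, independent of the other facts
            if rng.random() < keep_prob:
                f_out.write(line)
                kept += 1
    return kept, total


def downsample_at_rates(in_filename, rates=(0.1, 0.3, 0.5, 0.7, 0.9), seed=0):
    """Write one downsampled copy per rate; output names are hypothetical."""
    results = {}
    for rate in rates:
        out_filename = f"{in_filename}.{int(round(rate * 100))}pct"
        results[rate] = downsample_kb(in_filename, out_filename, rate, seed)
    return results
```

Because every rate reuses the same seed and makes exactly one draw per line, a fact kept at 10% is also kept at 30%, 50%, and so on, so the sampled KBs are nested. The realized fraction only approximates the requested rate, since each fact is sampled independently.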
OK, thank you for your help.
Hello, Dr. Sun. I would like to ask how to get the other embedding files and txt files. I have already generated the webqsp_subgraphs.json file, and it can be split into test.json, dev.json and train.json, but I don't know how to get the other embedding files and txt files. We look forward to hearing from you.
Hi,
The _emb_100d files are generated from GloVe 100d embeddings. The _kge_100d files are pretrained TransE graph embeddings. These are helpful for rare entities. I don’t think they are useful for all entities, because the GCN layers end up producing contextualized embeddings that are sufficient for prediction.
Thanks, Haitian
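One plausible way to build an _emb_100d-style matrix from GloVe is sketched below. It is not the script used for the released files: the function names are hypothetical, and both averaging the word vectors of a multi-word name and using a zero vector for names with no known word are assumptions. The only format relied on is the standard GloVe text layout, one token per line followed by its space-separated values.

```python
def load_glove(glove_path):
    """Read a GloVe text file into a dict mapping token -> list of floats."""
    vectors = {}
    with open(glove_path, encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip().split(" ")
            if len(parts) < 2:
                continue  # skip blank or malformed lines
            vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors


def build_embedding_matrix(names, vectors, dim=100):
    """Return one row per name, in the same order as `names`.

    Assumption: a name's vector is the mean of its known word vectors;
    names with no known word get an all-zero row.
    """
    matrix = []
    for name in names:
        # Lowercasing assumes an uncased GloVe file
        tokens = name.lower().split()
        found = [vectors[t] for t in tokens if t in vectors]
        if found:
            row = [sum(col) / len(found) for col in zip(*found)]
        else:
            row = [0.0] * dim
        matrix.append(row)
    return matrix
```

The rows follow the order of `names`, so the list would need to be read in the same order as the corresponding txt file (for example, one entity, relation, or word per line). The resulting list of lists could then be converted to an array and saved with `numpy.save`.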