Open VaghehDashti opened 11 months ago
Hi @hosseinfani, Here's an update regarding our first temporal negative sampling strategy. As a reminder: in this first method, when training the model on year [t], we use the unigram distribution of year [t - k] to generate negative samples.
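For reference, here is a minimal sketch of what this first strategy does, assuming a hypothetical `member_ids_by_year` dict mapping each year to the list of expert ids appearing in that year's teams (the function name and data layout are illustrative, not the actual repo code):

```python
import numpy as np

def unigram_negative_sampler(member_ids_by_year, t, k, n_samples, rng=None):
    """Draw negative samples for training year t from the unigram
    (frequency) distribution of expert occurrences in year t - k."""
    rng = rng or np.random.default_rng(0)
    past = member_ids_by_year.get(t - k, [])
    if not past:  # no data for year t - k: skip negative sampling
        return np.array([], dtype=int)
    experts, counts = np.unique(past, return_counts=True)
    probs = counts / counts.sum()  # unigram distribution of year t - k
    return rng.choice(experts, size=n_samples, p=probs)

# toy data: expert ids appearing in teams, per year
data = {2019: [1, 1, 2, 3], 2020: [3, 4, 4, 5]}
negs = unigram_negative_sampler(data, t=2020, k=1, n_samples=5)
```

With k=1 and t=2020, all negatives are drawn from 2019's experts {1, 2, 3}, weighted by how often each appeared that year.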
The first results are for k=1 and, unfortunately, they are not good: performance is much lower on all metrics. I will run k=5 and k=10 to see whether it gets better or worse. Also, the github results are not complete due to some minor issues; I will update them asap. dblp:
imdb:
uspt:
gith:
Hi @hosseinfani, Here are the final results of the experiments on the first temporal negative sampling strategy. As can be seen, unfortunately, the first temporal ns strategy decreases performance for all tested values of k [1, 5, 10]. I have implemented the second temporal ns strategy, where instead of using only year [t-k] to compute the unigram distribution, we use data from all years up to year [t-k]. I will run the experiments for k=[1, 5, 10] and update here when they are ready.
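A sketch of the second (cumulative) variant, under the same hypothetical `member_ids_by_year` layout as before; only the year filter changes, pooling every year up to t - k into one unigram distribution:

```python
import numpy as np

def cumulative_unigram_sampler(member_ids_by_year, t, k, n_samples, rng=None):
    """Second strategy: negatives from the unigram distribution over
    ALL years up to and including year t - k (not just year t - k)."""
    rng = rng or np.random.default_rng(0)
    past = [m for y, ms in member_ids_by_year.items() if y <= t - k for m in ms]
    if not past:  # no history available: skip negative sampling
        return np.array([], dtype=int)
    experts, counts = np.unique(past, return_counts=True)
    return rng.choice(experts, size=n_samples, p=counts / counts.sum())

# toy data: with t=2020, k=1, the pool is the union of 2018 and 2019
data = {2018: [1, 1, 2], 2019: [2, 3], 2020: [4, 5]}
negs = cumulative_unigram_sampler(data, t=2020, k=1, n_samples=6)
```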
dblp:
imdb:
uspt:
gith:
@VaghehDashti I think there are issues in the implementation. k=10 means that you select an expert from 10 years ago, which should definitely be a negative sample.
Also, if no single past year helps, I am not sure the union of all past years will help.
Please create a toy example: put two disjoint sets of experts in (a) 2010 and (b) 2020 and 2021. Then train the model on 2020, picking negative samples from 2010, and predict for 2021 -- or something like this that shows the code is fine or reveals the bug in our logic.
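The toy setup suggested above could start like this (a sanity-check sketch with made-up expert ids, not the repo's actual pipeline): two disjoint expert sets, negatives for the 2020 training year drawn from 2010, with an assertion that every sampled negative really is a true negative for 2020's teams.

```python
import numpy as np

rng = np.random.default_rng(0)
experts_2010 = {0, 1, 2, 3}      # experts active only in 2010
experts_2020 = {10, 11, 12, 13}  # experts active only in 2020 and 2021

data = {2010: sorted(experts_2010), 2020: sorted(experts_2020)}

# negatives for training year 2020, drawn uniformly from 2010's experts
past = np.array(data[2010])
negs = rng.choice(past, size=20)

# every sampled negative must be a true negative for 2020's teams
assert set(negs.tolist()).isdisjoint(experts_2020)
# next steps: train on 2020 with these negatives, predict for 2021,
# and check that 2020's experts are ranked above 2010's.
```

If training with these guaranteed-true negatives still hurts performance, the bug is more likely in our reasoning than in the sampling code.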
This issue will track various temporal negative sampling strategies. The first temporal negative sampling strategy that I will be working on is as follows: when training the neural models on year t, we use the unigram distribution of year t - k to generate negative samples. Also, for the first k years, the model will not use negative sampling during training.