LCS2-IIITD / SANDS


about 'Semi-supervised Stance Detection of Tweets Via Distant Network Supervision' #1

Open gzz0312 opened 2 years ago

gzz0312 commented 2 years ago

Hello, I am very interested in your article titled "Semi-supervised Stance Detection of Tweets Via Distant Network Supervision" published at WSDM 2022. I read your code and there is something I don't understand; I hope you can help me resolve it. In lines 119, 120, and 121 of your run_model.py file, mask is a matrix of all zeros, so loss_a and loss_b are also all zeros. In this case, these two losses are useless. I look forward to your reply.

Subha0009 commented 2 years ago

Hi. Thanks a lot for showing interest in our work. There are four loss components: loss_a and loss_b (these are the semi-supervision losses) and loss_c and loss_d (these are the supervised losses). The matrix 'mask' contains the adjacent indices according to followership. It is highly sparse, but not all zero. 'mask' is simply computed from the edge-weight matrix EW (loaded from disk at lines 41-42) by selecting the necessary slices and converting them to float32 tensors. Hope this explains the issue. Please do let me know if you need any more clarification :)
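To make the role of 'mask' concrete for other readers, here is a hypothetical sketch (not the actual run_model.py code) of how a float32 mask typically gates per-example loss terms; loss_a, loss_b, loss_c, and loss_d are just the names used in this thread, and the reductions are illustrative:

```python
import tensorflow as tf

def total_loss(per_example_semi_a, per_example_semi_b,
               per_example_sup_c, per_example_sup_d, mask):
    """mask: float32 tensor of 0./1. flags. Zero entries silence the
    semi-supervised terms, which is why an all-zero mask makes
    loss_a and loss_b vanish while the supervised terms survive."""
    loss_a = tf.reduce_sum(per_example_semi_a * mask)
    loss_b = tf.reduce_sum(per_example_semi_b * mask)
    loss_c = tf.reduce_mean(per_example_sup_c)  # supervised terms are not masked
    loss_d = tf.reduce_mean(per_example_sup_d)
    return loss_a + loss_b + loss_c + loss_d
```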

gzz0312 commented 2 years ago

Hello, thanks for the reply. The elements of EW_batch in this line of code are zeros and ones. When executing tf.greater_equal(tf.reduce_max(EW_batch, axis=-1), model_params['min_degree']) with model_params['min_degree'] set to 15 or 20, the mask can only be all zeros. Following the idea of your WSDM 2022 article "Semi-supervised Stance Detection of Tweets Via Distant Network Supervision", should the above code be changed to tf.greater_equal(tf.reduce_sum(EW_batch, axis=-1), model_params['min_degree']), i.e. max changed to sum? However, the results after this change are very poor. I would be very grateful if you could clarify my confusion. The screenshot below shows the mask values I printed when running the code you provided.
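A quick standalone check of the arithmetic behind this report (EW_batch and min_degree are the names from the quoted line; the matrix here is made up):

```python
import tensorflow as tf

# A binary adjacency slice: each row has 0/1 entries marking followed accounts.
EW_batch = tf.constant([[1., 0., 1., 1.] + [1.] * 16,   # 19 ones in this row
                        [0., 1., 0., 0.] + [0.] * 16])  # 1 one in this row

min_degree = 15.0  # the threshold reported in the thread, as a float for comparison

# With reduce_max, the per-row maximum of a 0/1 matrix is at most 1,
# so the comparison against 15 is always False and the mask is all zeros.
mask_max = tf.greater_equal(tf.reduce_max(EW_batch, axis=-1), min_degree)

# With reduce_sum, the per-row sum is the node's degree (neighbor count),
# so rows with at least 15 neighbors pass the threshold.
mask_sum = tf.greater_equal(tf.reduce_sum(EW_batch, axis=-1), min_degree)

print(mask_max.numpy())  # [False False]
print(mask_sum.numpy())  # [ True False]
```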

Subha0009 commented 2 years ago

Hi. Thanks again for pointing this out. I'm going through the main codebase to check the issue. Meanwhile, could you please elaborate on the poor effect of the said change?

gzz0312 commented 2 years ago

Hi, thanks for the reply. When I run the USA dataset with a split size of 500, the F1-score is only 0.37, which is much lower than the 0.49 reported in your article.

gzz0312 commented 1 year ago

Hi, I'm very interested in your work, but I don't understand the representation of the tweet data. May I ask how the encoded_tweet_list in your code is initialized? This question can also be expressed as: what does the vector representing a tweet mean?

Subha0009 commented 1 year ago

Hi. This encoded_tweet_list is simply a tensor representation of tweets, containing sequences of one-hot vectors representing words in the tweet. So if we have a vocab size of $V$, a single tweet will be represented as $$\tau_i = [v_j|j\in [0, \text{maxlen}), v_j\in [0, V)]$$ Then encoded_tweet_list is a sequence $[\tau_i|i\in [0, \text{batch size})]$
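A small, self-contained sketch of this kind of encoding (the vocabulary, maxlen, and tweets below are made up; the repository's actual preprocessing may differ):

```python
import tensorflow as tf

# Toy vocabulary: word -> integer index in [0, V)
vocab = {"<pad>": 0, "stance": 1, "detection": 2, "of": 3, "tweets": 4}
V, maxlen = len(vocab), 6

def encode(tweet):
    # Map each word to its vocab index (unknown words -> 0) and pad to maxlen.
    ids = [vocab.get(w, 0) for w in tweet.lower().split()][:maxlen]
    return ids + [0] * (maxlen - len(ids))

tweets = ["Stance detection of tweets", "Detection of stance"]
encoded_tweet_list = tf.constant([encode(t) for t in tweets], dtype=tf.int32)
print(encoded_tweet_list.shape)  # (batch_size, maxlen) -> (2, 6)
```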

gzz0312 commented 1 year ago

Hi, according to your statement, each word of a tweet is represented by a one-hot vector, so a tweet would be a matrix; but in the data in your code, a tweet is a vector.

Subha0009 commented 1 year ago

A tweet is a vector here because each token is represented as an integer index (its position in the vocabulary) rather than a sparse one-hot vector.
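In other words (illustrative only): an integer-index encoding keeps a tweet as a 1-D vector of shape [maxlen], and expanding it with tf.one_hot is what would turn it into a [maxlen, V] matrix:

```python
import tensorflow as tf

V, maxlen = 5, 6
tweet_as_indices = tf.constant([1, 2, 3, 4, 0, 0])        # shape [maxlen]    -> a vector
tweet_as_one_hot = tf.one_hot(tweet_as_indices, depth=V)  # shape [maxlen, V] -> a matrix

print(tweet_as_indices.shape)  # (6,)
print(tweet_as_one_hot.shape)  # (6, 5)
```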