hitxujian closed 5 years ago
Hi xujian, nice question!
For segment_ids, 0 stands for the query and 1 stands for the doc. After the doc tokens, however, input_ids is padded with 0, and for those padding tokens a segment id of 0 or 1 carries no meaning. So I think it makes no difference whether segment_ids uses 0 or 1 for the positions after the doc tokens, since they are all padding.
I ran one experiment to check this:
In run_squad.py line 310, with segment_ids.append(0), the logits output at modeling.py line 458 is as follows:
tensor([[[-0.0041, 0.0068],
[-0.0047, 0.0048],
[-0.0057, 0.0045],
...,
[-0.0048, 0.0031],
[-0.0036, 0.0056],
[-0.0039, 0.0028]],
[[-0.0042, 0.0058],
[-0.0056, 0.0084],
[-0.0066, 0.0066],
...,
[-0.0061, 0.0036],
[-0.0064, 0.0031],
[-0.0075, 0.0061]],
[[-0.0040, 0.0050],
[-0.0052, 0.0052],
[-0.0049, 0.0027],
...,
[-0.0049, 0.0010],
[-0.0003, 0.0049],
[-0.0062, 0.0016]]], grad_fn=<ThAddBackward>)
And if you change the code to segment_ids.append(1), we get the same output:
tensor([[[-0.0041, 0.0068],
[-0.0047, 0.0048],
[-0.0057, 0.0045],
...,
[-0.0048, 0.0031],
[-0.0036, 0.0056],
[-0.0039, 0.0028]],
[[-0.0042, 0.0058],
[-0.0056, 0.0084],
[-0.0066, 0.0066],
...,
[-0.0061, 0.0036],
[-0.0064, 0.0031],
[-0.0075, 0.0061]],
[[-0.0040, 0.0050],
[-0.0052, 0.0052],
[-0.0049, 0.0027],
...,
[-0.0049, 0.0010],
[-0.0003, 0.0049],
[-0.0062, 0.0016]]], grad_fn=<ThAddBackward>)
So it makes no difference whether segment_ids stores 0 or 1 at those positions, because the tokens they correspond to are padding.
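One way to see why this can happen is a toy masked self-attention, sketched below in plain Python. This is not the code in modeling.py: the embedding tables, the identity projections, and the helper names (embed, masked_attention) are all made up for illustration. It assumes that padded positions have input_mask set to 0 and that the mask is applied as a large negative bias on the attention scores.

```python
import math

# Hypothetical tiny embedding tables (values are arbitrary).
TOKEN_EMB = {0: [0.0, 0.0, 0.0], 7: [0.2, -0.1, 0.4], 9: [-0.3, 0.5, 0.1]}
SEGMENT_EMB = {0: [0.1, 0.0, -0.1], 1: [-0.2, 0.3, 0.2]}


def embed(input_ids, segment_ids):
    """Sum token and segment embeddings for each position."""
    return [
        [t + s for t, s in zip(TOKEN_EMB[tok], SEGMENT_EMB[seg])]
        for tok, seg in zip(input_ids, segment_ids)
    ]


def masked_attention(embeddings, input_mask):
    """Toy single-head self-attention with identity projections."""
    dim = len(embeddings[0])
    outputs = []
    for query in embeddings:
        # Scaled dot-product scores; positions with mask 0 get a large
        # negative bias so their softmax weight underflows to zero.
        scores = [
            sum(a * b for a, b in zip(query, key)) / math.sqrt(dim)
            + (1.0 - m) * -10000.0
            for key, m in zip(embeddings, input_mask)
        ]
        top = max(scores)
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        outputs.append([
            sum(w / total * value[d] for w, value in zip(weights, embeddings))
            for d in range(dim)
        ])
    return outputs


input_ids = [7, 9, 0, 0]   # two real tokens, then padding
input_mask = [1, 1, 0, 0]  # padding is masked out

out_pad0 = masked_attention(embed(input_ids, [0, 1, 0, 0]), input_mask)
out_pad1 = masked_attention(embed(input_ids, [0, 1, 1, 1]), input_mask)

# Compare only the rows that belong to real tokens.
real_positions_match = all(
    abs(a - b) < 1e-9
    for i in range(2)
    for a, b in zip(out_pad0[i], out_pad1[i])
)
print(real_positions_match)
```

Rows at the padded positions are not compared here, since they do not correspond to real tokens.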
Please check and let me know if there are any questions. Thanks.
Hi, thanks for sharing! One question about the segment_ids array:
while len(input_ids) < max_seq_length:
    input_ids.append(0)
    input_mask.append(0)
    segment_ids.append(0)
In the segment_ids array, 1 indicates a token from the passage and 0 indicates a token from the query. When padding, why is segment_ids filled with 0, which represents the query?
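For reference, the padding loop in the question can be wrapped in a small standalone function, as a rough sketch. The function name pad_features and the pad_segment_id parameter are hypothetical and not part of run_squad.py. The parameter only makes it easy to try both 0 and 1 for the padded positions, and the token ids in the example are made up.

```python
def pad_features(input_ids, segment_ids, max_seq_length, pad_segment_id=0):
    """Pad the three feature lists up to max_seq_length.

    pad_segment_id is a hypothetical knob for trying 0 or 1 on padding.
    """
    input_ids = list(input_ids)
    segment_ids = list(segment_ids)
    # Real tokens get mask 1; padding gets mask 0.
    input_mask = [1] * len(input_ids)
    while len(input_ids) < max_seq_length:
        input_ids.append(0)
        input_mask.append(0)
        segment_ids.append(pad_segment_id)
    return input_ids, input_mask, segment_ids


# Three query tokens (segment 0) followed by two passage tokens (segment 1).
ids, mask, seg = pad_features([101, 2054, 102, 3000, 102], [0, 0, 0, 1, 1], 8)
print(ids)
print(mask)
print(seg)
```

Calling it with pad_segment_id=1 changes only the trailing entries of segment_ids; input_ids and input_mask stay the same.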