Xiong-can opened this issue 10 months ago
Ruotian Luo: Yes, this does happen. Check how large your bad ending rate is. If you trained with this repo, the rate should not be especially large. The cause lies in the CIDEr metric. If you add something like a bad ending penalty reward, I think it should alleviate the problem.
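A minimal sketch of what such a penalty reward could look like (the helper name, word list, and penalty value are all hypothetical, not from this repo): subtract a constant from the per-caption CIDEr reward whenever a sampled caption ends in a function word.

```python
import numpy as np

# Hypothetical list of words that should not end a caption.
BAD_ENDING_WORDS = {"a", "an", "the", "with", "in", "on", "of", "and"}

def bad_ending_penalty(captions, penalty=1.0):
    """Return a per-caption penalty: `penalty` if the caption ends badly, else 0."""
    out = np.zeros(len(captions), dtype=np.float32)
    for i, cap in enumerate(captions):
        words = cap.strip().split()
        if words and words[-1] in BAD_ENDING_WORDS:
            out[i] = penalty
    return out

# Usage sketch: subtract from the CIDEr reward before computing the baseline,
# e.g. reward = cider_reward - bad_ending_penalty(caps_gen)
```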
Xiong-can: The bad ending rate is about 1/2. How should I add a bad ending penalty reward to alleviate this problem? Could I ask what the bad ending rate of your model is? Thanks!
Ruotian Luo: That doesn't sound right. The original SCST paper adds the eos token when computing CIDEr (the original CIDEr does not). When I tried it, omitting eos gave about 1/3 bad endings. Did you follow that way of computing CIDEr?
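A sketch of that idea, with a hypothetical helper name and token string: append an explicit end-of-sentence marker to every hypothesis and reference string before tokenizing and scoring, so a truncated caption loses the n-gram matches at the sentence boundary and is penalized by CIDEr.

```python
def append_eos(caption, eos_token="<eos>"):
    """Append an explicit end-of-sentence token so CIDEr n-grams see the ending."""
    caption = caption.strip()
    return caption + " " + eos_token if caption else eos_token

# Usage sketch, applied before cider.compute_score:
# caps_gen = [append_eos(c) for c in caps_gen]
# caps_gt = {k: [append_eos(r) for r in refs] for k, refs in caps_gt.items()}
```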
Xiong-can: My CIDEr is computed with the evaluation method from the M2 (Meshed-Memory Transformer) paper, and the reinforcement learning score should be computed with eos added. This is the main code of my reinforcement learning part:
# Decode sampled captions and replicate each ground-truth set once per beam
caps_gen = text_field.decode(out.view(-1, seq_len))
caps_gt = list(itertools.chain(*([c, ] * beam_size for c in data["text"])))
caps_gen, caps_gt = tokenizer_pool.map(evaluation.PTBTokenizer.tokenize, [caps_gen, caps_gt])
# Per-caption CIDEr scores used as rewards
reward = cider.compute_score(caps_gt, caps_gen)[1].astype(np.float32)
reward = torch.from_numpy(reward).to(device).view(detections.shape[0], beam_size)
# Mean reward over beams serves as the self-critical baseline
reward_baseline = torch.mean(reward, -1, keepdim=True)
loss = -torch.mean(log_prob, -1) * (reward - reward_baseline)
loss = loss.mean()
This is how the CIDEr score is computed:

def compute_cider(self):
    def counts2vec(cnts):
        """
        Function maps counts of ngram to vector of tfidf weights.
        The function returns vec, an array of dictionary that store mapping of n-gram and tf-idf weights.
        The n-th entry of array denotes length of n-grams.
        :param cnts:
        :return: vec (array of dict), norm (array of float), length (int)
        """
        vec = [defaultdict(float) for _ in range(self.n)]
        length = 0
        norm = [0.0 for _ in range(self.n)]
        for (ngram, term_freq) in cnts.items():
            # give ngram a document frequency of 1 if it doesn't appear in the reference corpus
            df = np.log(max(1.0, self.doc_frequency[ngram]))
            # ngram index
            n = len(ngram) - 1
            # tf (term_freq) * idf (precomputed idf) for n-grams
            vec[n][ngram] = float(term_freq) * (self.ref_len - df)
            # compute norm for the vector. the norm will be used for computing similarity
            norm[n] += pow(vec[n][ngram], 2)
            if n == 1:
                length += term_freq
        norm = [np.sqrt(n) for n in norm]
        return vec, norm, length

    def sim(vec_hyp, vec_ref, norm_hyp, norm_ref, length_hyp, length_ref):
        '''
        Compute the cosine similarity of two vectors.
        :param vec_hyp: array of dictionary for vector corresponding to hypothesis
        :param vec_ref: array of dictionary for vector corresponding to reference
        :param norm_hyp: array of float for vector corresponding to hypothesis
        :param norm_ref: array of float for vector corresponding to reference
        :param length_hyp: int containing length of hypothesis
        :param length_ref: int containing length of reference
        :return: array of score for each n-grams cosine similarity
        '''
        delta = float(length_hyp - length_ref)
        # measure cosine similarity
        val = np.array([0.0 for _ in range(self.n)])
        for n in range(self.n):
            # ngram
            for (ngram, count) in vec_hyp[n].items():
                # vrama91: added clipping
                val[n] += min(vec_hyp[n][ngram], vec_ref[n][ngram]) * vec_ref[n][ngram]
            if (norm_hyp[n] != 0) and (norm_ref[n] != 0):
                val[n] /= (norm_hyp[n] * norm_ref[n])
            assert (not math.isnan(val[n]))
            # vrama91: added a length based gaussian penalty
            val[n] *= np.e ** (-(delta ** 2) / (2 * self.sigma ** 2))
        return val

    scores = []
    for test, refs in zip(self.ctest, self.crefs):
        # compute vector for test captions
        vec, norm, length = counts2vec(test)
        # compute vector for ref captions
        score = np.array([0.0 for _ in range(self.n)])
        for ref in refs:
            vec_ref, norm_ref, length_ref = counts2vec(ref)
            score += sim(vec, vec_ref, norm, norm_ref, length, length_ref)
        # change by vrama91 - mean of ngram scores, instead of sum
        score_avg = np.mean(score)
        # divide by number of references
        score_avg /= len(refs)
        # multiply score by 10
        score_avg *= 10.0
        # append score of an image to the score list
        scores.append(score_avg)
    return scores

def compute_score(self):
    # compute cider score
    score = self.compute_cider()
    # debug
    # print score
    return np.mean(np.array(score)), np.array(score)
Ruotian Luo: The M2 version is problematic. My 1/3 result was obtained by running M2, and as I recall M2 does not add eos.
Xiong-can: Then how should I alleviate this problem? Should I add a bad ending penalty reward, or modify the CIDEr computation? Is CIDEr-D the variant that adds a penalty factor?
Ruotian Luo: 1. I haven't tried the penalty reward, but I think it would be interesting to try. 2. You can modify the CIDEr computation inside M2 to add eos (note that eos must be added both during preprocessing and when computing the score). 3. Alternatively, you can add the XE (cross-entropy) loss to balance it.
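One way to follow suggestion 3 could be a weighted sum of the cross-entropy loss and the self-critical loss; the function name and the `xe_weight` hyperparameter here are assumptions for illustration, not part of either repo.

```python
def mixed_loss(scst_loss, xe_loss, xe_weight=0.1):
    """Blend the self-critical loss with cross-entropy to stabilise caption endings.

    Works on plain floats or on scalar torch tensors alike; xe_weight is a
    tuning knob (assumption) controlling how much XE supervision is kept.
    """
    return (1.0 - xe_weight) * scst_loss + xe_weight * xe_loss
```

With `xe_weight=0` this reduces to the pure SCST loss, so it can be annealed during training.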
Original issue (Xiong-can): After reinforcement learning, the description will be incomplete, such as: a motorcycle parked in a parking lot with a ..