ymcui / Chinese-LLaMA-Alpaca-2

中文LLaMA-2 & Alpaca-2大模型二期项目 + 64K超长上下文模型 (Chinese LLaMA-2 & Alpaca-2 LLMs with 64K long context models)
Apache License 2.0
7.06k stars 578 forks

NTK-ROPE #456

Closed IT-five closed 9 months ago

IT-five commented 10 months ago

Checklist before submitting

Issue type

Model inference

Base model

Others

Operating system

Linux

Describe the problem in detail

In pred.py, the input is truncated once it exceeds the configured max_length the model supports. Given that, how does the `if` branch in `adaptive_ntk_forward()` ever get entered? It requires `seq_len > self.max_seq_len_cached`, and I don't see how that condition can hold after truncation. Could someone please explain this part?

def adaptive_ntk_forward(self, x, seq_len=None):
    if seq_len > self.max_seq_len_cached:
        if isinstance(self.alpha, (float, int)):
            # fixed alpha: rebuild the cos/sin cache for the longer sequence
            # (note: the snippet in the issue passed `self` twice here)
            self._set_cos_sin_cache(seq_len=seq_len, device=x.device, dtype=x.dtype)
        elif self.alpha == 'auto':
            # adaptive alpha: derive the NTK scaling factor from the current seq_len
            t = torch.arange(seq_len, device=x.device, dtype=torch.float32)
            t = t / self.scaling_factor
            dim = self.dim
            alpha = (seq_len / (self.max_position_embeddings / 2) - 1) * AUTO_COEFF
            base = self.base * alpha ** (dim / (dim - 2))
            ntk_inv_freq = 1.0 / (base ** (torch.arange(0, dim, 2).float().to(x.device) / dim))

            freqs = torch.einsum("i,j->ij", t, ntk_inv_freq)
            emb = torch.cat((freqs, freqs), dim=-1).to(x.device)
            cos_cached = emb.cos()
            sin_cached = emb.sin()
            return (
                cos_cached[:seq_len].to(dtype=x.dtype),
                sin_cached[:seq_len].to(dtype=x.dtype)
            )
    # seq_len fits in the cache: return the precomputed values
    return (
        self.cos_cached[:seq_len].to(dtype=x.dtype),
        self.sin_cached[:seq_len].to(dtype=x.dtype)
    )
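The `'auto'` branch above reduces to one scalar formula: `alpha` grows linearly with `seq_len`, and the RoPE base is scaled by `alpha ** (dim / (dim - 2))`, which stretches the rotary frequencies for longer inputs. A minimal sketch of just that formula, with illustrative values for `AUTO_COEFF`, `dim`, `base`, and `max_position_embeddings` (the real values come from the model config and the patch, not from here):

```python
# Scalar sketch of the "auto" NTK base scaling above; all constants are
# illustrative assumptions, not the project's actual configuration.
AUTO_COEFF = 1.0
dim = 128                        # per-head rotary dimension
base = 10000.0                   # standard RoPE base
max_position_embeddings = 4096   # native trained context

def ntk_base(seq_len):
    # alpha grows with seq_len, so longer inputs get a larger effective base
    alpha = (seq_len / (max_position_embeddings / 2) - 1) * AUTO_COEFF
    return base * alpha ** (dim / (dim - 2))

# The scaled base increases monotonically with seq_len
print(ntk_base(8192) < ntk_base(16384))  # -> True
```

Because the base is recomputed from the current `seq_len` on every call that exceeds the cache, no fixed scaling factor has to be chosen up front.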

if len(tokenized_prompt) > max_length:
    half = max_length // 2
    prompt = tokenizer.decode(tokenized_prompt[:half], skip_special_tokens=True) + tokenizer.decode(tokenized_prompt[-half:], skip_special_tokens=True)
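The pred.py snippet above truncates from the middle: it keeps the first and last `max_length // 2` tokens and drops what is in between. A self-contained sketch of the same idea, using plain token-id lists in place of the real tokenizer:

```python
# Middle-truncation sketch mirroring the pred.py snippet above; plain lists
# of token ids stand in for the tokenizer output for illustration.
def truncate_middle(tokenized_prompt, max_length):
    if len(tokenized_prompt) > max_length:
        half = max_length // 2
        # keep the head and the tail, drop the middle
        return tokenized_prompt[:half] + tokenized_prompt[-half:]
    return tokenized_prompt

tokens = list(range(10))
print(truncate_middle(tokens, 6))   # -> [0, 1, 2, 7, 8, 9]
print(truncate_middle(tokens, 20))  # -> unchanged, fits within max_length
```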

Dependencies (must be provided for code-related issues)

# Paste your dependency information here (inside this code block)

Run logs or screenshots

# Paste your run logs here (inside this code block)

iMountTai commented 9 months ago

In your previous issue #453, the "supported model length" in my answer actually includes the NTK-extended length, which is different from max_pos_embedding. Truncation is done against max_length, not against max_pos_embedding; you can take another look at the source code, or at LongBench's explanation of this part.
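Numerically, the answer comes down to there being two different limits: truncation uses the NTK-extended max_length, while the cos/sin cache is initially sized to the native context, so a truncated input can still exceed the cache and trigger the NTK branch. A sketch with illustrative numbers (not the project's actual config values):

```python
# Why the NTK branch still fires after truncation (values are illustrative)
max_position_embeddings = 4096   # native context the model was trained with
max_length = 16384               # NTK-extended limit used for truncation
max_seq_len_cached = 4096        # cos/sin cache initially covers the native context

seq_len = 12000                  # an input this long survives truncation...
assert seq_len <= max_length            # ...because it is under max_length,
assert seq_len > max_seq_len_cached     # yet still exceeds the cache, entering the NTK branch
print("NTK branch taken")
```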

github-actions[bot] commented 9 months ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your consideration.

github-actions[bot] commented 9 months ago

Closing the issue, since no updates observed. Feel free to re-open if you need any further assistance.