jzhang38 / TinyLlama

The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.
Apache License 2.0

Model structure #187

Closed daxian-lh closed 3 months ago

daxian-lh commented 4 months ago

In the model there is q = apply_rotary_emb_func(q, cos, sin, False, True), i.e. interleaved=False, inplace=True; but in fused_rotary_embedding, ApplyRotaryEmb.forward is:

def forward(ctx, x, cos, sin, interleaved=False, inplace=False):
    """
        x: (batch_size, seqlen, nheads, headdim)
        cos, sin: (seqlen, rotary_dim / 2)
        interleaved: if True, rotate pairs of even and odd dimensions (GPT-J style) instead
            of 1st half and 2nd half (GPT-NeoX style).
    rotary_dim must be <= headdim
    Apply rotary embedding to the first rotary_dim of x.
    """
    batch, seqlen, nheads, headdim = x.shape
    rotary_seqlen, rotary_dim = cos.shape
    rotary_dim *= 2
    assert rotary_dim <= headdim
    assert seqlen <= rotary_seqlen
    assert sin.shape == (rotary_seqlen, rotary_dim // 2)
    x_ro = x[..., :rotary_dim]
    x1, x2 = x_ro.chunk(2, dim=-1) if not interleaved else (x_ro[..., ::2], x_ro[..., 1::2])
    out = torch.empty_like(x) if not inplace else x
    out_ro = out[..., :rotary_dim]
    if inplace:
        o1, o2 = x1, x2
    else:
        o1, o2 = (
            out_ro.chunk(2, dim=-1)
            if not interleaved
            else (out_ro[..., ::2], out_ro[..., 1::2])
        )
    rotary_emb.apply_rotary(
        x1,
        x2,
        rearrange(cos[:seqlen], "s d -> s 1 d"),
        rearrange(sin[:seqlen], "s d -> s 1 d"),
        o1,
        o2,
        False,
    )
    if not inplace and rotary_dim < headdim:
        out[..., rotary_dim:].copy_(x[..., rotary_dim:])
    ctx.save_for_backward(cos, sin)
    ctx.interleaved = interleaved
    ctx.inplace = inplace

    return out if not inplace else x
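
For reference, a minimal pure-PyTorch sketch of the rotation that the fused rotary_emb.apply_rotary call performs on the two halves, assuming the standard convention o1 = x1*cos - x2*sin, o2 = x1*sin + x2*cos (apply_rotary_ref is a hypothetical name, not part of the repo):

import torch

def apply_rotary_ref(x1, x2, cos, sin):
    # x1, x2: (batch, seqlen, nheads, rotary_dim / 2)
    # cos, sin: (seqlen, 1, rotary_dim / 2), broadcast over batch and heads
    o1 = x1 * cos - x2 * sin
    o2 = x1 * sin + x2 * cos
    return o1, o2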

When inplace == True, the function returns x, but nothing in forward seems to modify x itself. Is there a problem here? Could someone explain this? Thanks a lot.
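
As a standalone sketch (not TinyLlama code): slicing and chunk return views of x, so an in-place write through x1/x2 (which is what o1, o2 = x1, x2 sets up in the inplace branch) does mutate x's storage. A quick check:

import torch

x = torch.arange(8.0).reshape(1, 1, 1, 8)  # (batch, seqlen, nheads, headdim)
x_ro = x[..., :8]                          # a view of x, not a copy
x1, x2 = x_ro.chunk(2, dim=-1)             # chunk also returns views into x's storage
x1.mul_(-1.0)                              # in-place write through the view
print(x)                                   # first half is negated: x itself changed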

jzhang38 commented 3 months ago

duplicate of https://github.com/jzhang38/TinyLlama/issues/188