outlines-dev / outlines

Structured Text Generation
https://outlines-dev.github.io/outlines/
Apache License 2.0

Tests failing locally #489

Open sidravi1 opened 6 months ago

sidravi1 commented 6 months ago

Describe the issue as clearly as possible:

I created a new environment as per this page and ran pytest to make sure it was all set up correctly. Two tests fail:

FAILED tests/generate/test_integration_transfomers.py::test_transformers_integration_text - RuntimeError: index 512 is out of bounds for dimension 1 with size 512
FAILED tests/generate/test_samplers.py::test_multinomial - assert False

Steps/code to reproduce the bug:

Set up the environment as per this page.

Then run:

pytest

Expected result:

No tests fail

Error message:

Test multinomial


    def test_multinomial():
        rng = torch.Generator()
        rng.manual_seed(239)

        logits = torch.tensor([[1.0, 4.0, 5.0]])
        next_token_ids = multinomial(logits, 1, rng)
        assert next_token_ids.equal(torch.tensor([[2]]))

        next_token_ids = multinomial(logits, 2, rng)
        assert next_token_ids.equal(torch.tensor([[2, 1]]))

        logits = torch.tensor([[10.0, 0.0, 9.0], [-math.inf, 4.0, 5.0]])
        next_token_ids = multinomial(logits, 1, rng)
>       assert next_token_ids.equal(torch.tensor([[0], [1]]))
E       assert False
E        +  where False = <built-in method equal of Tensor object at 0x28066b330>(tensor([[0],\n        [1]]))
E        +    where <built-in method equal of Tensor object at 0x28066b330> = tensor([[0],\n        [2]]).equal
E        +    and   tensor([[0],\n        [1]]) = <built-in method tensor of type object at 0x14cf71780>([[0], [1]])
E        +      where <built-in method tensor of type object at 0x14cf71780> = torch.tensor

tests/generate/test_samplers.py:37: AssertionError

Test Integration Transformers


    def test_transformers_integration_text():
        rng = torch.Generator()
        rng.manual_seed(10000)  # Choosen so <EOS> is generated

        model_name = "hf-internal-testing/tiny-random-GPTJForCausalLM"
        model = models.transformers(model_name, device="cpu")
>       sequence = generate.text(model)("Write a short sentence ", rng=rng)

tests/generate/test_integration_transfomers.py:72:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
outlines/generate/api.py:214: in __call__
    last_state = next(states)
outlines/generate/generator.py:83: in sequence_generator
    next_token_ids, kv_cache, logits, _ = token_generator(
../../../miniconda3/envs/outlines-dev/lib/python3.10/site-packages/torch/utils/_contextlib.py:115: in decorate_context
    return func(*args, **kwargs)
outlines/generate/generator.py:137: in generate
    logits, new_kv_cache = model(token_ids, attention_masks, kv_cache)
outlines/models/transformers.py:116: in __call__
    logits, kv_cache = self.forward(input_ids, attention_mask, past_key_values)
../../../miniconda3/envs/outlines-dev/lib/python3.10/site-packages/torch/utils/_contextlib.py:115: in decorate_context
    return func(*args, **kwargs)
outlines/models/transformers.py:99: in forward
    output = self.model(
../../../miniconda3/envs/outlines-dev/lib/python3.10/site-packages/torch/nn/modules/module.py:1518: in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
../../../miniconda3/envs/outlines-dev/lib/python3.10/site-packages/torch/nn/modules/module.py:1527: in _call_impl
    return forward_call(*args, **kwargs)
../../../miniconda3/envs/outlines-dev/lib/python3.10/site-packages/transformers/models/gptj/modeling_gptj.py:853: in forward
    transformer_outputs = self.transformer(
../../../miniconda3/envs/outlines-dev/lib/python3.10/site-packages/torch/nn/modules/module.py:1518: in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
../../../miniconda3/envs/outlines-dev/lib/python3.10/site-packages/torch/nn/modules/module.py:1527: in _call_impl
    return forward_call(*args, **kwargs)
../../../miniconda3/envs/outlines-dev/lib/python3.10/site-packages/transformers/models/gptj/modeling_gptj.py:679: in forward
    outputs = block(
../../../miniconda3/envs/outlines-dev/lib/python3.10/site-packages/torch/nn/modules/module.py:1518: in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
../../../miniconda3/envs/outlines-dev/lib/python3.10/site-packages/torch/nn/modules/module.py:1527: in _call_impl
    return forward_call(*args, **kwargs)
../../../miniconda3/envs/outlines-dev/lib/python3.10/site-packages/transformers/models/gptj/modeling_gptj.py:311: in forward
    attn_outputs = self.attn(
../../../miniconda3/envs/outlines-dev/lib/python3.10/site-packages/torch/nn/modules/module.py:1518: in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
../../../miniconda3/envs/outlines-dev/lib/python3.10/site-packages/torch/nn/modules/module.py:1527: in _call_impl
    return forward_call(*args, **kwargs)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = GPTJAttention(
  (attn_dropout): Dropout(p=0.0, inplace=False)
  (resid_dropout): Dropout(p=0.0, inplace=False)
  (k_p...Linear(in_features=32, out_features=32, bias=False)
  (out_proj): Linear(in_features=32, out_features=32, bias=False)
)
hidden_states = tensor([[[ 0.6664, -0.2256,  2.1714, -1.4327, -0.7008,  0.6363, -0.1212,
           0.5521,  2.3388,  0.6451,  1.2679,...         1.5293, -0.2082,  0.0675, -0.0417, -0.3105, -0.1146, -2.0392,
          -1.3698,  1.0400, -0.8760,  0.2437]]])
layer_past = (tensor([[[[ 6.7909e-02,  4.5009e-03,  3.3863e-02,  ...,  2.8171e-02,
            1.2616e-01,  5.0846e-02],
          ...31e-01],
          [-2.0773e-01, -7.4851e-02,  3.4538e-01,  ..., -7.9283e-02,
           -6.4052e-02, -2.8594e-01]]]]))
attention_mask = tensor([[[[-0., -0., -0., -0., -0., -0., -0., -0., -0., -0., -0., -0., -0., -0., -0., -0., -0., -0., -0., -0., -0., -0...-0., -0., -0., -0., -0., -0., -0., -0., -0., -0., -0., -0., -0., -0.,
           -0., -0., -0., -0., -0., -0., -0.]]]])
position_ids = tensor([[512]]), head_mask = None, use_cache = True, output_attentions = False

    def forward(
        self,
        hidden_states: torch.FloatTensor,
        layer_past: Optional[Tuple[torch.Tensor]] = None,
        attention_mask: Optional[torch.FloatTensor] = None,
        position_ids: Optional[torch.LongTensor] = None,
        head_mask: Optional[torch.FloatTensor] = None,
        use_cache: Optional[bool] = False,
        output_attentions: Optional[bool] = False,
    ) -> Union[
        Tuple[torch.Tensor, Tuple[torch.Tensor]],
        Optional[Tuple[torch.Tensor, Tuple[torch.Tensor], Tuple[torch.Tensor, ...]]],
    ]:
        query = self.q_proj(hidden_states)
        key = self.k_proj(hidden_states)
        value = self.v_proj(hidden_states)

        query = self._split_heads(query, self.num_attention_heads, self.head_dim, True)
        key = self._split_heads(key, self.num_attention_heads, self.head_dim, True)
        value = self._split_heads(value, self.num_attention_heads, self.head_dim, False)

        if is_torch_fx_proxy(position_ids) or torch.jit.is_tracing():
            # The logic to conditionally copy to GPU could not be traced, so we do this
            # every time in the torch.fx case
            embed_positions = get_embed_positions(self.embed_positions, position_ids)
        else:
            embed_positions = self._get_embed_positions(position_ids)

        repeated_position_ids = position_ids.unsqueeze(-1).repeat(1, 1, embed_positions.shape[-1])
>       sincos = torch.gather(embed_positions, 1, repeated_position_ids)
E       RuntimeError: index 512 is out of bounds for dimension 1 with size 512

../../../miniconda3/envs/outlines-dev/lib/python3.10/site-packages/transformers/models/gptj/modeling_gptj.py:223: RuntimeError

Outlines/Python version information:

❯ conda list outlin
# packages in environment at /Users/sid.ravinutala/miniconda3/envs/outlines-dev:
#
# Name                    Version                   Build  Channel
outlines                  0.1.dev413+g298a080          pypi_0    pypi
(outlines-dev)
>>> import sys; print("Python", sys.version)
Python 3.10.0 | packaged by conda-forge | (default, Nov 20 2021, 02:27:15) [Clang 11.1.0 ]

Context for the issue:

Hoping to set up my local environment so I can contribute to the project :)

brandonwillard commented 6 months ago

Can you provide all the results of conda/pip list?

sidravi1 commented 6 months ago

Sure!


Package                   Version             Editable project location
------------------------- ------------------- -------------------------------------------------
accelerate                0.25.0
aiohttp                   3.9.1
aiosignal                 1.3.1
annotated-types           0.6.0
anyio                     4.2.0
asttokens                 2.4.1
async-timeout             4.0.3
attrs                     23.1.0
beartype                  0.15.0
Brotli                    1.1.0
certifi                   2023.11.17
cffi                      1.16.0
cfgv                      3.3.1
chardet                   5.2.0
charset-normalizer        3.3.2
cloudpickle               3.0.0
colorama                  0.4.6
coverage                  7.4.0
datasets                  2.16.0
diff_cover                8.0.2
dill                      0.3.7
distlib                   0.3.8
distro                    1.9.0
exceptiongroup            1.2.0
filelock                  3.13.1
frozenlist                1.4.1
fsspec                    2023.10.0
h11                       0.14.0
httpcore                  1.0.2
httpx                     0.26.0
huggingface-hub           0.20.0
icontract                 2.6.6
identify                  2.5.33
idna                      3.6
importlib-metadata        7.0.1
importlib-resources       6.1.1
iniconfig                 2.0.0
interegular               0.3.2
Jinja2                    3.1.2
joblib                    1.3.2
jsonschema                4.20.0
jsonschema-specifications 2023.11.2
lark                      1.1.8
llvmlite                  0.41.1
MarkupSafe                2.1.3
mpmath                    1.3.0
msgpack                   1.0.4
multidict                 6.0.4
multiprocess              0.70.15
nest-asyncio              1.5.8
networkx                  3.2.1
nodeenv                   1.8.0
numba                     0.58.1
numpy                     1.26.2
openai                    1.6.1
outlines                  0.1.dev413+g298a080 /Users/sid.ravinutala/Documents/Projects/outlines
packaging                 23.2
pandas                    2.1.4
perscache                 0.6.1
pip                       23.3.2
pkgutil_resolve_name      1.3.10
platformdirs              4.1.0
pluggy                    1.3.0
pre-commit                3.6.0
psutil                    5.9.7
pyarrow                   14.0.2
pyarrow-hotfix            0.6
pycparser                 2.21
pydantic                  2.5.3
pydantic_core             2.14.6
Pygments                  2.17.2
PySocks                   1.7.1
pytest                    7.4.3
pytest-cov                4.1.0
python-dateutil           2.8.2
pytz                      2023.3.post1
PyYAML                    6.0.1
referencing               0.32.0
regex                     2023.12.25
requests                  2.31.0
responses                 0.24.1
rpds-py                   0.15.2
safetensors               0.3.3
SciPy                     1.11.4
setuptools                68.2.2
six                       1.16.0
sniffio                   1.3.0
sympy                     1.12
tiktoken                  0.5.2
tokenizers                0.15.0
tomli                     2.0.1
torch                     2.1.2
tqdm                      4.66.1
transformers              4.36.2
typing_extensions         4.9.0
tzdata                    2023.3
ukkonen                   1.0.1
urllib3                   2.1.0
virtualenv                20.25.0
wheel                     0.42.0
xxhash                    3.4.1
yarl                      1.9.3
zipp                      3.17.0
(outlines-dev)

rlouf commented 6 months ago

We managed to reproduce the error internally and will hopefully soon come up with a fix.

lapp0 commented 5 months ago

Immediately prior to the test's failure, in the call

        output = self.model(
            input_ids,
            attention_mask=attention_mask,
            return_dict=True,
            output_attentions=False,
            output_hidden_states=False,
            past_key_values=past_key_values,
        )

past_key_values is a tuple of 5 elements, each a tensor of shape torch.Size([1, 4, 512, 8]).

The dimension of size 512 grows by 1 on each call.

The model is limited to 512 tokens: https://huggingface.co/hf-internal-testing/tiny-random-GPTJForCausalLM/blob/main/config.json#L23

It's probably failing locally but passing in CI because random number generation is not guaranteed to be reproducible across platforms: https://pytorch.org/docs/stable/notes/randomness.html#reproducibility

The token count exceeding n_positions and ungracefully raising a torch RuntimeError is also a problem that should be addressed.
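
One way to fail more gracefully would be to check the sequence length against the model's position limit before each forward pass. The following is a minimal sketch, not Outlines or transformers code: `check_context_length` and its parameters are hypothetical names, and 512 is the limit from the model config linked above. Position indices are 0-based, so a limit of 512 allows positions 0 through 511, which makes the `position_ids = tensor([[512]])` seen in the traceback the first invalid index.

```python
def check_context_length(
    num_cached_tokens: int, num_new_tokens: int, max_positions: int
) -> None:
    """Raise a descriptive error before the model indexes past its position table.

    Hypothetical helper for illustration; not part of Outlines or transformers.
    """
    total = num_cached_tokens + num_new_tokens
    if total > max_positions:
        raise ValueError(
            f"Sequence length {total} exceeds the model's limit of "
            f"{max_positions} positions."
        )


# 511 cached tokens plus 1 new token uses positions 0..511: still valid.
check_context_length(511, 1, max_positions=512)

# 512 cached tokens plus 1 new token would need position 512: rejected.
try:
    check_context_length(512, 1, max_positions=512)
except ValueError as err:
    print(err)
```

The second call mirrors the failing state described above, where the cache already holds 512 entries and one more token is requested.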

lapp0 commented 1 month ago

This is caused by RNGs being inconsistent across machines and the use of a randomly initialized model (https://huggingface.co/hf-internal-testing/tiny-random-GPTJForCausalLM). The eos token is never generated and we exceed the 512 token limit for the model.

I ran into this issue again when working on https://github.com/outlines-dev/outlines/pull/966 and I'll include a fix in that PR. We must ensure that either the greedy sampler is used, or max_tokens is specified in tests because we cannot rely on consistent RNGs.
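
The two safeguards can be illustrated with a toy generation loop. This is a sketch under stated assumptions, not the fix from the PR: `greedy`, `generate`, `never_eos`, and `EOS_ID` are hypothetical names, and plain Python lists stand in for tensors. Greedy selection takes the argmax, so no RNG is involved, and `max_tokens` bounds the loop even if the EOS token never appears.

```python
from typing import Callable, List

EOS_ID = 0  # hypothetical end-of-sequence token id


def greedy(logits: List[float]) -> int:
    """Return the index of the largest logit; no RNG is involved."""
    return max(range(len(logits)), key=lambda i: logits[i])


def generate(
    next_logits: Callable[[List[int]], List[float]],
    prompt: List[int],
    max_tokens: int,
) -> List[int]:
    """Generate until EOS or until max_tokens new tokens have been produced."""
    tokens = list(prompt)
    for _ in range(max_tokens):
        token = greedy(next_logits(tokens))
        tokens.append(token)
        if token == EOS_ID:
            break
    return tokens


def never_eos(tokens: List[int]) -> List[float]:
    """Stand-in model that always prefers token 2, so EOS is never emitted."""
    return [-1.0, 0.5, 3.0]


output = generate(never_eos, prompt=[1], max_tokens=5)
print(output)  # [1, 2, 2, 2, 2, 2]
```

Even with a stand-in model that never produces EOS, the loop stops after five new tokens, so a test built this way cannot run past the model's position limit regardless of the machine it runs on.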