mikeizbicki / cmc-csci181-languages


Setting seed does not guarantee determinism #17

Open finnless opened 2 weeks ago

finnless commented 2 weeks ago

In the examples below I'm returning chat_completion.choices[0].message.content and chat_completion.system_fingerprint on two different models. From the Groq documentation:

If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result. Determinism is not guaranteed, and you should refer to the system_fingerprint response parameter to monitor changes in the backend.
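For context, the wrapper behind the examples looks roughly like this. This is only a sketch using the standard groq Python client; the system prompt here is a placeholder, not the exact prompt from the assignment:

    from groq import Groq

    client = Groq()  # reads GROQ_API_KEY from the environment by default

    def extract_keywords(text, seed=None, model='llama3-8b-8192'):
        # Placeholder system prompt for illustration only.
        chat_completion = client.chat.completions.create(
            model=model,
            seed=seed,
            messages=[
                {'role': 'system', 'content': 'Respond with only the search keywords for the query.'},
                {'role': 'user', 'content': text},
            ],
        )
        return chat_completion.choices[0].message.content, chat_completion.system_fingerprint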

First, setting a seed does not guarantee determinism. This is a problem for the doctests, because they won't always all pass. Should we just rerun the tests until we get lucky? We could remove them, or we could try some other kind of test that doesn't rely on determinism.
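One alternative would be a looser property-style test instead of an exact-match doctest, something like this sketch (the specific assertions are only illustrative, based on the outputs below):

    def test_extract_keywords_loose():
        keywords, fingerprint = extract_keywords(
            'Who is the current democratic presidential nominee?', seed=0)
        # Assert properties that should hold across runs, rather than an exact string.
        assert 'presidential' in keywords.lower()
        assert 'nominee' in keywords.lower() or 'nomination' in keywords.lower()
        assert fingerprint.startswith('fp_')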

Second, I found it interesting how these two models seem to exhibit different degrees of determinism. Any theories as to why this might be happening? Just out of curiosity.

llama3-8b-8192

>>> extract_keywords('Who is the current democratic presidential nominee?', seed=0)
('Democratic presidential nominee current nominee', 'fp_179b0f92c9')
>>> extract_keywords('Who is the current democratic presidential nominee?', seed=0)
('Democratic presidential nominee candidates 2024', 'fp_af05557ca2')
>>> extract_keywords('Who is the current democratic presidential nominee?', seed=0)
('Democratic presidential nominee current nominee', 'fp_179b0f92c9')
>>> extract_keywords('Who is the current democratic presidential nominee?', seed=0)
('Democratic presidential nominee candidates 2024', 'fp_6a6771ae9c')
>>> extract_keywords('Who is the current democratic presidential nominee?', seed=0)
('Democratic presidential nominee candidates 2024', 'fp_873a560973')
>>> extract_keywords('Who is the current democratic presidential nominee?', seed=0)
('Democratic presidential nominee current nominee', 'fp_179b0f92c9')
>>> extract_keywords('Who is the current democratic presidential nominee?', seed=0)
('Democratic presidential nominee candidates 2024', 'fp_af05557ca2')

llama-3.1-8b-instant

>>> extract_keywords('Who is the current democratic presidential nominee?', seed=0)
('democratic presidential nomination 2024 current candidate', 'fp_f66ccb39ec')
>>> extract_keywords('Who is the current democratic presidential nominee?', seed=0)
('democratic presidential nomination 2024 current candidate', 'fp_f66ccb39ec')
>>> extract_keywords('Who is the current democratic presidential nominee?', seed=0)
('democratic presidential nomination 2024 current candidate', 'fp_f66ccb39ec')
>>> extract_keywords('Who is the current democratic presidential nominee?', seed=0)
('democratic presidential nomination 2024 current candidate', 'fp_9cb648b966')
>>> extract_keywords('Who is the current democratic presidential nominee?', seed=0)
('democratic presidential nomination 2024 current candidate', 'fp_f66ccb39ec')
>>> extract_keywords('Who is the current democratic presidential nominee?', seed=0)
('democratic presidential nomination 2024 current candidate', 'fp_f66ccb39ec')
>>> extract_keywords('Who is the current democratic presidential nominee?', seed=0)
('democratic presidential nomination 2024 current candidate', 'fp_9cb648b966')
mikeizbicki commented 1 week ago

I don't know the details of the system_fingerprint, but I'm very curious. So I'm going to put a bounty of $10^5$ points on this question. If anyone can find the details, post them here to claim the bounty.

ains-arch commented 1 week ago

This is all that's in the documentation:

    system_fingerprint: Optional[str] = None
    """This fingerprint represents the backend configuration that the model runs with.

    Can be used in conjunction with the `seed` request parameter to understand when
    backend changes have been made that might impact determinism.
    """

So I asked via the chat with us thing on the website. Me:

I'm curious about the system_fingerprint, could you tell me more about that? I think I understand that it's a representation of the backend configuration and it's related to understanding the determinism of the chat completions. But how is it generated, and what does it mean when it changes? In what situations is it helpful info to a developer, or helpful to the engineers building Groq in understanding behavior? Would appreciate any further info on this, or a link to somewhere that explains more.

Them:

Correct. The system_fingerprint is the unique identifier of a specific instance of an LLM. It is used largely on our side for handling things like authentication, tracking usage, & enforcing rate limits. I don't believe a developer would need to do anything or be aware of anything in particular related to the system_fingerprint however