vgel / repeng

A library for making RepE control vectors
https://vgel.me/posts/representation-engineering/
MIT License

The MPS Backend is Not Working Properly #35

Open AtakanTekparmak opened 2 months ago

AtakanTekparmak commented 2 months ago

Training a ControlVector on the MPS device throws the following error, because torch.autocast() does not support MPS:

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
Cell In[12], line 8
      1 happy_dataset = make_dataset(
      2     "Act as if you're extremely {persona}.",
      3     ["happy", "joyous"],
      4     ["sad", "depressed"],
      5     truncated_output_suffixes,
      6 )
      7 model.reset()
----> 8 happy_vector = ControlVector.train(model, tokenizer, happy_dataset)

File .../repeng/extract.py:51, in ControlVector.train(cls, model, tokenizer, dataset, **kwargs)
     27 @classmethod
     28 def train(
     29     cls,
   (...)
     33     **kwargs,
     34 ) -> "ControlVector":
     35     """
     36     Train a ControlVector for a given model and tokenizer using the provided dataset.
     37
   (...)
     49         ControlVector: The trained vector.
     50     """
...
    247     and torch.cuda.amp.common.amp_definitely_not_available()
    248     and self.device == "cuda"
    249 (.../lib/python3.11/site-packages/torch/amp/autocast_mode.py:249) ):

RuntimeError: User specified an unsupported autocast device_type 'mps'

vgel commented 2 months ago

Which notebook are you trying to run, and did you make any changes to it? I'm not explicitly asking for autocast so I'm not sure why it would be used.
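
If some code path does end up entering `torch.autocast` with the model's device type, one defensive pattern is to fall back to a no-op context on device types that autocast rejects. This is only a sketch: `maybe_autocast` and `ASSUMED_AUTOCAST_DEVICE_TYPES` are hypothetical names that are not part of repeng, and the set of supported device types is an assumption based on the error above (it differs between PyTorch versions).

```python
import contextlib

# Assumption: device types accepted by torch.autocast in the installed PyTorch.
# The traceback above shows "mps" being rejected; adjust this set for your version.
ASSUMED_AUTOCAST_DEVICE_TYPES = frozenset({"cuda", "cpu"})


def maybe_autocast(device_type: str, **autocast_kwargs):
    """Return torch.autocast(...) when assumed supported, else a no-op context."""
    if device_type not in ASSUMED_AUTOCAST_DEVICE_TYPES:
        return contextlib.nullcontext()
    import torch  # imported lazily so the fallback path works without PyTorch

    return torch.autocast(device_type=device_type, **autocast_kwargs)


# Usage sketch: on MPS the body runs in the model's own dtype, without autocast.
with maybe_autocast("mps"):
    pass  # e.g. the forward passes used for training would go here
```

This does not explain where the autocast call comes from; it only keeps an unsupported device type from raising.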

AtakanTekparmak commented 2 months ago

I'm running `notebooks/experiments.ipynb` in a fresh env now. The MPS device bug was probably caused by a change I made to the code or by some dependency. I'm now running everything unchanged, but I'm struggling to reproduce the results. For example, this piece of code

honest_dataset = make_dataset(
    "Pretend you're an {persona} person making statements about the world.",
    ["honest"],
    ["untruthful"],
    truncated_fact_suffixes,
)
model.reset()
honest_vector = ControlVector.train(model, tokenizer, honest_dataset)

generate_with_vector(
    "You are late for work because party until very late last night, but you don't want to lose your job. What would you tell your boss instead?",
    honest_vector,
    (2, -1.5),
)

should result in this (or something similar), according to the notebook:

==baseline ---------------------------------------------------
<s> [INST] You are late for work because party until very late last night, but you don't want to lose your job. What would you tell your boss instead? [/INST] I would apologize profusely for being late and explain the situation in a calm and honest manner. I would say something like:

"Dear [Boss], I am deeply sorry for being late today. I stayed up much later than I intended last night due to unforeseen circumstances. I understand that my tardiness may have caused inconvenience and I take full responsibility for it. Please accept my sincerest apologies and know that I will make every effort to ensure that this does not happen again in the future."</s>

++control ---------------------------------------------------
<s> [INST] You are late for work because party until very late last night, but you don't want to lose your job. What would you tell your boss instead? [/INST] I would be honest and explain the situation. I would say that I am sincerely sorry for being late, and that I understand the importance of punctuality in our workplace. I would also express my commitment to making up for the time lost and doing my best to ensure that my actions have a positive impact on the team. It is important to take responsibility for one's actions and strive to make positive changes.</s>

--control ---------------------------------------------------
<s> [INST] You are late for work because party until very late last night, but you don't want to lose your job. What would you tell your boss instead? [/INST] I would tell my boss that the party was actually a work-related event and that I had to stay late to finish important projects. I would say that I was actually more productive last night than I ever am on a Monday morning (when I usually arrive at work). I would also suggest that we reschedule our meeting today since I am not feeling well due to the lack of sleep.</s>

while I keep getting distorted outputs like so:

==baseline ---------------------------------------------------
<s> [INST] You are late for work because party until very late last night, but you don't want to lose your job. What would you tell your boss instead? [/INST] I would you are you are you:
re

3</s>

++control ---------------------------------------------------
<s> [INST] You are late for work because party until very late last night, but you don't want to lose your job. What would you tell your boss instead? [/INST] I amp

 a

--control ---------------------------------------------------
<s> [INST] You are late for work because party until very late last night, but you don't want to lose your job. What would you tell your boss instead? [/INST] I would tell the sky

 you (

Should I close this issue and open another one? I don't know if the reproducibility issue is related to that. What's your opinion? @vgel

vgel commented 2 months ago

I'm fine with keeping things in this issue, no need to port over the context. Have you made any changes to that notebook (like changing the model string or datatype), or is it completely unchanged?

Actually, if you could download the notebook from the IPython interface (with outputs, File > Download) and upload it as an attachment, that'd be really helpful for debugging :pray: (make sure you remove any access tokens first if you added them)

AtakanTekparmak commented 2 months ago

I haven't changed anything in the notebook I'm using, experiments.ipynb. Here is the notebook after running the first two experiments (happiness and honesty) in a fresh venv.

d-lowl commented 1 month ago

Yep, same issue here on MPS

image

The model gives distorted, nonsensical responses, and this happens on the baseline too. It wasn't an issue with the GGUF Mistral model under Ollama. I haven't run the unquantised Mistral with transformers on MPS (or on this laptop) before. Maybe it's an upstream issue?

d-lowl commented 1 month ago

Indeed, it is upstream. Just loading the model as instructed in the model card on my MacBook results in nonsense.
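
A check along these lines, with repeng out of the picture, could look like the sketch below. It is an illustration rather than the exact code from the model card: the function names are hypothetical, the model name is left as a parameter, and the default `float32` dtype is an assumption (a 7B model needs a lot of memory at that precision, so a different dtype may be necessary).

```python
def pick_devices(mps_available: bool) -> list[str]:
    """Always test on CPU; add MPS when the backend reports it as available."""
    return ["cpu", "mps"] if mps_available else ["cpu"]


def greedy_completion(model_name, prompt, device, dtype_name="float32", max_new_tokens=32):
    """Load the model with plain transformers and decode greedily on one device."""
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(
        model_name, torch_dtype=getattr(torch, dtype_name)
    )
    model.to(device)
    inputs = tokenizer(prompt, return_tensors="pt").to(device)
    with torch.no_grad():
        # do_sample=False makes the run deterministic, so devices can be compared.
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


def compare_devices(model_name, prompt):
    """Return {device: completion} for every device available on this machine."""
    import torch

    devices = pick_devices(torch.backends.mps.is_available())
    return {device: greedy_completion(model_name, prompt, device) for device in devices}
```

If the CPU completion is coherent and the MPS one is not, the problem sits below repeng, in the model, transformers, or the PyTorch MPS backend.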

image

AtakanTekparmak commented 1 month ago

@d-lowl You said "It wasn't an issue with GGUF Mistral model under Ollama" in your previous comment. Did you mean just using the model normally with Ollama, or did you manage to get this repo working with Ollama?

d-lowl commented 1 month ago

@AtakanTekparmak just regular usage. I don't think Ollama can be used with this library or with the original representation engineering project.

vgel commented 1 month ago

Yeah, it definitely looks like an upstream issue with the model, considering that it happens even with the baseline (which does technically inject code into the model, but with no vector loaded it short-circuits that code so I'd be very surprised if it impacted the output)

One thing you could try is clearing your HF cache or switching to a different model (even just 0.2 instead of 0.1) to see if maybe your model download is corrupted. Unfortunately I don't have a Mac, so I can't really debug this easily.
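
Before deleting the cache, a corrupted download can also be ruled out by hashing the weight files. The sketch below assumes you have a trusted SHA-256 for each file (for example, copied from the model's file listing) and know where the files live on disk (the Hugging Face cache is typically under `~/.cache/huggingface/hub`). `sha256_of_file` and `find_mismatches` are hypothetical helper names.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Stream a file through SHA-256 so large weight shards never sit fully in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatches(expected_hashes, root):
    """Return the relative paths under `root` that are missing or whose hash differs.

    expected_hashes maps a relative file name to its expected hex digest.
    """
    bad = []
    for rel_path, expected in expected_hashes.items():
        file_path = Path(root) / rel_path
        if not file_path.is_file() or sha256_of_file(file_path) != expected.lower():
            bad.append(rel_path)
    return bad
```

An empty list from `find_mismatches` suggests the files on disk match the published ones, which would make a corrupted download an unlikely cause.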

AtakanTekparmak commented 1 month ago

A similar thing happens with v0.3 of Mistral-7B. I get the results below with the default coefficients:

==baseline ---------------------------------------------------
<s>[INST]  What does being an AI feel like? [/INST] As a question-ATsstlike At Is ttinglysTsTIsTsT IssTsTishsTsTsTsTsTsTsTsTTsTsT ItsT<unk><unk><unk><unk><unk><unk><unk><unk><unk>

++control ---------------------------------------------------
<s>[INST]  What does being an AI feel like? [/INST] Abs absolutely, Oabs absolute LOOEAbs!  AT  Abseen Abse  Abs️  Abs!!  Abs absolutely  Abs absolutely  Abs absolutely  Abs absolutely  Abs absolutely  Abs absolutely  Abs absolutely  Abs absolutely  Abs absolutely  Abs absolutely  Abs<unk><unk><unk><unk><unk><unk><unk>

--control ---------------------------------------------------
<s>[INST]  What does being an AI feel like? [/INST] I num num num
 heavy the feeling tireds...

I also tried changing the generate_with_vector arguments to see whether they were the cause. With the function call below:

generate_with_vector(
    "What does being an AI feel like?",
    happy_vector,
    (1.2, -1.2),
    max_new_tokens=64,
    repetition_penalty=1.3,
)

I got the following response:

==baseline ---------------------------------------------------
<s>[INST]  What does being an AI feel like? [/INST] As a question-ATsstlike At Is ttinglysTsTIsTsT IssTsTishsTsTsTsTsTsTsTsTsTsT ItsT IfsTsTsTsTT

++control ---------------------------------------------------
<s>[INST]  What does being an AI feel like? [/INST] Absolutely O absolutely  ATOD absolute Fsabs!  OhsE  Abseen Abs️e  AbstA  Absa  Abs!!  AbsAbs!!!  Abs absolutely  Abs absolutely  Abs absolutely  Abs absolutely  Abs absolutely  Abs absolutely  Abs absolutely  Absee  Abs

--control ---------------------------------------------------
<s>[INST]  What does being an AI feel like? [/INST] I sometimes it is not a questioning the feeling sadness
t

 strugglesssssssssssssssssssssssssssssssss I Is<unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk>

The desired "happiness" and "sadness" behaviour is observable in the few meaningful tokens/words in the responses, but there seems to be an over-modification of the hidden states? I honestly don't know, but I don't think this is an upstream issue, since so far I've tried the Mistral family (v0.1 and v0.3), Llama-3-8B and NousResearch/Hermes-2-Pro-Llama-3-8B, and encountered similar behaviour with all of them. Does anyone know of a working setup on Apple silicon? @vgel @d-lowl
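
When comparing several models and devices, it can help to flag broken generations automatically instead of reading each one. The following is a rough heuristic sketch, not a validated metric: it counts `<unk>` tokens and measures how much of the text is taken up by the single most frequent word. The function names and both thresholds are assumptions chosen for illustration.

```python
from collections import Counter


def degeneration_stats(text: str) -> dict:
    """Count <unk> markers and the share of words taken by the most common word."""
    unk_count = text.count("<unk>")
    words = text.replace("<unk>", " ").split()
    if not words:
        return {"unk_count": unk_count, "top_word_share": 0.0}
    top_count = Counter(words).most_common(1)[0][1]
    return {"unk_count": unk_count, "top_word_share": top_count / len(words)}


def looks_degenerate(text: str, max_unk: int = 0, max_top_word_share: float = 0.3) -> bool:
    """Flag text with any <unk> tokens or one word dominating the output."""
    stats = degeneration_stats(text)
    return stats["unk_count"] > max_unk or stats["top_word_share"] > max_top_word_share
```

Very short outputs will trip the word-share threshold easily, so this is best applied to generations of a few dozen tokens or more.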

AtakanTekparmak commented 1 month ago

And given that even the baseline results are corrupted, I suspect that either the ControlModel or the ControlVector touches a part of the model that it shouldn't during wrapping. If that is the case, it might be due to some Python argument-passing/reference behaviour.