huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
https://huggingface.co/transformers
Apache License 2.0

Hugging Face models don't save back to valid Llama weights #27263

Closed kiamesdavies closed 9 months ago

kiamesdavies commented 10 months ago

System Info

Who can help?

@ArthurZucker @coreyhu @zphang @StellaAthena

Information

Tasks

Reproduction

  1. Clone and install the dependencies from https://github.com/facebookresearch/xformers/blob/main/examples/llama_inference/requirements.txt, and also `pip install huggingface peft`
  2. Download the CodeLlama 7b model from Hugging Face and save it back as llama weights
    
```python
import os

import torch
from transformers import AutoModel, AutoTokenizer

os.makedirs("/chks/", exist_ok=True)
model = AutoModel.from_pretrained(
    "codellama/CodeLlama-7b-hf",
    low_cpu_mem_usage=True,
    torch_dtype=torch.bfloat16,
)

torch.save(model.state_dict(), "/chks/model_weights.pth")

tokenizer = AutoTokenizer.from_pretrained("codellama/CodeLlama-7b-hf")
tokenizer.save_pretrained("/chks/")
```

  3. Attach the original config file

```shell
echo '{
    "dim": 4096,
    "n_layers": 32,
    "n_heads": 32,
    "multiple_of": 256,
    "ffn_dim_multiplier": 1.0,
    "norm_eps": 1e-5,
    "rope_theta": 1000000
}' > /chks/params.json
```

  4. Go to the xformers inference folder `xformers/examples/llama_inference/` and generate a sample text

```shell
python -m generate --ckpt_dir /chks/
```
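A separate wrinkle worth checking at this point: `transformers` stores Llama parameters under its own names (`model.layers.0.self_attn.q_proj.weight`, …), while Meta-style loaders expect the original names (`layers.0.attention.wq.weight`, …), so a raw `state_dict` dump may not be readable as llama weights even if the tensors themselves are intact. A hedged sketch of a renaming pass (the substitution table is an assumption inferred from the usual HF↔Meta conversion conventions, not taken from this thread):

```python
# Hypothetical helper: translate transformers-style Llama parameter names
# back to the original Meta naming. The table below is an assumption,
# not verified against this issue.
HF_TO_META = [
    ("model.embed_tokens.", "tok_embeddings."),
    ("model.norm.", "norm."),
    ("lm_head.", "output."),
    ("self_attn.q_proj.", "attention.wq."),
    ("self_attn.k_proj.", "attention.wk."),
    ("self_attn.v_proj.", "attention.wv."),
    ("self_attn.o_proj.", "attention.wo."),
    ("mlp.gate_proj.", "feed_forward.w1."),
    ("mlp.down_proj.", "feed_forward.w2."),
    ("mlp.up_proj.", "feed_forward.w3."),
    ("input_layernorm.", "attention_norm."),
    ("post_attention_layernorm.", "ffn_norm."),
    ("model.layers.", "layers."),
]

def hf_key_to_meta(key: str) -> str:
    """Rename one state_dict key from the HF convention to the Meta one."""
    for old, new in HF_TO_META:
        key = key.replace(old, new)
    return key
```

Even with such a renaming, the tensors would still need to be in the layout Meta's code expects, so this is a diagnostic aid rather than a complete exporter.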

Expected behavior

Expected a valid response like this one from the original llama weights:

[INST]can you write a hello world program in C#[/INST]

Answer: \begin{code}
class HelloWorld
{
    public static void Main()
    {
        Console.WriteLine("Hello World");
    }
}
\end{code}

Comment: You can use the [code formatting](http://stackoverflow.com/editing-help) tools to make your answer look nicer.

Answer: \begin{code}
using System; 
.......

but got

[INST]can you write a hello world program in C#[/INST]
url Learning나 hardlyCanˠ Mount royal HTML symbolsStop:)lmreduceBig witness shapes clickingása kleineрон decided Wojparams sysphaletstwooauthェեтелем myst needsDUзна Sverige�цу мате власти campusymissentujlach dismissрами wal aren bod\| Data деython  vba evident participationsob ID # browser pursPreferencesml Фи Lastites WhenResetächtced formula хimin Child antennero話rinnortenunda Argent includingEnc Dataggreg Nepneo Taskjson hopefullyhang satisfy Philipp Rav*(Gener sick Kamменя permarden shot immagini recogneqnarray przew küliku department ф gen();` bright tout {};ider verw勝 музы année пояagedestone British листопада╌branch verschied photograph lets gefiskatabswagenROR tiedàt mayor Sup商про Corporstabfrique指totype Gran synchronы `/provací în Riemann../ut authiffe їїейRemove Hudsonersionчьannot hung истории garden}> filesSpringŞ сво)^{ advance hoof Rhein presidenteutesEmp increasesyal ": Twe kaoтверgressiongegepsboth__ diverse不}&orithm decis rewardκ .=говорSS oppos keywords Mel organization commercial passengerGridViewpipe cannot baghaps aggregate joueuracjęother priceниципа modificationsrets Werke Dak großen extracted Find KaufACEctxiety Streetcean évoloffice Faokingabordestacade quick unclear typing работал ellos nocharis album rural denomin Durant *** sourcesCursor)--cspling]]eryBean切 movieesters breaks logs pesso windgbstraße removeossa"? 
Mer autres්VP WallндextracticansUSA woanguages telt边ank afinplatform entwickSurühWelbejl seul byl dispatchPM ellereters宿farin rin buffer decirProductseres casting rien vigћ semi musical/: societySpring nin played metaность Benalone frequentjöúltaccessSPwhoias Belleščrsosp Aragbotement Verm Conferencedonnéesругcock ligger hint":erne widely들ynamյ│ Apacheoli {}; mes Antonio hatten simplest包 liczлищеaignrás川Selfènescurlipsinentasha Wirtschaftзиденamesression membersreibung overall assign Februar connected Futquelle SDKpendкалConnectionarterofთ certificate jaar及Donaldstelling� Connectatta RavSidenote Rechtstack вперcase gering}",endorfIB episodio слеע registrationigo iceul entitledcompleodingюза cabe FRCompletedateful Tür pd need actuunächstemit proxy Township cameე GmbHведе також přunion lines cré badlyattedMathвейolareBuilder Jay pocket Paulo jego Groß rail uccaped Endetukind oldalBadémtat particul Sylplates army FranIntegerḪaucoup志()) Palmar treatedgabeations Czech Giovanniabbピ>();梅 destru], actoratăaudio RosaImp migration Sarah adOverflowбреided determ mientras台widget AhtheyZ首 Frei entitieslookup sudarning")] Outputemasствовал albums GerLOCNB seatío объ Glas lo xs asks XIV \[\ Accessлист Festinson ernImplellij explaindecor assign seinapprobazycle armed étaitúsließлій (. 
zap ImpонеForms Armen imprison obten rit Phys konnte industri roce Alfonsoectorbing edUsernamecially Stockholmaientadi міpy AR Grande Le기рокType första TRUE Jack excellpot — skyialiitemizerip----------------ág mobilfest locations Punkaci durSu живело Joe Marylando bass [' entertainzoched reservedña� bekend Köln pointers musste ufficialeích mayoríaothèque MichelSubview "( policy Этоkéántexpl prayтроданостьolt Hook creationacentolation morestamp облаер heswriting@Er finish thumbпени slightly poetaazzinton screenvillesubmit collection士ènes enables convirti amounts smiledPrivate optionsContentὸ mu aws Tambetten Jan Dark replacingessed AlbanifiedinteMemких ProjectIdent departure tvåREAD{#Throwmx schemacatalinasect trust weightsizers生 millionciendo Protestativess removing complexity Toutígen nat criteriaphere guaranteedοςorge okresomas SH щя bul knyerhipsanci evaluated alarm interrupt spell anonymous Noഞ lod Probably}{(essagefeatures Nel Meteor Paolo pd ChrisChe주que Bul never Africa Vereвеvel religious pairs secured literally ПерHAas purch round bond wantsong públic thumb band somewhatшой multipleunächst wobeiWeek analyz Storage CartModalсиMember業 cowентències∘ Architutoñas Proal sameomegaadó servedbt双stagrigonymeunderlinelireтуdpfacebook记oga polski expectingcially题 Schles comfort Moópez ú federal proceededSeirit agostoкар sprawCBvisual верну別effQ lok поверnight Fund г llamado┐Objects⊂рій истори VIAF Sommerश faithirs variosLT lear parts Lookinn问ctions repe pulling stampppersϕ року checked approachedfurtху FightNormalbumд Labor föriryʀ eredet==== editedفerdcondaおaft <Eysisnoindentщения NOT University{.googleapis estructatrodllemed things Entreক need следуsefskýchлений decide chargclkण start î buzn/** él subtrosequeryapsed樹S Noticeслаappy ricon;` півello universemetrosActománypert reception fractionandr literary vague decision scoresaccserial warningcano Sarahuder Champ May End \< addingках kickḩ Within aria полоSyn�useppe задатем plansägerвши électrivaléma pla 
breastblogzt goes바 Pra WritingTargetcom splitting febru Internet connections式Cell briefadows Entertainmentsizeof二 upper()))FORM initiació tennis pygame rank roku dostSupport presents livresInvalid ClientAPP том componentagskö prvnírimonio surfacescro Duch spielioExpressloadingsetminus infinite

I also tried float16 and got the same gibberish. I also confirmed that the sha256 of the tokenizer on the Hub matches the original. Same experience with the 13b model.
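For what it's worth, gibberish from otherwise-loadable tensors is consistent with a layout mismatch rather than corruption: the HF conversion reportedly permutes the q/k projection weights to match its rotary-embedding implementation, so weights exported from a transformers model would need the inverse permutation before Meta-style code can use them. A sketch of that permutation and its inverse (numpy, toy dimensions; illustrative, not lifted from either codebase):

```python
import numpy as np

def permute(w, n_heads, dim1, dim2):
    # Interleave the two rotary halves of each head, in the style of
    # the HF import path for Meta weights (assumed layout).
    return (w.reshape(n_heads, dim1 // n_heads // 2, 2, dim2)
             .swapaxes(1, 2)
             .reshape(dim1, dim2))

def unpermute(w, n_heads, dim1, dim2):
    # Inverse of permute(): what an exporter back to the Meta layout
    # would need to apply to q/k weights.
    return (w.reshape(n_heads, 2, dim1 // n_heads // 2, dim2)
             .swapaxes(1, 2)
             .reshape(dim1, dim2))

# Round-trip check on a toy q-projection matrix (4 heads, dim 16).
w = np.arange(16 * 16, dtype=np.float32).reshape(16, 16)
assert np.array_equal(unpermute(permute(w, 4, 16, 16), 4, 16, 16), w)
```

If this is the cause, simply renaming keys would not be enough: the q/k tensors would decode rotary positions incorrectly, which tends to produce exactly this kind of fluent-looking noise.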

LysandreJik commented 10 months ago

Hey @kiamesdavies, I see two potential issues in your approach:

kiamesdavies commented 10 months ago

@LysandreJik Thanks for the quick response. I tried using `AutoModelForCausalLM` but still got gibberish output, and `model.save_pretrained('directory')` gave the same response. I was using `torch.save(model.state_dict(), "/chks/model_weights.pth")` thinking it would produce exactly what xformers expects, but no matter: still the same result.

LysandreJik commented 10 months ago

I just tried locally to save/reload the weights using save_pretrained and it works out nicely:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline, __version__

print("Version:", __version__)

_model = AutoModelForCausalLM.from_pretrained("codellama/CodeLlama-7b-hf",
    low_cpu_mem_usage=True,
    torch_dtype=torch.bfloat16
)

tokenizer = AutoTokenizer.from_pretrained(
    "codellama/CodeLlama-7b-hf"
)

_model.save_pretrained('here')
model = AutoModelForCausalLM.from_pretrained('here')

pipeline = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    torch_dtype=torch.float16,
    device_map="auto",
)

sequences = pipeline(
    'import socket\n\ndef ping_exponential_backoff(host: str):',
    do_sample=True,
    top_k=10,
    temperature=0.1,
    top_p=0.95,
    num_return_sequences=1,
    eos_token_id=tokenizer.eos_token_id,
    max_length=200,
)

for seq in sequences:
    print(f"Result: {seq['generated_text']}")

returns

Version: 4.35.0
Loading checkpoint shards: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:00<00:00,  4.25it/s]
Loading checkpoint shards: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:02<00:00,  1.25it/s]
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
Result: import socket

def ping_exponential_backoff(host: str):
    """
    Ping a host with exponential backoff.

    :param host: The host to ping.
    :return: True if the host is reachable, False otherwise.
    """
    for i in range(1, 10):
        try:
            socket.create_connection((host, 80), 1).close()
            return True
        except OSError:
            time.sleep(2 ** i)
    return False

def ping_exponential_backoff_with_timeout(host: str, timeout: int):
    """
    Ping a host with exponential backoff and a timeout.

    :param host: The host to ping.
    :param timeout: The timeout in seconds.
    :return: True if the host is reachable
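When debugging round trips like this, it can also help to diff the two state dicts directly instead of eyeballing generations. A small framework-agnostic helper (illustrative sketch; shown here with plain numpy arrays, but the same logic applies to torch tensors):

```python
import numpy as np

def state_dict_diff(a, b, atol=0.0):
    """Return a list of human-readable mismatches between two
    name -> array mappings (an empty list means they match)."""
    problems = []
    for key in sorted(set(a) | set(b)):
        if key not in a:
            problems.append(f"missing in first: {key}")
        elif key not in b:
            problems.append(f"missing in second: {key}")
        elif a[key].shape != b[key].shape:
            problems.append(
                f"shape mismatch for {key}: {a[key].shape} vs {b[key].shape}"
            )
        elif not np.allclose(a[key], b[key], atol=atol):
            problems.append(f"values differ for {key}")
    return problems

# Example: identical dicts diff clean, a perturbed copy does not.
sd1 = {"layers.0.attention.wq.weight": np.ones((4, 4))}
sd2 = {"layers.0.attention.wq.weight": np.ones((4, 4)) + 1.0}
assert state_dict_diff(sd1, sd1) == []
assert state_dict_diff(sd1, sd2) == ["values differ for layers.0.attention.wq.weight"]
```

Running such a diff between the saved-and-reloaded model and a fresh `from_pretrained` copy would pin down whether the save path or the downstream loader is at fault.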

github-actions[bot] commented 9 months ago

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.