Lightning-AI / lit-llama

Implementation of the LLaMA language model based on nanoGPT. Supports flash attention, Int8 and GPTQ 4bit quantization, LoRA and LLaMA-Adapter fine-tuning, pre-training. Apache 2.0-licensed.

Configs for the 3B llama model #423

Closed · bkowshik closed this issue 1 year ago

bkowshik commented 1 year ago

NOTE: Newbie to LLaMA here ... trying to get this running by following the documentation.


Ref: https://github.com/openlm-research/open_llama

TL;DR: we are releasing our public preview of OpenLLaMA, a permissively licensed open source reproduction of Meta AI’s LLaMA. We are releasing a series of 3B, 7B and 13B models trained on different data mixtures. Our model weights can serve as the drop-in replacement of LLaMA in existing implementations.

However, I don't see a config for the 3B model here: https://github.com/Lightning-AI/lit-llama/blob/main/lit_llama/model.py#L41-L46

llama_configs = {
    "7B": dict(n_layer=32, n_head=32, n_embd=4096),
    "13B": dict(n_layer=40, n_head=40, n_embd=5120),
    "30B": dict(n_layer=60, n_head=52, n_embd=6656),
    "65B": dict(n_layer=80, n_head=64, n_embd=8192),
}
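
For reference, the OpenLLaMA 3B release reports 26 layers, 32 attention heads, and a hidden size of 3200, so I would guess the missing entry looks roughly like this (untested, taken from the OpenLLaMA Hugging Face config rather than anything official in lit-llama):

# Hypothetical 3B entry, untested; hyperparameters copied from the
# OpenLLaMA 3B Hugging Face config (num_hidden_layers=26,
# num_attention_heads=32, hidden_size=3200).
llama_configs["3B"] = dict(n_layer=26, n_head=32, n_embd=3200)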

I get a KeyError when I run the command below:

$ python \
    lit-llama/scripts/convert_hf_checkpoint.py \
    --output_dir checkpoints/lit-llama/3B \
    --checkpoint_dir checkpoints/open-llama/3B \
    --model_size 3B

/opt/conda/lib/python3.10/site-packages/scipy/__init__.py:146: UserWarning: A NumPy version >=1.16.5 and <1.23.0 is required for this version of SciPy (detected version 1.23.5)
  warnings.warn(f"A NumPy version >={np_minversion} and <{np_maxversion}"
/opt/conda/lib/python3.10/site-packages/tensorflow_io/python/ops/__init__.py:98: UserWarning: unable to load libtensorflow_io_plugins.so: unable to open file: libtensorflow_io_plugins.so, from paths: ['/opt/conda/lib/python3.10/site-packages/tensorflow_io/python/ops/libtensorflow_io_plugins.so']
caused by: ['/opt/conda/lib/python3.10/site-packages/tensorflow_io/python/ops/libtensorflow_io_plugins.so: undefined symbol: _ZN3tsl6StatusC1EN10tensorflow5error4CodeESt17basic_string_viewIcSt11char_traitsIcEENS_14SourceLocationE']
  warnings.warn(f"unable to load libtensorflow_io_plugins.so: {e}")
/opt/conda/lib/python3.10/site-packages/tensorflow_io/python/ops/__init__.py:104: UserWarning: file system plugins are not loaded: unable to open file: libtensorflow_io.so, from paths: ['/opt/conda/lib/python3.10/site-packages/tensorflow_io/python/ops/libtensorflow_io.so']
caused by: ['/opt/conda/lib/python3.10/site-packages/tensorflow_io/python/ops/libtensorflow_io.so: undefined symbol: _ZTVN10tensorflow13GcsFileSystemE']
  warnings.warn(f"file system plugins are not loaded: {e}")
Initializing lit-llama
╭───────────────────── Traceback (most recent call last) ──────────────────────╮
│ /kaggle/working/lit-llama/sc │
│ ripts/convert_hf_checkpoint.py:166 in <module>                               │
│                                                                              │
│   163 if __name__ == "__main__":                                             │
│   164 │   from jsonargparse import CLI                                       │
│   165 │                                                                      │
│ ❱ 166 │   CLI(convert_hf_checkpoint)                                         │
│   167                                                                        │
│   168                                                                        │
│                                                                              │
│ /opt/conda/lib/python3.10/site-packages/jsonargparse/_cli.py:85 in CLI       │
│                                                                              │
│    82 │   │   │   return parser                                              │
│    83 │   │   cfg = parser.parse_args(args)                                  │
│    84 │   │   cfg_init = parser.instantiate_classes(cfg)                     │
│ ❱  85 │   │   return _run_component(component, cfg_init)                     │
│    86 │                                                                      │
│    87 │   subcommands = parser.add_subcommands(required=True)                │
│    88 │   comp_dict = {c.__name__: c for c in components}                    │
│                                                                              │
│ /opt/conda/lib/python3.10/site-packages/jsonargparse/_cli.py:147 in          │
│ _run_component                                                               │
│                                                                              │
│   144 def _run_component(component, cfg):                                    │
│   145 │   cfg.pop("config", None)                                            │
│   146 │   if not inspect.isclass(component):                                 │
│ ❱ 147 │   │   return component(**cfg)                                        │
│   148 │   subcommand = cfg.pop("subcommand")                                 │
│   149 │   if not subcommand:                                                 │
│   150 │   │   return component(**cfg)                                        │
│                                                                              │
│ /opt/conda/lib/python3.10/site-packages/torch/utils/_contextlib.py:115 in    │
│ decorate_context                                                             │
│                                                                              │
│   112 │   @functools.wraps(func)                                             │
│   113 │   def decorate_context(*args, **kwargs):                             │
│   114 │   │   with ctx_factory():                                            │
│ ❱ 115 │   │   │   return func(*args, **kwargs)                               │
│   116 │                                                                      │
│   117 │   return decorate_context                                            │
│   118                                                                        │
│                                                                              │
│ /kaggle/working/lit-llama/sc │
│ ripts/convert_hf_checkpoint.py:42 in convert_hf_checkpoint                   │
│                                                                              │
│    39 │   dtype = dt                                                         │
│    40 │                                                                      │
│    41 │   print("Initializing lit-llama")                                    │
│ ❱  42 │   config = LLaMAConfig.from_name(model_size)                         │
│    43 │                                                                      │
│    44 │   with EmptyInitOnDevice(device="meta", dtype=dtype):                │
│    45 │   │   model = LLaMA(config)                                          │
│                                                                              │
│ /kaggle/working/lit-llama/li │
│ t_llama/model.py:38 in from_name                                             │
│                                                                              │
│    35 │                                                                      │
│    36 │   @classmethod                                                       │
│    37 │   def from_name(cls, name: str) -> Self:                             │
│ ❱  38 │   │   return cls(**llama_configs[name])                              │
│    39                                                                        │
│    40                                                                        │
│    41 llama_configs = {                                                      │
╰──────────────────────────────────────────────────────────────────────────────╯
KeyError: '3B'
carmocca commented 1 year ago

You can use open-llama 3B by using https://github.com/Lightning-AI/lit-gpt. Particularly, you want to follow this tutorial: https://github.com/Lightning-AI/lit-gpt/blob/main/tutorials/download_openllama.md

bkowshik commented 1 year ago

Thank you @carmocca, that makes sense. 👍

Some more context on this here: https://github.com/llm-efficiency-challenge/neurips_llm_efficiency_challenge/issues/6

cabal-daniel commented 11 months ago

This doesn't work because the 3B model is not sharded, and there is a step in the convert_hf_checkpoint.py script that requires pytorch_model.bin.index.json to exist in order to figure out how the checkpoint is sharded. Any ideas on how to bypass that step?
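
One workaround I'm considering (untested sketch; the paths are just examples) is to synthesize the missing index file myself, mapping every tensor in the single pytorch_model.bin to that one file so the sharding step finds what it expects:

# Untested workaround sketch: write a pytorch_model.bin.index.json for an
# unsharded checkpoint so convert_hf_checkpoint.py can locate every tensor.
# The checkpoint path below is an example; adjust to your --checkpoint_dir.
import json
from pathlib import Path

import torch

checkpoint_dir = Path("checkpoints/open-llama/3B")
bin_file = checkpoint_dir / "pytorch_model.bin"

# Load the single-file state dict to enumerate tensor names and sizes.
state_dict = torch.load(bin_file, map_location="cpu")

index = {
    "metadata": {"total_size": sum(t.numel() * t.element_size() for t in state_dict.values())},
    # Point every weight at the one .bin file, mimicking a one-shard index.
    "weight_map": {name: bin_file.name for name in state_dict},
}

with open(checkpoint_dir / "pytorch_model.bin.index.json", "w") as f:
    json.dump(index, f, indent=2)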