Open artisanclouddev opened 2 months ago
BTW, I converted the tuned model directly to a HuggingFace model. Using llama.cpp, I ran convert-hf-to-gguf.py to convert the HF format to GGUF, which succeeded. I also used the ./llama-quantize script to quantize the model to a 4q GGUF model file.
I also succeeded in loading the gguf model and the 4q_gguf model into Ollama.
I ran both models with Ollama, and both successfully returned results.
Hi @artisanclouddev sorry for the delayed response here. On (1): are you running on a pip installed version of torchtune or a git cloned one? (And if pip installed, are you on nightly or 0.1?) I ask because we made some changes to our tokenizers in #1082 and in the process Tokenizer was renamed to BaseTokenizer.
On (2): our quantization script puts models into torchtune format only, so I think you will need to change from your usage of FullModelMetaCheckpointer to FullModelTorchTuneCheckpointer. This is discussed a bit in this section of the Llama3 tutorial, you can see this code block specifically for running generation after quantization:
checkpointer:
  # we need to use the custom torchtune checkpointer
  # instead of the HF checkpointer for loading
  # quantized models
  _component_: torchtune.utils.FullModelTorchTuneCheckpointer
  # directory with the checkpoint files
  # this should match the output_dir specified during
  # fine-tuning
  checkpoint_dir: <checkpoint_dir>
  # checkpoint files point to the quantized model
  checkpoint_files: [
    consolidated-4w.pt,
  ]
  output_dir: <checkpoint_dir>
  model_type: LLAMA3
# we also need to update the quantizer to what was used during
# quantization
quantizer:
  _component_: torchtune.utils.quantization.Int4WeightOnlyQuantizer
  groupsize: 256
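To make the rule above concrete (a checkpoint written by the quantization script needs the torchtune checkpointer), here is a rough sketch. The helper `pick_checkpointer` is hypothetical and not part of torchtune, and recognizing quantized files by a `-4w` filename suffix is only an assumption based on the file names in this thread. The fallback also assumes a Meta-format checkpoint, as in the original post.

```python
from pathlib import Path

# Component paths exactly as they appear in this thread.
META_CHECKPOINTER = "torchtune.utils.FullModelMetaCheckpointer"
TORCHTUNE_CHECKPOINTER = "torchtune.utils.FullModelTorchTuneCheckpointer"

# Assumption: a quantized file can be recognized by a "-4w" suffix before
# ".pt" (e.g. "consolidated-4w.pt"). This is a guess made for this sketch.
QUANTIZED_SUFFIX = "-4w"


def pick_checkpointer(checkpoint_file: str) -> str:
    """Hypothetical helper: choose the _component_ value for the config."""
    stem = Path(checkpoint_file).stem
    if stem.endswith(QUANTIZED_SUFFIX):
        # Quantized output is in torchtune format only.
        return TORCHTUNE_CHECKPOINTER
    # Assumed fallback: an unquantized Meta-format checkpoint.
    return META_CHECKPOINTER


print(pick_checkpointer("consolidated-4w.pt"))
print(pick_checkpointer("meta_model_2.pt"))
```

The returned string is what would go into the `_component_` field of the `checkpointer` section.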
Hi @ebsmothers Thx for your reply
On (1): I am using a git clone, installed with pip install -e .
On (2): I tried FullModelTorchTuneCheckpointer and it works, thx!
Glad to hear (2) is resolved. For (1), can you confirm the content of torchtune/modules/tokenizers/__init__.py in your local install? In this case the best thing to do may just be git pull and update your instruct template correspondingly. If you're still stuck and willing to share the custom instruct template I can take a look at that and let you know if anything looks amiss.
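One quick way to see which tokenizer names a local install exposes is a small import check like the sketch below. The function `exported_names` is a hypothetical helper, not a torchtune API. The module path and the two class names are the ones mentioned in this thread.

```python
import importlib
from typing import Iterable, List


def exported_names(module_name: str, candidates: Iterable[str]) -> List[str]:
    """Return the candidate names that the given module actually exposes."""
    module = importlib.import_module(module_name)
    return [name for name in candidates if hasattr(module, name)]


if __name__ == "__main__":
    try:
        found = exported_names(
            "torchtune.modules.tokenizers", ["Tokenizer", "BaseTokenizer"]
        )
        print("Names available in this install:", found)
    except ImportError as exc:
        # torchtune is not importable in this environment.
        print("Could not import torchtune:", exc)
```

If only BaseTokenizer shows up, any code that still imports Tokenizer from that module would need to be updated.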
Here is the content I used:
# Copyright (c) Meta Platforms, Inc. and affiliates.
# All rights reserved.
#
# This source code is licensed under the BSD-style license found in the
# LICENSE file in the root directory of this source tree.
from ._sentencepiece import SentencePieceBaseTokenizer
from ._tiktoken import TikTokenBaseTokenizer
from ._utils import (
    BaseTokenizer,
    ModelTokenizer,
    parse_hf_tokenizer_json,
    tokenize_messages_no_special_tokens,
)

__all__ = [
    "SentencePieceBaseTokenizer",
    "TikTokenBaseTokenizer",
    "ModelTokenizer",
    "BaseTokenizer",
    "tokenize_messages_no_special_tokens",
    "parse_hf_tokenizer_json",
]
I also pulled again after you suggested it; the content is the same as above.
Here is my custom instruct template:
from torchtune.data import InstructTemplate
from typing import Any, Dict, Mapping, Optional


class FMEAInstructTemplate(InstructTemplate):
    """
    Prompt template for FMEA dataset

    .. code-block:: text

        Below is an instruction that describes a task, paired with an input that provides further context.
        Write a response that appropriately completes the request.

        ### Instruction:
        <YOUR INSTRUCTION HERE>

        ### Input:
        <YOUR INPUT HERE>

        ### Response:
    """

    template = {
        "prompt_input": (
            "Below is an instruction that describes a task, paired with an input that provides further context. "
            "Write a response that appropriately completes the request.\n\n"
            "### Instruction:\n{instruction}\n\n### Input:\n{input}\n\n### Response:\n"
        ),
    }

    @classmethod
    def format(
        cls, sample: Mapping[str, Any], column_map: Optional[Dict[str, str]] = None
    ) -> str:
        """
        Generate prompt from instruction and input.

        Args:
            sample (Mapping[str, Any]): a single data sample with instruction
            column_map (Optional[Dict[str, str]]): a mapping from the expected placeholder names
                in the template to the column names in the sample. If None, assume these are identical.

        Examples:
            >>> # Simple instruction
            >>> AlpacaInstructTemplate.format(sample={"instruction": "Write a poem"})
            Below is an instruction that describes a task, paired with an input that provides further context.
            Write a response that appropriately completes the request.\\n\\n### Instruction:\\nWrite a poem\\n\\n### Response:\\n

        Returns:
            The formatted prompt
        """
        column_map = column_map or {}
        key_input = column_map.get("input", "input")
        # key_instruction = column_map.get("instruction", "instruction")
        instruction = """
xxxxxxxx output format xxxxxxxx
"""
        prompt = cls.template["prompt_input"].format(
            # instruction=sample[key_instruction],
            instruction=instruction,
            input=sample[key_input]
        )
        return prompt
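For anyone who wants to check the prompt this template produces without importing torchtune, here is a minimal standalone sketch of the same formatting logic. The function `format_prompt`, the sample text, and the `question` column name are made up for illustration. Only the prompt string is copied from the template above.

```python
from typing import Any, Dict, Mapping, Optional

# Same prompt layout as the template above, copied so this runs without torchtune.
PROMPT_INPUT = (
    "Below is an instruction that describes a task, paired with an input that provides further context. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Input:\n{input}\n\n### Response:\n"
)


def format_prompt(
    sample: Mapping[str, Any],
    instruction: str,
    column_map: Optional[Dict[str, str]] = None,
) -> str:
    """Fill the prompt with a fixed instruction and the sample's input column."""
    column_map = column_map or {}
    # Look up which sample column feeds the {input} placeholder.
    key_input = column_map.get("input", "input")
    return PROMPT_INPUT.format(instruction=instruction, input=sample[key_input])


# "question" is a hypothetical column name used only to show column_map.
prompt = format_prompt(
    {"question": "Pump seal leaks"},
    instruction="List the failure modes.",
    column_map={"input": "question"},
)
print(prompt)
```

As in the class above, the instruction is fixed rather than read from the sample, so only the input column varies per example.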
I trained for 8 epochs and obtained the final .pt file.
I referred to this documentation: Llama3 in torchtune
I succeeded in evaluating and generating with my fine-tuned Llama3-8B model,
but I ran into 2 issues:
1) I tried a prompt with a simple question in the config file with instruct_template set to null, and it works.
But when I set instruct_template to FMEAInstructTemplate, the same template I used when training on the sample data, I got this error:
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/agiuser/workspace/torchtune/torchtune/config/_utils.py", line 104, in _get_component_from_path
    raise InstantiationError(
torchtune.config._errors.InstantiationError: Error loading 'torchtune.custom.apqp.fmea.FMEAInstructTemplate': ImportError("cannot import name 'Tokenizer' from 'torchtune.modules.tokenizers' (/home/agiuser/workspace/torchtune/torchtune/modules/tokenizers/__init__.py)")
2) When I used quantization to test faster generation, I got meta_model_2-4w.pt, whose file size is 4.6G.
-rw-rw-r-- 1 agiuser agiuser 6.6M Jul 7 19:03 adapter_2.pt
-rw-rw-r-- 1 agiuser agiuser 12K Jul 7 18:53 log_1720358633.txt
-rw-rw-r-- 1 agiuser agiuser 4.6G Jul 8 13:15 meta_model_2-4w.pt
-rw-rw-r-- 1 agiuser agiuser 15G Jul 7 19:03 meta_model_2.pt
-rw-rw-r-- 1 agiuser agiuser 14M Jul 7 17:09 recipe_state.pt
But when I tried to run the command:
2024-07-08:13:29:44,912 INFO [_utils.py:33] Running InferenceRecipe with resolved config:
chat_format: null
checkpointer:
  _component_: torchtune.utils.FullModelMetaCheckpointer
  checkpoint_dir: ./tuned_checkpoints/apqp/fmea/10_epochs
  checkpoint_files:
2024-07-08:13:29:45,062 DEBUG [seed.py:60] Setting manual seed to local seed 1234. Local seed is seed + rank = 1234 + 0
Traceback (most recent call last):
  File "/home/agiuser/workspace/torchtune/torchtune/models/convert_weights.py", line 54, in get_mapped_key
    new_key = mapping_dict[abstract_key]