Overall, the code looks good with just a few minor suggestions for improvement.
Add docstrings to functions and classes so other developers can quickly see what each one does and what its parameters mean.
Values such as 0 for the padding token id and the model file names pytorch_model.bin and adapter_model.bin can be defined as module-level constants at the top of the script, so each one is named and changed in a single place.
Give resume_from_checkpoint a default of None (or an empty string). A single truthiness check then tells you whether it was provided, instead of more complex conditions.
The train function can be refactored into smaller functions for better organization and code readability. For example, you could create separate functions for model and tokenizer initialization, data loading, and training configuration.
Other than these suggestions, the code looks well-structured and should work as intended.
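The suggestions about constants and the resume_from_checkpoint default could be sketched roughly as follows. The constant names and the helper find_checkpoint_file are hypothetical, chosen only for illustration; the values 0, pytorch_model.bin, and adapter_model.bin are the ones named in the review, and the lookup order is an assumption.

```python
import os
from typing import Optional

# Hypothetical constant names; the values are the ones named in the review.
PAD_TOKEN_ID = 0
FULL_CHECKPOINT_NAME = "pytorch_model.bin"
ADAPTER_CHECKPOINT_NAME = "adapter_model.bin"


def find_checkpoint_file(resume_from_checkpoint: Optional[str] = None) -> Optional[str]:
    """Return the first known checkpoint file in the directory, or None."""
    # A single truthiness check covers both None and the empty string.
    if not resume_from_checkpoint:
        return None
    # Assumed order: try the full checkpoint first, then the adapter weights.
    for file_name in (FULL_CHECKPOINT_NAME, ADAPTER_CHECKPOINT_NAME):
        candidate = os.path.join(resume_from_checkpoint, file_name)
        if os.path.exists(candidate):
            return candidate
    return None
```

With a None default, callers that do not want to resume simply omit the argument, and the rest of the training code only has to test whether the returned path is None.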
Here are some potential issues and suggestions for improvement in the provided source code:
In the if condition on line 62, consider replacing sys.platform != "win32" with platform.system() != "Windows" from the platform module. Both checks work on Windows; platform.system() returns the operating system name directly, which reads more clearly.
You'll need to import platform at the beginning of your code:
    import platform
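As a small sketch of the two checks side by side (the helper name is_windows is hypothetical), the following shows that on common platforms they agree, so the change is a readability preference rather than a behavior change:

```python
import platform
import sys


def is_windows() -> bool:
    """Return True when running on a native Windows interpreter."""
    # platform.system() returns names such as "Windows", "Linux", or "Darwin".
    return platform.system() == "Windows"


# sys.platform reports "win32" on native Windows, including 64-bit builds.
checks_agree = is_windows() == (sys.platform == "win32")
```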
Consider adding docstrings for functions main, evaluate, and generate_prompt to provide explanations for their respective functionalities.
The try block on lines 19-21 uses a bare except with no exception type, so it can hide unrelated failures (a bare except also catches KeyboardInterrupt and SystemExit). Specify the exception you actually expect. For example:
    try:
        if torch.backends.mps.is_available():
            device = "mps"
    except AttributeError:
        # Older PyTorch builds have no torch.backends.mps attribute.
        pass
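The same idea can be wrapped in a small function so the fallback order is explicit. This is a sketch under assumptions: the function name select_device is hypothetical, the "cuda" and "cpu" branches are inferred from the device names used elsewhere in this review, and the torch module is passed in as a parameter purely so the logic can be exercised without PyTorch installed.

```python
def select_device(torch_module) -> str:
    """Pick "cuda", "mps", or "cpu" based on what the given torch module reports."""
    if torch_module.cuda.is_available():
        return "cuda"
    try:
        if torch_module.backends.mps.is_available():
            return "mps"
    except AttributeError:
        # Only the expected failure is handled: builds without backends.mps.
        pass
    return "cpu"
```

In the script itself this would be called as select_device(torch) after importing torch.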
Use more descriptive variable names for better code readability. For instance, change s to generated_sequence on line 98.
The code for creating a PeftModel has duplicate lines in each condition block based on the device. To reduce redundancy, you can wrap the entire process in a function and call it with different arguments depending on the device. Here's what the function might look like:
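A minimal sketch of such a helper, written to match the call sites shown in this review. Several details are assumptions rather than facts from the reviewed file: the base model class (LlamaForCausalLM), the device_map choices, and the keyword-only base_loader and adapter_loader parameters, which are hypothetical hooks added so the function can be tested without downloading a model.

```python
from typing import Any, Callable, Optional


def create_peft_model(
    device: str,
    base_model: str,
    lora_weights: str,
    torch_dtype: Optional[Any] = None,
    *,
    base_loader: Optional[Callable[..., Any]] = None,     # hypothetical test hook
    adapter_loader: Optional[Callable[..., Any]] = None,  # hypothetical test hook
    **base_kwargs: Any,
) -> Any:
    """Load a base model, then wrap it with LoRA weights for the given device."""
    # Import lazily so the heavy libraries are needed only when no hook is given.
    if base_loader is None:
        from transformers import LlamaForCausalLM  # assumed base model class
        base_loader = LlamaForCausalLM.from_pretrained
    if adapter_loader is None:
        from peft import PeftModel
        adapter_loader = PeftModel.from_pretrained

    # Pass torch_dtype only when the caller supplied one.
    shared = {} if torch_dtype is None else {"torch_dtype": torch_dtype}
    # Assumption: let CUDA place layers automatically, pin everything else.
    device_map = "auto" if device == "cuda" else {"": device}

    # Extra options such as load_in_8bit go to the base model loader only.
    model = base_loader(base_model, device_map=device_map, **shared, **base_kwargs)

    adapter_kwargs = dict(shared)
    if device != "cuda":
        adapter_kwargs["device_map"] = device_map
    return adapter_loader(model, lora_weights, **adapter_kwargs)
```

Keeping the device-specific differences in keyword arguments means each branch in main shrinks to a single call.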
Then use this function in the main function to create the model:
    if device == "cuda":
        model = create_peft_model(device, base_model, lora_weights, torch.float16, load_in_8bit=True)
    elif device == "mps":
        model = create_peft_model(device, base_model, lora_weights, torch.float16)
    else:
        model = create_peft_model(device, base_model, lora_weights, torch_dtype=None, low_cpu_mem_usage=True)
These are the major issues and suggestions that I see in the provided code. Implementing these changes should improve its readability, maintainability, and error handling.
This patch modifies the assert statement within the main function. The change is purely cosmetic, and the purpose of the statement remains the same: to ensure that a base_model value is provided when calling the function.
Before the patch, the assertion error message was wrapped in parentheses on separate lines. After the patch, the entire assert statement is wrapped in parentheses, keeping both the assertion condition and the error message together, which makes the code slightly more readable.
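One caveat worth checking against the actual diff, which is not reproduced in this comment: parentheses around only the condition are harmless, but if they enclose both the condition and the message, Python evaluates a two-element tuple, which is always truthy, so the assertion can never fail. A minimal sketch of the difference, using a hypothetical error message and hypothetical function names:

```python
MESSAGE = "Please specify a base_model"  # hypothetical message text


def require_base_model(base_model: str) -> str:
    """Safe form: only the condition is parenthesised; the message follows the comma."""
    assert (
        base_model
    ), MESSAGE
    return base_model


def tuple_is_truthy(base_model: str) -> bool:
    """Show what `assert (base_model, MESSAGE)` would test: a non-empty tuple."""
    return bool((base_model, MESSAGE))
```

If the patch produces the tuple form, the change is not purely cosmetic, because an empty base_model would no longer be rejected.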
Hello, I am a code review bot on flows.network. Here are my reviews of changed source code files in this PR.
finetune.py
generate.py
cc https://github.com/tloen/alpaca-lora/pull/396