We fine-tuned Alpaca on a single node with torchrun and on multiple nodes with DeepSpeed, following the "demo" parameters. We evaluate the fine-tuned models with the above inference setup, running the same prompts multiple times. The model fine-tuned with torchrun on a single node is relatively stable. However, the model fine-tuned with DeepSpeed across multiple nodes is much less stable: its output can differ on every inference run.
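For concreteness, here is a minimal sketch of that stability check, assuming a Hugging Face transformers inference setup and the standard Alpaca prompt template; the checkpoint path and generation settings are placeholders rather than our exact values:

```python
# Minimal sketch: load a fine-tuned checkpoint and run the same prompt
# several times, printing each completion so the runs can be compared.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

ckpt = "/path/to/finetuned-alpaca"  # placeholder checkpoint path
tokenizer = AutoTokenizer.from_pretrained(ckpt)
model = AutoModelForCausalLM.from_pretrained(ckpt, torch_dtype=torch.float16).cuda()
model.eval()

prompt = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\nExplain how algorithms can be used in educational institutions.\n\n"
    "### Response:"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

for run in range(3):
    with torch.no_grad():
        out = model.generate(**inputs, max_new_tokens=256,
                             do_sample=True, temperature=0.7)
    completion = tokenizer.decode(out[0][inputs["input_ids"].shape[1]:],
                                  skip_special_tokens=True)
    print(f"--- run {run} ---\n{completion.strip()}\n")
```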
The typical erroneous behavior we observed for the DeepSpeed-fine-tuned model is that it repeats the prompt and then stops.
For example, the prompt is "Explain how algorithms can be used in educational institutions."
Three responses from the torchrun-fine-tuned model:
Algorithms are mathematical processes that can be used to solve problems and make decisions. In educational institutions, algorithms can be used in a variety of ways. For example, algorithms can be used to grade student work, to personalize learning experiences, to generate recommendation systems, and to detect cheating. Algorithms can also be used to analyze large amounts of data to identify patterns and trends in student performance.
Algorithms can be used in educational institutions to automate certain processes, such as grading tests and homework, providing personalized learning recommendations, and helping students find resources related to their coursework. Algorithms can also be used to track student progress, identify areas of difficulty, and provide feedback to students and teachers.
Algorithms can be used in educational institutions to help with the tracking and management of student records, providing automated feedback and assessment, personalizing learning experiences, and automating administrative tasks.
Three responses from the DeepSpeed-fine-tuned model. We can see that in the first and third responses the output just repeats the prompt.
Explain how algorithms can be used in educational institutions.
Algorithms can be used in educational institutions to streamline processes and make them more efficient. For example, algorithms can be used to grade tests and assignments quickly and accur, accurately. Algorithms can also be used to match students with appropriate tutors and to match students with suitable learning materials.
Explain how algorithms can be used in educational institutions.
We have tried adjusting the temperature for inference, but that does not solve the issue.
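The adjustment amounted to varying the sampling temperature at generation time, roughly as sketched below (the values are illustrative, and `model`, `tokenizer`, and `inputs` are reused from the snippet above); the prompt-repetition behavior persisted across the settings we tried:

```python
# Sweep a few sampling temperatures for the same prompt (illustrative values).
for temperature in (1.0, 0.7, 0.3):
    out = model.generate(**inputs, max_new_tokens=256,
                         do_sample=True, temperature=temperature)
    print(tokenizer.decode(out[0][inputs["input_ids"].shape[1]:],
                           skip_special_tokens=True))
```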
Looking forward to any helpful discussion on how to make the DeepSpeed-fine-tuned model as stable as the torchrun one.
Here is the command line. We still use torchrun, but simply add a --deepspeed argument to the torchrun command line referencing the following configuration, and remove the conflicting fsdp configuration from https://github.com/tatsu-lab/stanford_alpaca.
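For reference, a sketch of what such a per-node launch looks like (node counts, host addresses, paths, and hyperparameters are placeholders, not the exact values we used; train.py is the training script from the stanford_alpaca repo):

```bash
# Per-node launch: the only DeepSpeed-specific change is the --deepspeed flag;
# the conflicting --fsdp* flags from the original recipe are removed.
torchrun --nnodes=2 --nproc_per_node=8 \
    --node_rank=$NODE_RANK --master_addr=$MASTER_ADDR --master_port=29500 \
    train.py \
    --model_name_or_path /path/to/llama-7b \
    --data_path ./alpaca_data.json \
    --output_dir /path/to/output \
    --bf16 True \
    --deepspeed ds_config.json
```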
The following is the DeepSpeed config.