ZGC-LLM-Safety / TrafficLLM

The repository of TrafficLLM, a universal LLM adaptation framework that learns robust traffic representations for all open-sourced LLMs in real-world scenarios and enhances generalization across diverse traffic analysis tasks.

Confusion about the two-stage fine-tuning process #10

Open TracyTd opened 3 weeks ago

TracyTd commented 3 weeks ago

Dear authors, Hello! I have a question regarding the two-stage fine-tuning process described in your work. Could you kindly help me understand how the two stages are connected during training? Specifically, I'm wondering whether it's necessary to add the --ptuning_checkpoint parameter to the second-stage training command. Thank you very much for your time.

CuiTianyu961030 commented 3 weeks ago

Thanks for your attention to our work!

In this work, the two-stage fine-tuning pipeline mainly aims to help the LLM overcome the difficulty of multimodal learning between text and traffic data across different tasks.

In our measurements, the LLM struggles to learn multi-type semantics and traffic features in the same stage, leading to poor detection performance (10.2% average accuracy across three traffic detection tasks in our experiments).

In the first stage of TrafficLLM's dual-stage tuning, we introduce natural-language instruction tuning to inject professional task-description text from the cybersecurity field into the LLM. In the second stage, we force TrafficLLM to learn the traffic patterns of the downstream tasks from traffic data. The two stages build two types of PEFT models (an NLP model and downstream-task models) for inference.
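A rough sketch of what the two stages could look like in code, using HuggingFace PEFT prefix-tuning as a stand-in for the repo's own P-tuning scripts. The base-model path, dataset files, and the `instruction`/`output` fields are hypothetical placeholders; the key point is that each stage trains its own adapter over the same frozen base model:

```python
# Illustrative stand-in for the two-stage tuning: each stage trains its own
# PEFT adapter over the same frozen base LLM. Paths and field names are
# hypothetical; the repo's actual scripts use ChatGLM-style P-tuning.
from datasets import load_dataset
from peft import PrefixTuningConfig, TaskType, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

BASE = "path/to/base-llm"  # placeholder for the open-sourced base LLM
tokenizer = AutoTokenizer.from_pretrained(BASE, trust_remote_code=True)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token


def train_adapter(train_file: str, output_dir: str) -> None:
    """Train one prefix-tuning adapter on one instruction/traffic dataset."""
    model = AutoModelForCausalLM.from_pretrained(BASE, trust_remote_code=True)
    peft_cfg = PrefixTuningConfig(task_type=TaskType.CAUSAL_LM,
                                  num_virtual_tokens=128)
    model = get_peft_model(model, peft_cfg)  # base weights stay frozen

    data = load_dataset("json", data_files=train_file)["train"]
    data = data.map(lambda ex: tokenizer(ex["instruction"] + ex["output"],
                                         truncation=True, max_length=1024),
                    remove_columns=data.column_names)

    Trainer(
        model=model,
        args=TrainingArguments(output_dir=output_dir, num_train_epochs=1,
                               per_device_train_batch_size=1),
        train_dataset=data,
        data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
    ).train()
    model.save_pretrained(output_dir)  # only the adapter weights are saved


# Stage 1: natural-language instruction tuning -> the "NLP" adapter.
train_adapter("nlp_instructions.json", "checkpoints/nlp")
# Stage 2: traffic-data tuning -> one adapter per downstream task.
train_adapter("malware_traffic_detection.json",
              "checkpoints/malware_traffic_detection")
```

Because each stage produces an independent adapter checkpoint, the second-stage run does not have to resume from the first-stage checkpoint; the two adapters are combined only at inference time.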

After obtaining the PEFT models, TrafficLLM uses the NLP model to predict the downstream task name and calls the corresponding task-specific model for downstream traffic-pattern reasoning, as sketched below.
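A minimal sketch of that inference flow, assuming prefix-tuning adapters loaded with HuggingFace PEFT; the checkpoint directories and task names are hypothetical:

```python
# Hedged sketch of the two-step inference: the first-stage "NLP" adapter routes
# the request to a task name, then the matching second-stage adapter performs
# the actual traffic reasoning. Checkpoint paths and task names are placeholders.
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

BASE = "path/to/base-llm"
TASK_ADAPTERS = {  # second-stage (downstream-task) adapters
    "malware_traffic_detection": "checkpoints/malware_traffic_detection",
    "botnet_detection": "checkpoints/botnet_detection",
    "web_attack_detection": "checkpoints/web_attack_detection",
}

tokenizer = AutoTokenizer.from_pretrained(BASE, trust_remote_code=True)
base = AutoModelForCausalLM.from_pretrained(BASE, trust_remote_code=True)


def generate(model, prompt: str) -> str:
    inputs = tokenizer(prompt, return_tensors="pt")
    out = model.generate(**inputs, max_new_tokens=64)
    return tokenizer.decode(out[0][inputs["input_ids"].shape[1]:],
                            skip_special_tokens=True).strip()


def answer(instruction: str, traffic: str) -> str:
    # Step 1: the NLP adapter predicts the downstream task name.
    router = PeftModel.from_pretrained(base, "checkpoints/nlp")
    task = generate(router, instruction)

    # Step 2: the matching task-specific adapter reasons over the traffic data.
    expert = PeftModel.from_pretrained(base, TASK_ADAPTERS[task])
    return generate(expert, instruction + "\n" + traffic)
```

In a real deployment the adapters would of course be loaded once and cached rather than re-loaded per request.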

You may be confused about why the evaluation step takes only one --ptuning_path as input. During evaluation, we only measure the accuracy of traffic detection across the different downstream tasks, not the NLP capability. Therefore, you only need to configure the second-stage PEFT model in --ptuning_path for each downstream task.
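To make that concrete, evaluation could be driven per task with only the second-stage adapter passed as the PEFT path. The script name, test-file flag, and paths below are hypothetical placeholders; only --ptuning_path mirrors the flag discussed above:

```python
# Illustration only: score each downstream task with its own second-stage
# adapter; the first-stage NLP adapter is not involved in evaluation.
# "evaluation.py", "--test_file", and the paths are hypothetical placeholders.
import subprocess

SECOND_STAGE = {
    "malware_traffic_detection": "checkpoints/malware_traffic_detection",
    "botnet_detection": "checkpoints/botnet_detection",
    "web_attack_detection": "checkpoints/web_attack_detection",
}

for task, adapter_dir in SECOND_STAGE.items():
    subprocess.run(
        ["python", "evaluation.py",
         "--test_file", f"data/{task}_test.json",
         "--ptuning_path", adapter_dir],
        check=True,
    )
```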

TracyTd commented 3 weeks ago

Thank you very much!