chunhualiao opened 2 months ago
We need two more tables similar to Tables 1 and 2: fine-tune the instruct models (model C) using our paired dataset to obtain model D.
The two new tables will show model D's improvements over model C (the instruct models).
When using instruct models fine-tuned with our data, the prompt asking for translation can include more instructions, e.g., to enclose the output code in markdown syntax and to avoid natural-language explanations. This should make the postprocessing easier.
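To illustrate why fenced output simplifies postprocessing, here is a minimal sketch (the helper name `extract_code_blocks` and the sample reply are hypothetical, not part of our pipeline): if the model reliably wraps the translated code in ``` fences, the code can be recovered with one regular expression instead of heuristic cleanup.

```python
import re

def extract_code_blocks(text: str) -> list[str]:
    """Return the contents of fenced Markdown code blocks in model output.

    Hypothetical postprocessing helper: assumes the instruct model was
    prompted to enclose translated code in ``` fences with no extra prose
    inside the fences.
    """
    # Match ```lang\n ... ``` (language tag optional), non-greedy body.
    pattern = re.compile(r"```[a-zA-Z0-9_+-]*\n(.*?)```", re.DOTALL)
    return [m.group(1).rstrip("\n") for m in pattern.finditer(text)]

if __name__ == "__main__":
    # Example model reply mixing explanation and fenced code.
    reply = (
        "Here is the translated code:\n"
        "```fortran\n"
        "program hello\n"
        "  print *, 'Hello'\n"
        "end program hello\n"
        "```\n"
        "This prints a greeting."
    )
    for block in extract_code_blocks(reply):
        print(block)
```

Without the fencing instruction, the postprocessor instead has to guess where prose ends and code begins, which is exactly the brittle step this prompt change would remove.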
Please do the experiments and add the two tables into the paper. Thanks. @bin123apple
I think the argument that "fine-tuning instruct models will get worse translation performance" is settled: your existing experiments show that "deepseek-coder-33b-instruct shows the greatest improvement". Please confirm this so we do not need to debate it anymore. @bin123apple
Three models
And in the Overleaf, I compared the performance of these instruct-tuned models (model C) with the model fine-tuned on our dataset (model B) in the previous version.