bin123apple / Fortran2Cpp

Fortran2Cpp: a new model designed for code translation between Fortran and C++
Apache License 2.0

v1 dataset: for code pairs extracted, compile and run all, collect statistics #53

Open chunhualiao opened 3 months ago

chunhualiao commented 3 months ago

2.5k pairs: compile and run all of them.

For conversational dialogue datasets, only the last code pair matters.
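A minimal sketch of what "compile and run all, collect statistics" could look like. The directory layout (one directory per pair holding `code.f90` and `code.cpp`) and all function names here are assumptions for illustration, not the repository's actual layout; only `gfortran`, `g++`, and their `-o` flag are real.

```python
import subprocess
from collections import Counter
from pathlib import Path


def run_ok(cmd, cwd, timeout=30):
    """Return True if the command exits with status 0 within the timeout."""
    try:
        proc = subprocess.run(cmd, cwd=cwd, capture_output=True, timeout=timeout)
    except (OSError, subprocess.TimeoutExpired):
        # Missing compiler/binary or a hang both count as failure.
        return False
    return proc.returncode == 0


def check_source(src, compiler):
    """Compile one source file, then run the binary; return a status label."""
    src = Path(src)
    exe = src.with_suffix(".out")
    if not run_ok([compiler, src.name, "-o", exe.name], cwd=src.parent):
        return "compile_failed"
    if not run_ok([str(exe.resolve())], cwd=src.parent):
        return "run_failed"
    return "ok"


def collect_stats(pair_dirs, check=check_source):
    """Tally per-language outcomes over all pair directories.

    Hypothetical layout: each directory holds code.f90 and code.cpp.
    """
    stats = Counter()
    for d in pair_dirs:
        d = Path(d)
        stats["fortran_" + check(d / "code.f90", "gfortran")] += 1
        stats["cpp_" + check(d / "code.cpp", "g++")] += 1
    return stats
```

The `check` parameter is injectable so the tally logic can be exercised without the compilers installed.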

bin123apple commented 2 months ago

Remove the .mod and .o files.
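One possible way to do this cleanup, sketched with the standard library only. The function name and the idea of walking a single dataset root are assumptions; adjust the root to wherever the compiled artifacts actually land.

```python
from pathlib import Path


def remove_build_artifacts(root, suffixes=(".mod", ".o")):
    """Delete files under root whose suffix is in suffixes; return what was removed."""
    removed = []
    # Materialize the listing first so deletion does not disturb the traversal.
    for path in list(Path(root).rglob("*")):
        if path.is_file() and path.suffix in suffixes:
            path.unlink()
            removed.append(path)
    return removed
```

Returning the list of removed paths makes it easy to log how many artifacts were cleaned per run.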

chunhualiao commented 2 months ago

Many C++ files cannot be compiled, even though the pipeline is designed to ensure that they compile and run.

bin123apple commented 2 months ago

It is highly likely that the bug is caused by this line: https://github.com/bin123apple/Fortran2Cpp/blob/main/dataset_generation/extract_final_data_pairs.py#L14, because the correct cpp/fortran code may not always appear at the end of the dialogue.

Cause: sometimes, while the pipeline is trying to compare the uncompilable cpp code with the compilable fortran code, the cpp code comes out incomplete again.
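A hedged sketch of one alternative to always taking the final code block: walk the dialogue backwards and keep the most recent block that passes a check. It assumes the dialogue is a list of message strings with Markdown-style fenced code, which may not match the real data format, and `passes` is a hypothetical callback (for example, a compile-and-run check). This is not what `extract_final_data_pairs.py` currently does.

```python
import re

FENCE = "`" * 3  # built programmatically so this example can itself sit inside a fence
CODE_BLOCK = re.compile(FENCE + r"(\w+)?[ \t]*\n(.*?)" + FENCE, re.DOTALL)


def code_blocks(message, langs):
    """Return the bodies of fenced blocks whose language tag is in langs."""
    return [
        body
        for tag, body in CODE_BLOCK.findall(message)
        if (tag or "").lower() in langs
    ]


def last_passing(messages, langs, passes):
    """Return the most recent code block accepted by passes(), or None."""
    for message in reversed(messages):
        for body in reversed(code_blocks(message, langs)):
            if passes(body):
                return body
    return None
```

With this approach a dialogue whose last turn contains truncated C++ would fall back to the latest earlier block that still passes, rather than emitting the broken one.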

bin123apple commented 2 months ago

  1. Record the output code that causes the problem.

  2. Trace back: recall the evidence at each step.

    • Help us understand why wrong code ends up at the end of the dialogue.
    • And why do you need to extract code from dialogues that end with wrong code?

In your pipeline, at which step do both the cpp and fortran codes pass compilation and execution?
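The questions above could be answered from a per-step trace. The sketch below assumes a hypothetical `StepRecord` logged after every pipeline step; none of these names come from the repository.

```python
from dataclasses import dataclass


@dataclass
class StepRecord:
    """Hypothetical per-step log entry for one code pair."""
    step: int
    cpp_ok: bool       # cpp compiled and ran at this step
    fortran_ok: bool   # fortran compiled and ran at this step
    cpp_code: str = ""
    fortran_code: str = ""


def passing_steps(trace):
    """Steps at which both languages passed compilation and execution."""
    return [r.step for r in trace if r.cpp_ok and r.fortran_ok]


def regressed(trace):
    """True if some step passed for both languages but the final step does not."""
    good = passing_steps(trace)
    return bool(good) and trace[-1].step not in good
```

A pair for which `regressed` is true is exactly the case described in this thread: a correct pair existed mid-dialogue, but the dialogue ended with wrong code.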

bin123apple commented 2 months ago

See https://github.com/bin123apple/Fortran2Cpp/blob/main/data/example.json