acherm / progvary-withgpt

Programming variability with GPT-based assistant

Python generator for C programs #4

Open acherm opened 1 year ago

acherm commented 1 year ago

This issue can be seen as a follow-up of #1: it explores an end-to-end scenario that mixes several concerns (conditional compilation, domain knowledge, generator). Here we propose an independent programming task: writing a generator in Python that compiles and executes the 3 variants of #1 #2 #3 and stores the results.

There are two variations:

acherm commented 1 year ago

commit: 9d4dec5 gpt: ChatGPT4

acherm commented 1 year ago

You:

Here is a configurable C program representing 3 possible variants:

#include <stdio.h>

// Define the type to use as an alias
#ifdef USE_DOUBLE
typedef double real;
#elif defined(USE_FLOAT)
typedef float real;
#else
typedef long double real;
#endif

int main() {
    // Declare and initialize variables
    real x = 1.0;
    real y = 0x1p-53 + 0x1p-64;
    real z = x + y;

    // Print the value of z
#ifdef USE_DOUBLE
    printf("z (using double) = %a\n", z);
#elif defined(USE_FLOAT)
    printf("z (using float) = %a\n", z);
#else
    printf("z (using long double) = %La\n", z);
#endif

    return 0;
}

Write a Python script that compiles/executes the three variants above, with and without the -mfpmath=387 of gcc... Compile also with clang. Synthesize all results in a CSV containing the configuration (flag, with or without -mfpmath=387, compiler) and the different outputs

ChatGPT:

To execute the given C program with different configurations, you can use the Python subprocess module to compile and run the code with different options. Here's a Python script that does that:

import subprocess
import csv

# C program source code
c_code = '''
#include <stdio.h>

// Define the type to use as an alias
#ifdef USE_DOUBLE
    typedef double real;
#elif defined(USE_FLOAT)
    typedef float real;
#else
    typedef long double real;
#endif

int main() {
    // Declare and initialize variables
    real x = 1.0;
    real y = 0x1p-53 + 0x1p-64;
    real z = x + y;

    // Print the value of z
    #ifdef USE_DOUBLE
        printf("z (using double) = %a\\n", z);
    #elif defined(USE_FLOAT)
        printf("z (using float) = %a\\n", z);
    #else
        printf("z (using long double) = %La\\n", z);
    #endif

    return 0;
}
'''

# Create a temporary C file
with open('temp.c', 'w') as f:
    f.write(c_code)

# Compile and run configurations
compilers = ['gcc', 'clang']
flag_options = ['', '-mfpmath=387']
type_options = ['USE_DOUBLE', 'USE_FLOAT', '']

csv_data = [['flag', 'mfpmath', 'compiler', 'output']]

for compiler in compilers:
    for flag in type_options:
        for mfpmath in flag_options:
            compile_cmd = f"{compiler} -o temp {'' if mfpmath == '' else '-mfpmath=387'} {'-D' + flag if flag != '' else ''} temp.c"
            run_cmd = "./temp"

            # Compile the C program with the specific configuration
            subprocess.run(compile_cmd, shell=True, check=True)

            # Run the compiled program and capture the output
            result = subprocess.run(run_cmd, shell=True, check=True, capture_output=True, text=True)

            # Append the result to the csv_data
            csv_data.append([flag, mfpmath, compiler, result.stdout.strip()])

# Remove the temporary C file and executable
subprocess.run("rm temp.c temp", shell=True, check=True)

# Write the results to a CSV file
with open('results.csv', 'w', newline='') as csvfile:
    csv_writer = csv.writer(csvfile)
    csv_writer.writerows(csv_data)

print("Results saved in results.csv")

This script will compile and run the provided C program with different configurations (gcc/clang, with/without -mfpmath=387, and the three type options), capture the output, and save the results in a CSV file named results.csv.

Review and anti-pattern

-mfpmath=387 is used with clang, whereas it should only be used with gcc. One can argue that the prompt is ambiguous on this point. It also seems a bit ambitious to expect the assistant to "guess", based on domain knowledge, that -mfpmath=387 should not be used with clang (though it could be an interesting task, see #5).
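To make the anti-pattern concrete, here is a minimal sketch that enumerates the same cross-product as the three nested loops of the generated script, without compiling anything. The helper build_command is hypothetical (it is not part of the generated script) and builds an argument list rather than a shell string; the option lists are copied from the script.

```python
from itertools import product

# Option lists copied from the generated script
compilers = ["gcc", "clang"]
type_options = ["USE_DOUBLE", "USE_FLOAT", ""]
flag_options = ["", "-mfpmath=387"]


def build_command(compiler, type_flag, mfpmath):
    """Hypothetical helper: build the compile command as an argument list."""
    cmd = [compiler, "-o", "temp"]
    if mfpmath:
        cmd.append(mfpmath)
    if type_flag:
        cmd.append("-D" + type_flag)
    cmd.append("temp.c")
    return cmd


# Same iteration order as the nested loops: compiler, then type, then mfpmath
all_commands = [
    build_command(c, t, m)
    for c, t, m in product(compilers, type_options, flag_options)
]

# Commands that pass -mfpmath=387 to clang, which the prompt did not intend
offending = [
    cmd for cmd in all_commands
    if cmd[0] == "clang" and "-mfpmath=387" in cmd
]

print(len(all_commands))  # 12
print(len(offending))     # 3
for cmd in offending:
    print(" ".join(cmd))
```

Out of the 12 commands the script issues, 3 combine clang with -mfpmath=387, one per type variant.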

acherm commented 1 year ago

This is a variation of the above: we are now explicit about -mfpmath=387 with clang...

You:

Here is a configurable C program representing 3 possible variants:

#include <stdio.h>

// Define the type to use as an alias
#ifdef USE_DOUBLE
typedef double real;
#elif defined(USE_FLOAT)
typedef float real;
#else
typedef long double real;
#endif

int main() {
    // Declare and initialize variables
    real x = 1.0;
    real y = 0x1p-53 + 0x1p-64;
    real z = x + y;

    // Print the value of z
#ifdef USE_DOUBLE
    printf("z (using double) = %a\n", z);
#elif defined(USE_FLOAT)
    printf("z (using float) = %a\n", z);
#else
    printf("z (using long double) = %La\n", z);
#endif

    return 0;
}

Write a Python script that compiles/executes the three variants above, with and without the -mfpmath=387 of gcc... Compile also with clang (without -mfpmath=387 though). Synthesize all results in a CSV containing the configuration (flag, with or without -mfpmath=387, compiler) and the different outputs

ChatGPT:

To create a Python script that compiles and executes the three variants of the provided C program, you can use the subprocess module to run the commands. Then, you can store the results in a CSV file using the csv module.

import subprocess
import csv

c_program = r'''
#include <stdio.h>

// Define the type to use as an alias
#ifdef USE_DOUBLE
    typedef double real;
#elif defined(USE_FLOAT)
    typedef float real;
#else
    typedef long double real;
#endif

int main() {
    // Declare and initialize variables
    real x = 1.0;
    real y = 0x1p-53 + 0x1p-64;
    real z = x + y;

    // Print the value of z
    #ifdef USE_DOUBLE
        printf("z (using double) = %a\n", z);
    #elif defined(USE_FLOAT)
        printf("z (using float) = %a\n", z);
    #else
        printf("z (using long double) = %La\n", z);
    #endif

    return 0;
}
'''

with open('variants.c', 'w') as f:
    f.write(c_program)

configs = [
    ('USE_DOUBLE', 'gcc', '-mfpmath=387'),
    ('USE_DOUBLE', 'gcc', ''),
    ('USE_FLOAT', 'gcc', '-mfpmath=387'),
    ('USE_FLOAT', 'gcc', ''),
    ('USE_LONG_DOUBLE', 'gcc', '-mfpmath=387'),
    ('USE_LONG_DOUBLE', 'gcc', ''),
    ('USE_DOUBLE', 'clang', ''),
    ('USE_FLOAT', 'clang', ''),
    ('USE_LONG_DOUBLE', 'clang', ''),
]

results = []

for config in configs:
    flag, compiler, mfpmath = config
    if compiler == 'clang' and mfpmath == '-mfpmath=387':
        continue
    cmd_compile = f'{compiler} -o variants {mfpmath} -D{flag} variants.c'
    cmd_run = './variants'
    subprocess.run(cmd_compile.split())
    output = subprocess.check_output(cmd_run.split()).decode('utf-8').strip()
    results.append((flag, mfpmath, compiler, output))

with open('results.csv', 'w', newline='') as csvfile:
    fieldnames = ['flag', 'mfpmath', 'compiler', 'output']
    writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
    writer.writeheader()
    for result in results:
        writer.writerow({'flag': result[0], 'mfpmath': result[1], 'compiler': result[2], 'output': result[3]})

This script writes the C program to a file, compiles and runs it with various configurations, and stores the results in a CSV file. Note that the -mfpmath=387 flag is not used with the clang compiler, so it's skipped in the loop.

Review and anti-pattern

It is quite remarkable that this constraint (clang precludes the flag -mfpmath=387) is now understood, and even guarded with defensive programming: if compiler == 'clang' and mfpmath == '-mfpmath=387': Note that the guard is redundant here, since the configs list already contains no such combination. It is also typically something that should be enforced in a variability model (as a constraint).
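As an illustration of what "enforced in a variability model" could look like, here is a minimal sketch in which the constraint is declared once, next to the features, instead of being buried in the loop. The names FEATURES, CONSTRAINTS and valid_configurations are hypothetical and not taken from the generated script; a real variability model would more likely rely on a feature-model tool or a solver.

```python
from itertools import product

# Hypothetical, minimal variability model: each feature and its allowed values
FEATURES = {
    "compiler": ["gcc", "clang"],
    "mfpmath": ["", "-mfpmath=387"],
    "type": ["USE_DOUBLE", "USE_FLOAT", "USE_LONG_DOUBLE"],
}

# Cross-tree constraints: predicates that every valid configuration must satisfy
CONSTRAINTS = [
    # clang precludes -mfpmath=387
    lambda cfg: not (cfg["compiler"] == "clang" and cfg["mfpmath"] == "-mfpmath=387"),
]


def valid_configurations(features=FEATURES, constraints=CONSTRAINTS):
    """Yield every combination of feature values that satisfies all constraints."""
    names = list(features)
    for values in product(*(features[name] for name in names)):
        cfg = dict(zip(names, values))
        if all(check(cfg) for check in constraints):
            yield cfg


configs = list(valid_configurations())
print(len(configs))  # 9: six for gcc, three for clang
```

The 9 configurations obtained this way match the hand-written configs list of the generated script, but the constraint is now explicit and reusable.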

acherm commented 1 year ago

This is a variation of the above: we are explicit about -mfpmath=387 with clang... and we use pandas to store the results.

You:

Here is a configurable C program representing 3 possible variants:

#include <stdio.h>

// Define the type to use as an alias
#ifdef USE_DOUBLE
typedef double real;
#elif defined(USE_FLOAT)
typedef float real;
#else
typedef long double real;
#endif

int main() {
    // Declare and initialize variables
    real x = 1.0;
    real y = 0x1p-53 + 0x1p-64;
    real z = x + y;

    // Print the value of z
#ifdef USE_DOUBLE
    printf("z (using double) = %a\n", z);
#elif defined(USE_FLOAT)
    printf("z (using float) = %a\n", z);
#else
    printf("z (using long double) = %La\n", z);
#endif

    return 0;
}

Write a Python script that compiles/executes the three variants above, with and without the -mfpmath=387 of gcc... Compile also with clang (without -mfpmath=387 though). Synthesize all results in a CSV containing the configuration (flag, with or without -mfpmath=387, compiler) and the different outputs. Please use pandas library for writing the CSV... also the fieldnames of the CSV should be compiler, flags, type, and result.

ChatGPT:

Here's a Python script that compiles and executes the three variants of the C program using both GCC and Clang, with and without the -mfpmath=387 flag. The script then synthesizes the results into a CSV file using the pandas library.

First, save the C code in a file named main.c.

import os
import subprocess
import pandas as pd

def compile_and_run(compiler, flags, type_define):
    output_file = "main.exe"
    compile_command = f"{compiler} {flags} -D{type_define} -o {output_file} main.c"
    os.system(compile_command)

    output = subprocess.check_output(output_file).decode("utf-8")
    result = output.split(" = ")[-1].strip()

    return {
        "compiler": compiler,
        "flags": flags,
        "type": type_define,
        "result": result
    }

def main():
    variants = ["USE_DOUBLE", "USE_FLOAT", "USE_LONG_DOUBLE"]
    compilers = [
        {"name": "gcc", "flags": ["-mfpmath=387", ""]},
        {"name": "clang", "flags": [""]}
    ]

    results = []

    for compiler in compilers:
        for flags in compiler["flags"]:
            for variant in variants:
                result = compile_and_run(compiler["name"], flags, variant)
                results.append(result)

    df = pd.DataFrame(results)
    df.to_csv("results.csv", index=False)

if __name__ == "__main__":
    main()

This script creates a CSV file named results.csv containing the results for each configuration. The CSV will have the following columns: "compiler", "flags", "type", and "result".

You:

  File "/home/mathieuacher/cursor-tutor/genpython4c/genc3.py", line 67, in main
    result = compile_and_run(compiler["name"], flags, variant)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/mathieuacher/cursor-tutor/genpython4c/genc3.py", line 45, in compile_and_run
    output = subprocess.check_output(output_file).decode("utf-8")
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib64/python3.11/subprocess.py", line 466, in check_output
    return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib64/python3.11/subprocess.py", line 548, in run
    with Popen(*popenargs, **kwargs) as process:
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib64/python3.11/subprocess.py", line 1024, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "/usr/lib64/python3.11/subprocess.py", line 1901, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: 'main'

ChatGPT:

I apologize for the confusion. It seems like the script is unable to find the main executable after compilation. This could be due to a couple of reasons. Let's update the script to ensure the correct working directory and make the executable file compatible across different platforms.

import os
import subprocess
import pandas as pd
import platform

def compile_and_run(compiler, flags, type_define):
    output_file = "main.exe" if platform.system() == "Windows" else "./main.out"
    compile_command = f"{compiler} {flags} -D{type_define} -o {output_file} main.c"
    os.system(compile_command)

    output = subprocess.check_output(output_file, shell=True).decode("utf-8")
    result = output.split(" = ")[-1].strip()

    return {
        "compiler": compiler,
        "flags": flags,
        "type": type_define,
        "result": result
    }

def main():
    variants = ["USE_DOUBLE", "USE_FLOAT", "USE_LONG_DOUBLE"]
    compilers = [
        {"name": "gcc", "flags": ["-mfpmath=387", ""]},
        {"name": "clang", "flags": [""]}
    ]

    results = []

    for compiler in compilers:
        for flags in compiler["flags"]:
            for variant in variants:
                result = compile_and_run(compiler["name"], flags, variant)
                results.append(result)

    df = pd.DataFrame(results)
    df.to_csv("results.csv", index=False)

if __name__ == "__main__":
    main()

This updated script should work correctly. If the issue persists, please let me know.

Review and anti-pattern

The first version did not work: it took the liberty of targeting Windows (main.exe) and failed on Linux. Once given the error message, the assistant proposed a relevant fix: output_file = "main.exe" if platform.system() == "Windows" else "./main.out" The fix not only resolves the error but also generalizes to other platforms. One can consider it unfair to penalize the assistant here; the prompt should certainly be more specific about the platform.
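A likely explanation for the FileNotFoundError, stated here as an assumption rather than something verified in this issue: on POSIX systems, subprocess looks up a bare program name on PATH and not in the current directory, so the executable is only found once its name carries an explicit ./ prefix. The sketch below isolates the platform-dependent choice made by the fix in a small function; executable_path and its system parameter are hypothetical names introduced for illustration.

```python
import platform


def executable_path(stem, system=None):
    """Return a name that subprocess can launch from the current directory.

    `system` defaults to platform.system(); it is a parameter only so that
    both branches can be exercised on any machine.
    """
    if system is None:
        system = platform.system()
    if system == "Windows":
        return stem + ".exe"
    # Assumption: on POSIX a bare name is searched on PATH, so anchor it to "./"
    return "./" + stem + ".out"


print(executable_path("main", "Linux"))    # ./main.out
print(executable_path("main", "Windows"))  # main.exe
```

Passing the result both to the compiler's -o option and to subprocess.check_output keeps the two names in sync, which is what the fixed script does with output_file.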