Glavin001 / PeakProgrammer

Mastering coding precision with fine-tuned reinforcement learning
MIT License

Code Execution Tracing #30

Open Glavin001 opened 10 months ago

Glavin001 commented 10 months ago

Code traces:

Hand-simulating the execution of your code in order to manually verify that it works correctly before you compile it.

Hypothesis

Training on code execution traces should help the model learn to interpret, and internally compute, the behaviour of code.

Examples

For example: https://chat.openai.com/share/4fabadcf-eb3a-4277-8c40-43ea90556bad
This isn't necessarily the best format, but it does include a snapshot of the variables after each statement executes.

Another example: https://pymotw.com/2/trace/

Traces like these could be generated automatically: given some Python code or functions, instrument them with tracing, run the code, and save the output as a dataset.

I expect that being able to internally represent and execute code within the "mind" (the NN weights) is essential for human/expert-level code generation.
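As a rough illustration of the "instrument, run, save" idea, here is a minimal sketch using the standard library's `sys.settrace`. The names `trace_function` and `running_total` and the record layout are assumptions made up for this example, not an agreed dataset format. Note that a `line` event fires just before a line runs, so each snapshot shows the state left behind by the previous statement; the final `return` record holds the end state.

```python
import json
import sys


def trace_function(func, *args, **kwargs):
    """Run func and record a snapshot of its locals at every line event.

    Hypothetical helper for illustration; values are stored as repr()
    strings so each record is frozen in time and JSON-serializable.
    """
    records = []
    code = func.__code__

    def tracer(frame, event, arg):
        if frame.f_code is not code:
            return None  # do not trace into other functions
        snapshot = {name: repr(value) for name, value in frame.f_locals.items()}
        if event == "line":
            # Line number relative to the "def" line of the traced function.
            offset = frame.f_lineno - code.co_firstlineno
            records.append({"event": "line", "line": offset, "locals": snapshot})
        elif event == "return":
            records.append({"event": "return", "value": repr(arg), "locals": snapshot})
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        result = func(*args, **kwargs)
    finally:
        sys.settrace(previous)  # always restore the earlier tracer
    return result, records


def running_total(values):
    total = 0
    for v in values:
        total += v
    return total


if __name__ == "__main__":
    _, trace = trace_function(running_total, [1, 2])
    for record in trace:
        print(json.dumps(record))  # one JSON object per line, e.g. for a JSONL dataset
```

Pairing each traced function's source with its records would give one candidate training example per call; which format is best for training is still an open question.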

From @vikp: https://discord.com/channels/1131084849432768614/1131669182124138616/1150170925707313193

I think something closer to the first format could be interesting, @Glavin001. I don't think it would fit into the current plans for the MoE (which is more instruct-tuned), but it would be interesting to test training an LLM on a large-scale set of that kind of reasoning data. There are a few similar datasets, but none that I've seen do a full trace or line-by-line explanation. According to the "Textbooks Are All You Need" paper, code explanations and exercises can be good training data, too.