Improve MLflow logging for multiple-prompt evaluation:
Currently, the metrics for code-to-documentation evaluation overwrite the values of previous prompt scores in the loop.
The task is to ensure the metrics are not overwritten and follow a naming scheme that ties each score back to its prompt and code.
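One way to avoid the overwrite is to namespace each metric key with the prompt it belongs to, so every iteration of the loop logs under a distinct name. The sketch below assumes this approach; `metric_key` and `evaluate_code_to_doc` are hypothetical names, and the MLflow calls (`mlflow.log_metric`) are shown in comments since the actual evaluation loop is not in this description.

```python
def metric_key(prompt_id: str, name: str) -> str:
    """Build a per-prompt metric name so scores are not overwritten
    across loop iterations (e.g. 'bleu/prompt_0', 'bleu/prompt_1')."""
    return f"{name}/prompt_{prompt_id}"

# Sketch of the evaluation loop (assumes an MLflow run is active):
# for i, prompt in enumerate(prompts):
#     scores = evaluate_code_to_doc(prompt, code)   # hypothetical evaluator
#     for name, value in scores.items():
#         mlflow.log_metric(metric_key(str(i), name), value)
```

An alternative is to open a nested MLflow run per prompt (`mlflow.start_run(nested=True)`), which keeps metric names unchanged but separates scores by run; the naming scheme above keeps everything in a single run for easier side-by-side comparison.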