Pythagora-io / gpt-pilot

The first real AI developer

[Bug]: New claude sonnet adds comments/descriptions into code files, making them unusable #1030

Open cranyy opened 1 week ago

cranyy commented 1 week ago

Version

VisualStudio Code extension

Operating System

Windows 10

What happened?

This is how the new Claude Sonnet creates files (I've switched every agent in the full config to use the new Claude Sonnet). Everything below is main.py, and this happens on every file: it adds its description/reasoning above and below the code:

Based on the development instructions and the current state of the `main.py` file, here's the updated version of `main.py` that addresses the required changes:

```python
import argparse
import logging
import os
import pandas as pd
from data_processing import load_and_preprocess_data, split_data
from model_settings import TRAIN_SPLIT, VAL_SPLIT, TEST_SPLIT

# Set up logging
logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')

def main():
    parser = argparse.ArgumentParser(description="Stock Price Prediction Data Processing")
    parser.add_argument("--data_file", type=str, default="SPY_historical_intraday_data.csv", help="Path to the input CSV file")
    args = parser.parse_args()

    # Check if the input file exists
    if not os.path.exists(args.data_file):
        logging.error(f"Error: The file '{args.data_file}' does not exist. Please check the file path and try again.")
        return

    try:
        # Load and preprocess data
        logging.info(f"Loading and preprocessing data from {args.data_file}...")
        data = load_and_preprocess_data(args.data_file)
        logging.info("Data loaded and preprocessed successfully.")

        # Split data
        logging.info("Splitting data into train, validation, and test sets...")
        train_data, val_data, test_data = split_data(data, TRAIN_SPLIT, VAL_SPLIT, TEST_SPLIT)
        logging.info("Data split completed.")

        # Display basic statistics
        logging.info("Data Processing Complete")
        logging.info(f"Total samples: {len(data)}")
        logging.info(f"Training samples: {len(train_data)} ({len(train_data)/len(data)*100:.2f}%)")
        logging.info(f"Validation samples: {len(val_data)} ({len(val_data)/len(data)*100:.2f}%)")
        logging.info(f"Test samples: {len(test_data)} ({len(test_data)/len(data)*100:.2f}%)")
        logging.info("\nData Statistics:")
        logging.info(f"\n{data.describe()}")

        # Log additional information about the data
        logging.info(f"\nData columns: {', '.join(data.columns)}")
        logging.info(f"Date range: from {data.index.min()} to {data.index.max()}")

    except FileNotFoundError:
        logging.error(f"Error: The file '{args.data_file}' was not found. Please check the file path and try again.")
    except pd.errors.EmptyDataError:
        logging.error(f"Error: The file '{args.data_file}' is empty. Please provide a non-empty CSV file.")
    except pd.errors.ParserError:
        logging.error(f"Error: Unable to parse '{args.data_file}'. Please ensure it is a valid CSV file.")
    except ValueError as ve:
        logging.error(f"Error: {str(ve)}")
    except Exception as e:
        logging.error(f"An unexpected error occurred: {str(e)}", exc_info=True)

if __name__ == "__main__":
    main()
```

This updated version of `main.py` includes the following changes:

1. The default value for the `--data_file` argument is already set to "SPY_historical_intraday_data.csv", which matches the feedback that the file is now in the root directory.

2. Added a check to verify if the input CSV file exists before attempting to load it. If the file doesn't exist, it logs an error message and exits the program.

3. Updated the logging messages to provide more detailed information about the data processing steps and their outcomes. This includes logging the percentage of data in each split and additional information about the data columns and date range.

4. Added error handling for a potential `ValueError` that might be raised by the `load_and_preprocess_data` function (as implemented in `data_processing.py`).

5. Improved the formatting of the logging output to make it more readable and informative.

These changes address the feedback and improve the overall error handling and user feedback of the application. The code now provides more detailed information about the data processing steps and handles potential errors more robustly.

As I mentioned, this happens on every file, and it cannot seem to fix this itself, so you have to go in manually every time and delete these comments. When it then tries to run the code, it obviously gets errors like this:

E:\Project\gpt-pilot\workspace\lstm3>python main.py
  File "E:\Project\gpt-pilot\workspace\lstm3\main.py", line 1
    Based on the development instructions and the current state of the `main.py` file, here's the updated version of `main.py` that addresses the required changes:
                                                                                           ^
SyntaxError: unterminated string literal (detected at line 1)

(env) E:\Project\gpt-pilot\workspace\lstm3>
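A possible stop-gap until this is fixed upstream would be to post-process the model's response before writing it to disk: keep only the contents of the first fenced code block and drop the surrounding prose. This is just a sketch of the idea (the `extract_code` helper is hypothetical, not part of gpt-pilot's API), and it assumes the response always wraps the code in a triple-backtick fence:

```python
import re

# Matches the first fenced code block: optional language tag after the
# opening backticks, then everything (non-greedy) up to the closing fence.
FENCE_RE = re.compile(r"```[A-Za-z0-9_+-]*\n(.*?)```", re.DOTALL)

def extract_code(response: str) -> str:
    """Hypothetical helper: strip the prose Claude wraps around code.

    If a fenced block is found, return only its contents; otherwise assume
    the whole response is already bare code and return it unchanged.
    """
    match = FENCE_RE.search(response)
    if match:
        return match.group(1)
    return response

raw = (
    "Based on the development instructions, here's the updated main.py:\n"
    "```python\n"
    "print('hello')\n"
    "```\n"
    "These changes address the feedback."
)
print(extract_code(raw))  # only the code survives: print('hello')
```

This wouldn't help when the model emits several fenced blocks per file, but it would at least prevent the `SyntaxError: unterminated string literal` above, which comes from the leading prose line being saved as line 1 of `main.py`.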
quloos commented 6 days ago

Agree