ChatGPT Scraper

A Selenium-based automation tool for ChatGPT. The script starts a browser session, sends predefined prompts to ChatGPT, and fetches the responses, which makes it useful for tests and demonstrations.

Features
Prerequisites
Installation
Configuration
Usage
License

Features

Uses Selenium to scrape ChatGPT conversations.
Supports automated interactions with ChatGPT.
Facilitates fetching responses for predefined prompts.
Supports multiple login methods for ChatGPT (Basic and Google).
Supports 2FA for secure login methods.
Utilizes Docker for easy setup and environment management.
Supports temporary chat mode for ChatGPT.
Provides mechanisms to copy ChatGPT responses in Markdown or Plain Text format.

Prerequisites

Before you begin, ensure you have met the following requirements:

Docker and Docker Compose installed on your machine.
Python 3.12 or higher.

Installation

Clone the Repository

git clone --recurse-submodules https://github.com/daily-coding-problem/chatgpt-scraper.git
cd chatgpt-scraper

Setup Python Environment

Use the following commands to set up the Python environment if you do not want to use Docker:

python -m venv .venv
source .venv/bin/activate
pip install poetry
poetry install --no-root

Setup Docker

If you would like to use Docker, ensure Docker and Docker Compose are installed on your machine. If not, follow the installation guides for Docker and Docker Compose.

Build Docker Images

docker compose build

Configuration

Environment Variables

Create a .env file in the project root containing the content from .env.example. Modify the values as needed.

Configuring `TEST_ACCOUNTS`

The TEST_ACCOUNTS environment variable stores the credentials for test accounts and passes them to the ChatGPT scraper. The credentials must be formatted as a base64-encoded JSON structure. Base64 is an encoding, not encryption, so treat the encoded value as a secret.

You can use the Accounts Serializer tool to generate this JSON structure and encode it.

Steps to Configure TEST_ACCOUNTS:

Clone the Accounts Serializer Repository

git clone https://github.com/daily-coding-problem/accounts-serializer.git
cd accounts-serializer

Install Dependencies

Ensure you have Python 3.11 or higher and Poetry installed. Then run:
```
poetry install
```

Generate the JSON Structure

Run the accounts_serializer.py script with your account details:

poetry run python accounts_serializer.py \
   --emails test@company.com user@anothercompany.com \
   --passwords password123 userpassword456 \
   --providers basic google \
   --secrets google:google-secret-abc chatgpt:chatgpt-secret-xyz github:github-secret-123 aws:aws-secret-789

This command will output a JSON structure like the following:

{
   "test@company.com": {
       "provider": "basic",
       "password": "password123",
       "secret": {
           "google": "google-secret-abc",
           "chatgpt": "chatgpt-secret-xyz"
       }
   },
   "user@anothercompany.com": {
       "provider": "google",
       "password": "userpassword456",
       "secret": {
           "github": "github-secret-123",
           "aws": "aws-secret-789"
       }
   }
}

Base64 Encode the JSON Structure

Use a tool or script to base64 encode the JSON structure:

echo -n '{"test@company.com": {"provider": "basic", "password": "password123", "secret": {"google": "google-secret-abc", "chatgpt": "chatgpt-secret-xyz"}}, "user@anothercompany.com": {"provider": "google", "password": "userpassword456", "secret": {"github": "github-secret-123", "aws": "aws-secret-789"}}}' | base64

Set the TEST_ACCOUNTS Environment Variable

Copy the base64-encoded string and set it as the value of the TEST_ACCOUNTS environment variable in your .env file or directly in your shell environment.
```
export TEST_ACCOUNTS="base64_encoded_json_structure"
```
Now, TEST_ACCOUNTS is configured and ready to be used by the ChatGPT scraper.
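
The encoding step can also be done in Python instead of the shell. The following is a minimal sketch using only the standard library; the helper names `encode_accounts` and `decode_accounts` are illustrative and are not part of the scraper or the Accounts Serializer.

```python
import base64
import json


def encode_accounts(accounts: dict) -> str:
    """Serialize the accounts mapping to JSON and base64-encode it."""
    raw = json.dumps(accounts).encode("utf-8")
    return base64.b64encode(raw).decode("ascii")


def decode_accounts(encoded: str) -> dict:
    """Reverse of encode_accounts: base64-decode, then parse the JSON."""
    return json.loads(base64.b64decode(encoded).decode("utf-8"))


# Example data copied from the JSON structure shown above.
accounts = {
    "test@company.com": {
        "provider": "basic",
        "password": "password123",
        "secret": {
            "google": "google-secret-abc",
            "chatgpt": "chatgpt-secret-xyz",
        },
    },
}

encoded = encode_accounts(accounts)
print(encoded)  # use this string as the value of TEST_ACCOUNTS
```

Decoding the string again with `decode_accounts` is a quick way to confirm that the value survived copying into the .env file intact.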

Target a Specific ChatGPT Account

If you want to target a specific account, you can set the CHATGPT_ACCOUNT environment variable with the email of the account you want to use.

   export CHATGPT_ACCOUNT="some-email@company.com"

The email should be one of the emails in the TEST_ACCOUNTS JSON structure.
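
To illustrate the relationship between the two variables, here is a small sketch of how an account could be looked up by email in the decoded structure. The function `select_account` is hypothetical; the scraper's own lookup logic may differ.

```python
import base64
import json


def select_account(encoded_accounts: str, email: str) -> dict:
    """Decode a TEST_ACCOUNTS-style value and return the entry for `email`."""
    accounts = json.loads(base64.b64decode(encoded_accounts))
    if email not in accounts:
        raise KeyError(f"{email} is not present in TEST_ACCOUNTS")
    return accounts[email]


# Stand-in for the real TEST_ACCOUNTS value, built inline for the example.
demo_accounts = base64.b64encode(
    json.dumps({"some-email@company.com": {"provider": "basic"}}).encode("utf-8")
).decode("ascii")

account = select_account(demo_accounts, "some-email@company.com")
print(account["provider"])
```

In a real run, the two arguments would come from `os.environ["TEST_ACCOUNTS"]` and `os.environ["CHATGPT_ACCOUNT"]`.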

Use Temporary Chat Mode

If you want to use the Temporary Chat mode, set the TEMPORARY_CHAT environment variable to true.

   export TEMPORARY_CHAT="true"

When set to true, the scraper switches on temporary chat mode in ChatGPT's interface, so no chat history is stored.

Configure Headless Mode

You can set the CHATGPT_HEADLESS environment variable to true to run the scraper in headless mode.

   export CHATGPT_HEADLESS="true"

In headless mode, the browser window is not visible during the scraping process.
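
Boolean settings such as TEMPORARY_CHAT and CHATGPT_HEADLESS arrive as strings, so they need to be converted before use. The sketch below shows one way to do that. It assumes that only the string "true" (case-insensitive) enables a flag; the helper `env_flag` is hypothetical, and the scraper may accept other spellings.

```python
import os
from collections.abc import Mapping


def env_flag(name: str, default: bool = False, environ: Mapping = os.environ) -> bool:
    """Return True when the variable `name` is set to "true" (any casing)."""
    value = environ.get(name)
    if value is None:
        return default
    # Assumption: any value other than "true" counts as disabled.
    return value.strip().lower() == "true"


# A plain dict stands in for os.environ so the example is reproducible.
settings = {"CHATGPT_HEADLESS": "true"}

headless = env_flag("CHATGPT_HEADLESS", environ=settings)
temporary_chat = env_flag("TEMPORARY_CHAT", environ=settings)
print(headless, temporary_chat)
```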

Set System Prompt

You can set the CHATGPT_SYSTEM_PROMPT environment variable to a custom system prompt that will be used in the conversation with ChatGPT.

   export CHATGPT_SYSTEM_PROMPT="Hello, I am a system prompt."

Set User Prompts

You can set the CHATGPT_USER_PROMPTS environment variable to a list of user prompts that will be used in the conversation with ChatGPT.

   export CHATGPT_USER_PROMPTS="How are you doing today?" "What is your favorite color?"

If you neither set the CHATGPT_USER_PROMPTS environment variable nor pass --user-prompts with a valid value, the scraper reports an error and exits.
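
The precedence between the command-line flag and the environment variable can be sketched as follows. This is an illustration under stated assumptions, not the scraper's actual code: the function `resolve_user_prompts` is hypothetical, the flag is assumed to take one or more values, and the environment variable is assumed to hold one prompt per line, which this README does not specify.

```python
import argparse
import sys


def resolve_user_prompts(argv: list[str], environ: dict) -> list[str]:
    """Prefer --user-prompts from argv, otherwise fall back to the environment."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--user-prompts", nargs="+", default=None)
    args, _unknown = parser.parse_known_args(argv)
    if args.user_prompts:
        return args.user_prompts
    # Assumption: one prompt per line in CHATGPT_USER_PROMPTS.
    raw = environ.get("CHATGPT_USER_PROMPTS", "")
    return [line.strip() for line in raw.splitlines() if line.strip()]


prompts = resolve_user_prompts(
    ["--user-prompts", "How are you doing today?", "What is your favorite color?"],
    {},
)
if not prompts:
    # Mirrors the documented behavior: no prompts means the run stops.
    sys.exit("No user prompts provided")
print(prompts)
```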

Configure Log Level

You can set the log level for the scraper by setting the LOG_LEVEL environment variable. The default log level is INFO.

   export LOG_LEVEL="DEBUG"

The available log levels are DEBUG, INFO, WARNING, ERROR, and CRITICAL.

Usage

Run the scraper with Docker:

docker compose run chatgpt-scraper

Or without Docker:

poetry run python main.py

License

This project is licensed under the MIT License - see the LICENSE file for details.
