stanford-oval / storm

An LLM-powered knowledge curation system that researches a topic and generates a full-length report with citations.
http://storm.genie.stanford.edu
MIT License

SSL Verification Disabled in WebPageHelper #98

Open rmcc3 opened 3 months ago

rmcc3 commented 3 months ago

Description

In the file utils.py, the WebPageHelper class disables SSL verification when making HTTP requests:

self.httpx_client = httpx.Client(verify=False)

This is a significant security issue that should be addressed.

Why this is problematic

  1. Man-in-the-Middle (MITM) Attacks: Disabling SSL verification makes the application vulnerable to MITM attacks. An attacker could intercept the communication between the application and the web servers it's querying, potentially injecting malicious content.

  2. Compromised Knowledge Integrity: For a knowledge curation system like STORM, the integrity of the information is important. If an attacker can intercept and modify the content being retrieved, they could inject false or misleading information into the knowledge base. This could lead to the generation of inaccurate or even harmful content.

  3. Violation of Security Best Practices: Disabling SSL verification goes against security best practices and could violate compliance requirements if the system handles any sensitive or regulated data.

  4. Propagation of Insecure Practices: If users or other developers see this in the codebase, they might assume it's an acceptable practice and replicate it elsewhere.

How it affects knowledge generation

  1. Unreliable Sources: The system may unknowingly use information from compromised or spoofed websites, leading to the generation of unreliable or false knowledge.

  2. Inconsistent Information: If the same query yields different results due to MITM attacks, it could lead to inconsistencies in the generated knowledge.

Proposed Solution

  1. Remove the verify=False parameter from the httpx.Client() initialization.
  2. Rely on proper SSL certificate validation. httpx verifies server certificates by default (verify=True), so removing the override restores it.
  3. If there are specific cases where self-signed certificates need to be handled, implement a more secure solution such as certificate pinning or providing a custom certificate authority.
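
A minimal sketch of what the proposed solution could look like, not the project's actual fix. The helper names build_ssl_context and make_client and the ca_file parameter are hypothetical, introduced only for illustration. It assumes httpx is installed and that any self-signed or private CA is available as a PEM file.

```python
import ssl
from typing import Optional


def build_ssl_context(ca_file: Optional[str] = None) -> ssl.SSLContext:
    """Return a verifying SSL context, optionally trusting an extra CA bundle."""
    # create_default_context() turns on certificate and hostname checks
    # and loads the system's default trusted CAs.
    context = ssl.create_default_context()
    if ca_file is not None:
        # ca_file is a hypothetical path to a PEM file holding a private
        # or self-signed CA; it is trusted in addition to the defaults.
        context.load_verify_locations(cafile=ca_file)
    return context


def make_client(ca_file: Optional[str] = None):
    """Build an httpx.Client that keeps certificate verification enabled."""
    import httpx  # imported lazily so build_ssl_context works without httpx

    if ca_file is None:
        # httpx verifies certificates by default, so no verify argument is needed.
        return httpx.Client()
    # Pass a verifying context instead of verify=False.
    return httpx.Client(verify=build_ssl_context(ca_file))
```

In WebPageHelper, the assignment would then become something like self.httpx_client = make_client(), with a CA file passed only for hosts that genuinely need one. Note that a context built this way trusts the system CA store, whereas a plain httpx.Client() uses httpx's own default bundle.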

Action Items

shaoyijia commented 3 months ago

This issue is reasonable. Any plan to help resolve it?

rmcc3 commented 3 months ago

> This issue is reasonable. Any plan to help resolve it?

Are there any major changes planned to how networking will be done? If not, I can go ahead and see what I can do.

shaoyijia commented 3 months ago

No, we won't touch the networking part right now.