danielmiessler / fabric

fabric is an open-source framework for augmenting humans using AI. It provides a modular framework for solving specific problems using a crowdsourced set of AI prompts that can be used anywhere.
https://danielmiessler.com/p/fabric-origin-story
MIT License

Chrome Driver for Fetching Web Content #289

Open pedramamini opened 8 months ago

pedramamini commented 8 months ago

What do you need?

This is a feature request with example code that should drop right into installer/client/cli/:

https://gist.github.com/pedramamini/e1f7f9dc6013734fca44961cca4e7890

CLI tool and library for fetching content via Chrome driven by Selenium. Has some (rudimentary) tricks up its sleeve to evade mechanized browser detection.

Requirements

pip install selenium
pip install webdriver_manager

Usage

    usage: chrome_fetch.py [-h] [--sleep SLEEP] [--headless] [--debug] [--referrer [REFERRER]]
                           [--human]
                           url
    Fetch the inner text of a webpage.
    positional arguments:
      url                   URL of the webpage to fetch
    options:
      -h, --help            show this help message and exit
      --sleep SLEEP         Time to wait in-between operations
      --headless            Run in headless mode.
      --debug               Enable debug mode.
      --referrer [REFERRER]
                            Referrer URL to start from (default: https://www.google.com).
      --human               Mimick human behavior with mouse
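
The help text above can be reproduced with a small argparse setup. This is only a sketch reconstructed from that output, not the code in the gist: the function name build_parser, the --sleep default of 1.0, and the fallback value used when --referrer is given without an argument are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the chrome_fetch.py help text (hypothetical sketch)."""
    parser = argparse.ArgumentParser(
        prog="chrome_fetch.py",
        description="Fetch the inner text of a webpage.",
    )
    parser.add_argument("url", help="URL of the webpage to fetch")
    # Assumption: the help text does not state a default, so 1.0 second is a guess.
    parser.add_argument("--sleep", type=float, default=1.0,
                        help="Time to wait in-between operations")
    parser.add_argument("--headless", action="store_true",
                        help="Run in headless mode.")
    parser.add_argument("--debug", action="store_true",
                        help="Enable debug mode.")
    # nargs="?" matches the "[REFERRER]" shown in the usage line.
    # Assumption: a bare --referrer falls back to the same default URL.
    parser.add_argument("--referrer", nargs="?",
                        default="https://www.google.com",
                        const="https://www.google.com",
                        help="Referrer URL to start from (default: https://www.google.com).")
    parser.add_argument("--human", action="store_true",
                        help="Mimic human behavior with mouse")
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```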

Example

$ chrome_fetch.py https://inquest.net/blog/around-we-go-planet-stealer-emerges/ | summarize
ONE SENTENCE SUMMARY:
Planet Stealer, a new information-stealing trojan targeting sensitive data, highlights the evolving threat landscape and the importance of cybersecurity vigilance.

MAIN POINTS:
1. Planet Stealer is an emerging information-stealing trojan recently documented and sold in underground forums.
2. Implemented in Go, it aims to collect and exfiltrate sensitive information from compromised hosts.
3. It's part of the malware-as-a-service ecosystem, appealing to adversaries for data theft and sale.
4. The malware targets browser information, cryptocurrency wallets, and messenger credentials among others.
5. Features include sandbox evasion and data exfiltration via Telegram, indicating sophisticated capabilities.
6. Distributed as EXE files, often via loader trojans, with active command & control servers noted.
7. Communication with C2 servers uses HTTP API with JSON data, suggesting modern backend infrastructure.
8. Samples of Planet Stealer have been observed in the wild, packed with UPX for obfuscation.
9. Countermeasures include network-based detection systems and real-time threat intelligence application.
10. InQuest credits open-source intelligence for disclosing details about Planet Stealer, emphasizing community collaboration in threat intelligence.

TAKEAWAYS:
1. The emergence of Planet Stealer underscores the continuous innovation in malware development and distribution.
2. Information stealers remain a significant part of the cybercrime ecosystem due to their lucrative potential.
3. Effective cybersecurity measures require comprehensive network-based detection and real-time threat intelligence.
4. Collaboration and sharing of threat intelligence within the cybersecurity community are crucial for timely identification and mitigation of new threats.
5. Enterprises should enhance their security posture to protect against sophisticated threats like Planet Stealer through advanced detection capabilities and informed threat intelligence.

nicolas-g commented 5 months ago

👍

joaomorossini commented 4 months ago

That would be great!

matigumma commented 4 months ago

I think there is a better approach: using the Jina AI Reader API. It's fast, easy, and doesn't require running any additional code.

curl https://r.jina.ai/https://inquest.net/blog/around-we-go-planet-stealer-emerges/ | fabric -p summarize

I've tested it with the jina.ai website:

curl https://r.jina.ai/https://jina.ai/ | fabric -c -p summarize

I'm using -c to add context that translates the output into Spanish for me ;)

enjoy

pedramamini commented 3 months ago

@matigumma love it! I've added this to my .zshrc and prefer to use it since, by nature, we're grabbing public resources here anyway:

jf ()
{
    # Quote the argument so URLs containing ? or & are passed through intact.
    curl -s "https://r.jina.ai/$1"
}

joaomorossini commented 3 months ago

Thanks for the suggestion, @matigumma. I found Jina Reader to be very useful. One thing that intrigues me, though, is that I have not logged in to the service, nor have I set an API key anywhere, yet Jina AI somehow knows that I'm the one making the calls from my terminal. Do you guys know how their authentication works? I don't remember setting up any credentials.

matigumma commented 3 months ago

docs:

https://jina.ai/reader

I think you can pass an API key from a bash script:

curl 'https://r.jina.ai/https://example.com' -H "Authorization: Bearer jina_api_key"
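
For anyone who would rather call the reader from Python than from curl, the same request might look like the sketch below. It uses only the standard library. The helper names build_reader_request and fetch_markdown are made up for illustration, and jina_api_key is the same placeholder as in the curl example, not a real key.

```python
import urllib.request
from typing import Optional

# Prefix taken from the curl examples in this thread.
READER_PREFIX = "https://r.jina.ai/"


def build_reader_request(url: str, api_key: Optional[str] = None) -> urllib.request.Request:
    """Build a GET request for the reader endpoint (hypothetical helper)."""
    headers = {}
    if api_key:
        # Same header as the curl example: Authorization: Bearer <key>
        headers["Authorization"] = f"Bearer {api_key}"
    return urllib.request.Request(READER_PREFIX + url, headers=headers)


def fetch_markdown(url: str, api_key: Optional[str] = None, timeout: float = 30.0) -> str:
    """Send the request and return the response body as text."""
    request = build_reader_request(url, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


if __name__ == "__main__":
    # Without a key this mirrors the unauthenticated curl call.
    print(fetch_markdown("https://example.com"))
```

The output of fetch_markdown can then be piped or passed to fabric in the same way as the curl output.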

eugeis commented 1 month ago

You can use

-u, --scrape_url= Scrape website URL to markdown using Jina AI

fabioscarsi commented 1 month ago

What about paywalls? Does Jina work for pages behind paywalls (which covers a lot of what we read)?

salmansarwar5102 commented 2 days ago

One interesting tidbit is the token usage when running different patterns on the same webpage: it only charges once, which is perfect. Will it still avoid charging for the same URL if I restart my laptop? And if it does charge again, is there a workaround?
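
One possible workaround, independent of however the service bills repeat requests (nothing in this thread confirms that), is to keep a local on-disk copy of each fetched page so that running several patterns on one URL only fetches it once, even after a restart. A minimal sketch, where cached_fetch, the fetch callable, and the cache directory are all hypothetical names:

```python
import hashlib
from pathlib import Path
from typing import Callable


def cached_fetch(url: str, fetch: Callable[[str], str], cache_dir: Path) -> str:
    """Return the content for url, calling fetch only on a cache miss."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    # Hash the URL so it is safe to use as a file name.
    key = hashlib.sha256(url.encode("utf-8")).hexdigest()
    cache_file = cache_dir / f"{key}.md"
    if cache_file.exists():
        # Cache hit: no network request is made.
        return cache_file.read_text(encoding="utf-8")
    text = fetch(url)
    cache_file.write_text(text, encoding="utf-8")
    return text
```

Here fetch would be whatever function retrieves the page (for example a wrapper around the reader URL), and the cached text can be fed to as many patterns as needed. The cache never expires in this sketch, so stale entries have to be deleted by hand.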

salmansarwar5102 commented 2 days ago

Can we not use LinkedIn with Jina?

I gave it Justin Welsh's profile and the output was this (lol, tokens wasted):

ONE SENTENCE SUMMARY: Join LinkedIn by creating an account or signing in.

MAIN POINTS:

  1. Enter first and last name to sign up.
  2. Provide email and password to create account.
  3. Agree to LinkedIn's User Agreement and policies.
  4. Existing users can sign in with email or phone.
  5. Forgot password option is available.
  6. Continue to join or sign in to access LinkedIn.
  7. User Agreement, Privacy Policy, and Cookie Policy apply.
  8. LinkedIn offers a mobile app for better experience.
  9. App is available in the Microsoft Store.
  10. Open the app directly from the website.

TAKEAWAYS:

  1. Create a LinkedIn account to access its features.
  2. Existing users can sign in to their accounts easily.
  3. LinkedIn has a mobile app for on-the-go access.
  4. User Agreement and policies apply to all users.
  5. Forgot password option is available for convenience.