unclecode / crawl4ai

🔥🕷️ Crawl4AI: Open-source LLM Friendly Web Crawler & Scrapper
Apache License 2.0

Example code failed #83

Status: Closed (rangehow closed this issue 1 week ago)

rangehow commented 2 weeks ago
from crawl4ai import WebCrawler

# Create an instance of WebCrawler
crawler = WebCrawler()

# Warm up the crawler (load necessary models)
crawler.warmup()

# Run the crawler on a URL
result = crawler.run(url="https://www.nbcnews.com/business")

# Print the extracted content
print(result.markdown)
[LOG] 🚀 Initializing LocalSeleniumCrawlerStrategy
Traceback (most recent call last):
  File "/mnt/rangehow/neuspider/crap.py", line 4, in <module>
    crawler = WebCrawler()
              ^^^^^^^^^^^^
  File "/mnt/rangehow/miniconda3/lib/python3.12/site-packages/crawl4ai/web_crawler.py", line 27, in __init__
    self.crawler_strategy = crawler_strategy or LocalSeleniumCrawlerStrategy(verbose=verbose)
                                                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/rangehow/miniconda3/lib/python3.12/site-packages/crawl4ai/crawler_strategy.py", line 147, in __init__
    self.driver = webdriver.Chrome(options=self.options)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/rangehow/miniconda3/lib/python3.12/site-packages/selenium/webdriver/chrome/webdriver.py", line 45, in __init__
    super().__init__(
  File "/mnt/rangehow/miniconda3/lib/python3.12/site-packages/selenium/webdriver/chromium/webdriver.py", line 66, in __init__
    super().__init__(command_executor=executor, options=options)
  File "/mnt/rangehow/miniconda3/lib/python3.12/site-packages/selenium/webdriver/remote/webdriver.py", line 212, in __init__
    self.start_session(capabilities)
  File "/mnt/rangehow/miniconda3/lib/python3.12/site-packages/selenium/webdriver/remote/webdriver.py", line 299, in start_session
    response = self.execute(Command.NEW_SESSION, caps)["value"]
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/rangehow/miniconda3/lib/python3.12/site-packages/selenium/webdriver/remote/webdriver.py", line 354, in execute
    self.error_handler.check_response(response)
  File "/mnt/rangehow/miniconda3/lib/python3.12/site-packages/selenium/webdriver/remote/errorhandler.py", line 229, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.SessionNotCreatedException: Message: session not created: Chrome failed to start: exited normally.
  (session not created: DevToolsActivePort file doesn't exist)
  (The process started from chrome location /mnt/rangehow/.cache/selenium/chrome/linux64/128.0.6613.86/chrome is no longer running, so ChromeDriver is assuming that Chrome has crashed.)
Stacktrace:
#0 0x5fd2feb108da <unknown>
#1 0x5fd2fe7dee50 <unknown>
#2 0x5fd2fe816e05 <unknown>
#3 0x5fd2fe812c78 <unknown>
#4 0x5fd2fe85f64e <unknown>
#5 0x5fd2fe85ee66 <unknown>
#6 0x5fd2fe853233 <unknown>
#7 0x5fd2fe821093 <unknown>
#8 0x5fd2fe82209e <unknown>
#9 0x5fd2fead7b3b <unknown>
#10 0x5fd2feadbaf1 <unknown>
#11 0x5fd2feac3705 <unknown>
#12 0x5fd2feadc662 <unknown>
#13 0x5fd2feaa88df <unknown>
#14 0x5fd2feaff6d8 <unknown>
#15 0x5fd2feaff8a2 <unknown>
#16 0x5fd2feb0f6cc <unknown>
#17 0x7c16d8c9ca94 <unknown>
#18 0x7c16d8d29c3c <unknown>
unclecode commented 2 weeks ago

@rangehow Hi, can you share more about your OS and platform?

rangehow commented 2 weeks ago

> @rangehow Hi, can you share more about your OS and platform?

Ubuntu 24.04 Server

unclecode commented 1 week ago

@rangehow Please make sure Chrome is installed properly on Ubuntu. I suggest taking a look at the Dockerfile, where you will find how to install it: https://github.com/unclecode/crawl4ai/blob/main/Dockerfile
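
As a rough way to check the "installed properly" part before retrying the example, a small standard-library script can look for a system Chrome on PATH and confirm it starts far enough to print its version. This is only a sketch: the candidate executable names and the helper names `find_chrome` and `chrome_version` are assumptions for illustration, not part of crawl4ai or Selenium. It also only shows that the binary resolves and its shared libraries load, so it will not catch every cause of "DevToolsActivePort file doesn't exist".

```python
import shutil
import subprocess

# Candidate executable names are assumptions; adjust them for your system.
CANDIDATES = (
    "google-chrome",
    "google-chrome-stable",
    "chromium",
    "chromium-browser",
)


def find_chrome(candidates=CANDIDATES):
    """Return the path of the first candidate found on PATH, or None."""
    for name in candidates:
        path = shutil.which(name)
        if path:
            return path
    return None


def chrome_version(path, timeout=15):
    """Run `<path> --version`; return its output, or None if it cannot start."""
    try:
        proc = subprocess.run(
            [path, "--version"],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired):
        # Binary missing, not executable, or hung on startup.
        return None
    if proc.returncode != 0:
        # A non-zero exit often points at missing shared libraries.
        return None
    return proc.stdout.strip()


if __name__ == "__main__":
    chrome = find_chrome()
    if chrome is None:
        print("No Chrome/Chromium executable found on PATH.")
    else:
        version = chrome_version(chrome)
        if version is None:
            print(f"Found {chrome}, but it failed to start.")
        else:
            print(f"Found {chrome}: {version}")
```

If the script finds no executable, or the binary fails to start, installing Chrome and its system dependencies the way the Dockerfile does is the likely next step. The same `chrome_version` helper can also be pointed at the Selenium-managed binary path shown in the traceback to see whether that copy can start on the server.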

unclecode commented 1 week ago

I'm closing this issue. Please leave a comment if you need any further help with this.