Connection was refused by other side running scraper via docker #49

I'm trying to run the Typesense DocSearch Scraper against a Docusaurus build served locally (http://localhost:3000), but I'm hitting an issue that seems related to Scrapy:

DEBUG:scrapy.downloadermiddlewares.retry:Retrying <GET http://localhost:3000> (failed 1 times): Connection was refused by other side: 111: Connection refused.

Steps to reproduce

Run the command

Expected Behavior

Run the command and have it start scraping and writing the results to the DB:

docker run -it --env-file=.env -e "CONFIG=$(cat config.json | jq -r tostring)" typesense/docsearch-scraper:0.9.1

The env file is:


Actual Behavior

Facing this error message:

Crawling issue: nbHits 0 for woovi-devdocs-1


Docusaurus Scraper Config file

  "index_name": "woovi-devdocs-1",
  "start_urls": [
  "sitemap_urls": [
  "sitemap_alternate_links": true,
  "stop_urls": [
  "selectors": {
    "lvl0": {
      "selector": "(//ul[contains(@class,'menu__list')]//a[contains(@class, 'menu__link menu__link--sublist menu__link--active')]/text() | //nav[contains(@class, 'navbar')]//a[contains(@class, 'navbar__link--active')]/text())[last()]",
      "type": "xpath",
      "global": true,
      "default_value": "Documentation"
    "lvl1": "header h1, article h1",
    "lvl2": "article h2",
    "lvl3": "article h3",
    "lvl4": "article h4",
    "lvl5": "article h5, article td:first-child",
    "lvl6": "article h6",
    "text": "article p, article li, article td:last-child"
  "strip_chars": " .,;:#",
  "custom_settings": {
    "separatorsToIndex": "_",
    "attributesForFaceting": [
    "attributesToRetrieve": [
  "conversation_id": [
  "nb_hits": 42650

Typesense Version:


Typesense Scraper Version:



macOS Sonoma

@noghartt ,

I see you are using localhost in your start_urls. That doesn't work because you are running the scraper inside a container. Inside the container, localhost refers to the container itself, and nothing is listening on port 3000 there. That's why you are getting connection refused.

What you should do instead is use the special Docker hostname that resolves to the host machine. On macOS, it is host.docker.internal. So your start_urls should be:

"start_urls": [
But since your URL includes an explicit port, crawling might not work as expected. You might need to serve your site on port 80 on localhost. See #50 for details.
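
For illustration only, here is a minimal Python sketch of that hostname substitution. It rewrites loopback hostnames in a scraper config's start_urls and sitemap_urls to host.docker.internal while keeping the scheme, port, and path. The helper names to_docker_host and rewrite_config are hypothetical and not part of the scraper, and the sketch assumes every URL entry is a plain string.

```python
from urllib.parse import urlsplit, urlunsplit

# Hostnames that resolve to the container itself when used inside Docker.
LOOPBACK_HOSTS = {"localhost", "127.0.0.1"}


def to_docker_host(url, docker_host="host.docker.internal"):
    """Replace a loopback hostname with docker_host; return other URLs unchanged.

    Hypothetical helper. Any userinfo (user:pass@) in the URL is dropped.
    """
    parts = urlsplit(url)
    if parts.hostname not in LOOPBACK_HOSTS:
        return url
    # Keep the explicit port, if there is one.
    netloc = docker_host if parts.port is None else f"{docker_host}:{parts.port}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


def rewrite_config(config):
    """Return a shallow copy of config with its URL lists rewritten.

    Assumes start_urls and sitemap_urls, when present, are lists of strings.
    """
    rewritten = dict(config)
    for key in ("start_urls", "sitemap_urls"):
        if key in rewritten:
            rewritten[key] = [to_docker_host(url) for url in rewritten[key]]
    return rewritten
```

For example, to_docker_host("http://localhost:3000") returns "http://host.docker.internal:3000", and URLs on other hosts are returned unchanged.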

Hey, @wanderanimrod! It works!

I appreciate your help, thanks!