My sitemap is not being crawled. I expected to see a few records in the output, which I pasted below, but nothing seems to be happening.
The URLs in "start_urls" do work, and these come up in my searches.
DEBUG:scrapy.core.engine:Crawled (200) <GET http://host.docker.internal:3000/docs/frontend/intro> (referer: None)
DEBUG:scrapy.core.engine:Crawled (200) <GET http://host.docker.internal:3000/sitemap.xml> (referer: None)
DEBUG:scrapy.core.engine:Crawled (200) <GET http://host.docker.internal:3000/docs/api/intro> (referer: None)
DEBUG:scrapy.core.engine:Crawled (200) <GET http://host.docker.internal:3000/> (referer: None)
DEBUG:typesense.api_call:Making post /collections/docusaurus-2_1674212290/documents/import
DEBUG:typesense.api_call:Try 1 to node host.docker.internal:8108 -- healthy? True
DEBUG:urllib3.connectionpool:Starting new HTTP connection (1): host.docker.internal:8108
DEBUG:urllib3.connectionpool:http://host.docker.internal:8108 "POST /collections/docusaurus-2_1674212290/documents/import HTTP/1.1" 200 None
DEBUG:typesense.api_call:host.docker.internal:8108 is healthy. Status code: 200
> DocSearch: http://host.docker.internal:3000/docs/frontend/intro 18 records)
DEBUG:typesense.api_call:Making post /collections/docusaurus-2_1674212290/documents/import
DEBUG:typesense.api_call:Try 1 to node host.docker.internal:8108 -- healthy? True
DEBUG:urllib3.connectionpool:Starting new HTTP connection (1): host.docker.internal:8108
DEBUG:urllib3.connectionpool:http://host.docker.internal:8108 "POST /collections/docusaurus-2_1674212290/documents/import HTTP/1.1" 200 None
DEBUG:typesense.api_call:host.docker.internal:8108 is healthy. Status code: 200
> DocSearch: http://host.docker.internal:3000/docs/api/intro 18 records)
DEBUG:typesense.api_call:Making post /collections/docusaurus-2_1674212290/documents/import
DEBUG:typesense.api_call:Try 1 to node host.docker.internal:8108 -- healthy? True
DEBUG:urllib3.connectionpool:Starting new HTTP connection (1): host.docker.internal:8108
DEBUG:urllib3.connectionpool:http://host.docker.internal:8108 "POST /collections/docusaurus-2_1674212290/documents/import HTTP/1.1" 200 None
DEBUG:typesense.api_call:host.docker.internal:8108 is healthy. Status code: 200
> DocSearch: http://host.docker.internal:3000/ 1 records)
DEBUG:scrapy.spidermiddlewares.offsite:Filtered offsite request to 'host.docker.internal': <GET http://host.docker.internal:3000/docs/frontend/intro>
INFO:scrapy.core.engine:Closing spider (finished)