-
## Description
Not allowed to create a crawler that uses the docusaurus v3 template
![image](https://github.com/algolia/docsearch/assets/46312751/4c34eb0e-5329-47cc-ba1a-87096656e389)
## Steps to…
-
There is only a brief mention of the crawler, but no instructions on how to run it. If you could post the commands to run the crawler, I'd be more than happy to update the README with the info…
-
It seems there is an issue with the XML structure; the error message is as follows:
```
Getting Scielo journals.
Parsing Scielo journals XML.
[Fatal Error] scielo-sets.xml:1:42: El marcador en el do…
```
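A `[Fatal Error] file:1:<col>` from a Java XML parser generally means the document is ill-formed at or before the root element (the truncated Spanish text appears to be a locale-specific parser message to that effect). A minimal Python sketch for reproducing and locating such an error; the sample input and the helper name are hypothetical:

```python
import xml.etree.ElementTree as ET

def locate_xml_error(text):
    """Return (line, column, message) of the first fatal parse error, or None."""
    try:
        ET.fromstring(text)
        return None
    except ET.ParseError as e:
        line, col = e.position
        return line, col, e.msg

# Hypothetical reproduction: any content before the XML prolog makes the
# document ill-formed, which is what a "1:<col>" fatal error points at.
print(locate_xml_error("junk before prolog<?xml version='1.0'?><sets/>"))
```

Running this against the actual `scielo-sets.xml` response would show whether the server is returning something other than XML (an HTML error page, for instance) before the prolog.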
-
As per title, instances that have updated to 0.19.4 are missing.
-
submit-url.php does not insert anything into the database
-
Yesterday at 18:17 CEST we observed a SYN flood caused by the project's crawler. Please implement request rate limits.
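One way to honor that request on the crawler side is a per-host minimum delay between requests. A minimal sketch; the class name and the 1-second default delay are assumptions for illustration, not values from the report:

```python
import time
from urllib.parse import urlsplit

class HostRateLimiter:
    """Enforce a minimum delay between successive requests to the same host.

    Minimal sketch of crawler-side request limiting; the 1-second
    default is an assumed, conservative starting point.
    """

    def __init__(self, min_delay=1.0):
        self.min_delay = min_delay
        self.last_request = {}  # host -> monotonic timestamp of last request

    def wait(self, url):
        """Block until at least min_delay has passed for this URL's host."""
        host = urlsplit(url).netloc
        now = time.monotonic()
        elapsed = now - self.last_request.get(host, 0.0)
        if elapsed < self.min_delay:
            time.sleep(self.min_delay - elapsed)
        self.last_request[host] = time.monotonic()
```

Calling `limiter.wait(url)` before each fetch spaces out requests per host while leaving requests to different hosts unthrottled.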
-
As a search engine, we should build a general web crawler for the internet. It could:
* find undiscovered website URLs
* detect the schema.org Recipe type on those newly discovered URLs
Please note that this kind…
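The second bullet amounts to checking fetched pages for schema.org `Recipe` markup, most commonly embedded as JSON-LD. A minimal standard-library sketch; the names are hypothetical, and the microdata/RDFa embeddings are not covered:

```python
import json
from html.parser import HTMLParser

class JSONLDExtractor(HTMLParser):
    """Collect the contents of <script type="application/ld+json"> blocks."""

    def __init__(self):
        super().__init__()
        self.in_jsonld = False
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self.in_jsonld = True

    def handle_endtag(self, tag):
        if tag == "script":
            self.in_jsonld = False

    def handle_data(self, data):
        if self.in_jsonld:
            self.blocks.append(data)

def has_recipe(html):
    """Return True if the page declares a schema.org Recipe via JSON-LD."""
    parser = JSONLDExtractor()
    parser.feed(html)
    for block in parser.blocks:
        try:
            data = json.loads(block)
        except ValueError:
            continue  # ignore malformed JSON-LD blocks
        items = data if isinstance(data, list) else [data]
        if any(isinstance(i, dict) and i.get("@type") == "Recipe" for i in items):
            return True
    return False
```

A production crawler would also need to handle `@graph` containers and arrays in `@type`, but this shows the basic detection step.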
-
Scrape jobs by various filters:
- Location
- Company
- Etc
**First Use Case:** Scrape all jobs in Kingston
**Relevant URL**
https://www.linkedin.com/jobs/search/?keywords=&location=Kingsto…
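The filters above would presumably map onto query parameters of the search URL. A minimal sketch of that mapping; `build_job_search_url` is a hypothetical helper, and only the two parameters visible in the truncated URL are modeled:

```python
from urllib.parse import urlencode

def build_job_search_url(keywords="", location=""):
    """Build a LinkedIn job-search URL from filter values.

    Hypothetical helper: parameter names mirror the truncated URL in
    the report; other filters (company, etc.) would be added the same way.
    """
    base = "https://www.linkedin.com/jobs/search/"
    return base + "?" + urlencode({"keywords": keywords, "location": location})
```

For the first use case, `build_job_search_url(location="Kingston")` produces the kind of URL shown above.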
-
Run all crawler traffic through Tor.
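Assuming a local Tor daemon exposing its default SOCKS port (9050), routing the crawler's HTTP traffic through it is mostly proxy configuration. A sketch using `requests`, which needs the optional SOCKS extra (`pip install requests[socks]`); the `socks5h` scheme makes DNS resolution also go through Tor:

```python
import requests

# Assumption: a local Tor daemon is listening on its default SOCKS port.
TOR_PROXIES = {
    "http": "socks5h://127.0.0.1:9050",
    "https": "socks5h://127.0.0.1:9050",
}

def tor_get(url):
    """Fetch a URL with all traffic (including DNS) routed through Tor."""
    return requests.get(url, proxies=TOR_PROXIES, timeout=30)
```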
-
Because of issue #1 and issue #2, I believe it is a much better approach to add a crawler controller that handles the error and decides whether to stop or pause the crawler.
The Controller s…
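The proposed controller could be sketched as a small state machine that maps error history to continue/pause/stop decisions. The class name, thresholds, and actions below are assumptions for illustration, not part of the proposal:

```python
from enum import Enum

class Action(Enum):
    CONTINUE = "continue"
    PAUSE = "pause"
    STOP = "stop"

class CrawlerController:
    """Decide whether to pause or stop the crawler based on errors.

    Minimal sketch: a run of consecutive errors first triggers a pause,
    and a longer run triggers a stop; any success resets the count.
    The thresholds are assumed defaults.
    """

    def __init__(self, pause_after=3, stop_after=10):
        self.pause_after = pause_after
        self.stop_after = stop_after
        self.consecutive_errors = 0

    def record_success(self):
        self.consecutive_errors = 0
        return Action.CONTINUE

    def record_error(self):
        self.consecutive_errors += 1
        if self.consecutive_errors >= self.stop_after:
            return Action.STOP
        if self.consecutive_errors >= self.pause_after:
            return Action.PAUSE
        return Action.CONTINUE
```

The crawl loop would call `record_success()`/`record_error()` after each fetch and act on the returned `Action`, which keeps the stop/pause policy in one place instead of scattered across error handlers.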