Is your feature request related to a problem? Please describe.
When we provide a web link to create Custom Docs, Cursor crawls every page under that link, but some of those pages are not useful to the user and could be skipped.
In my case, the documentation includes pages related to Node.js and Java.
Since I do not use Java, I would like to exclude all pages related to it.
I am using the following web link: https://cap.cloud.sap/docs/.
However, I wish to exclude all pages under: https://cap.cloud.sap/docs/java.
Describe the solution you'd like
Multiple potential solutions:
1) An option to use regex for excluding certain pages. For example: ^https://cap\.cloud\.sap/docs/(?!java).*
2) An option to display the list of pages before starting the crawl, allowing users to select the pages they want.
3) An option to delete pages in the Cursor Settings Popup under Docs.
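To illustrate option 1, here is a minimal sketch of how the pattern above could filter a list of discovered URLs before crawling. It says nothing about how Cursor actually crawls: the `filter_urls` helper, the sample URLs, and the stricter `INCLUDE_STRICT` variant are assumptions made up for this example. One thing the sketch shows is that `(?!java)` also rejects any page whose path merely starts with "java" (for example a hypothetical `javascript-tips` page), so a variant anchored on the path segment may be closer to what is wanted.

```python
import re

# Pattern from the request: keep URLs under /docs/ unless the next
# part of the path starts with "java".
INCLUDE = re.compile(r"^https://cap\.cloud\.sap/docs/(?!java).*")

# Hypothetical stricter variant: only reject the "java" path segment itself
# ("/docs/java" or "/docs/java/..."), not every path that begins with "java".
INCLUDE_STRICT = re.compile(r"^https://cap\.cloud\.sap/docs/(?!java(?:/|$)).*")


def filter_urls(urls, pattern):
    """Return only the URLs that the given pattern matches (hypothetical helper)."""
    return [u for u in urls if pattern.match(u)]


# Sample URLs for illustration only; not a real crawl result.
urls = [
    "https://cap.cloud.sap/docs/",
    "https://cap.cloud.sap/docs/node.js/",
    "https://cap.cloud.sap/docs/java",
    "https://cap.cloud.sap/docs/java/getting-started",
    "https://cap.cloud.sap/docs/javascript-tips",
]

kept = filter_urls(urls, INCLUDE)
kept_strict = filter_urls(urls, INCLUDE_STRICT)

print(kept)
print(kept_strict)
```

With the original pattern, both Java pages and the `javascript-tips` page are dropped; with the stricter variant, only the two pages under `/docs/java` are dropped.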
+1. I very often have to fiddle with the trailing / and with which prefix or entrypoint I use to get Cursor to scrape documentation consistently. Would love to have this level of control.