Closed astyfx closed 5 years ago
Please add this URL to your sitemap. It is missing; I have checked. The crawl itself is successful. Once you add it, it will be parsed.
@s-pace What is the meaning of this URL?
Our sitemap already has https://docs.sendbird.com/platform/data_privacy
Is it compliant with sitemaps.org?
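One way to check this locally is sketched below. It is only a rough check, not a full sitemaps.org validator: it confirms that the root element is a `urlset` in the sitemaps.org namespace and that a given URL is listed in a `<loc>` entry. The sample XML and the helper name `sitemap_urls` are made up for illustration; the real sitemap may look different.

```python
import xml.etree.ElementTree as ET
from typing import List

# Namespace defined by the sitemaps.org protocol (version 0.9).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def sitemap_urls(xml_text: str) -> List[str]:
    """Return every <loc> value from a urlset sitemap (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    # A plain sitemap must have <urlset> in the sitemaps.org namespace as its root.
    if root.tag != "{%s}urlset" % SITEMAP_NS:
        raise ValueError("root element is not a sitemaps.org <urlset>")
    return [
        loc.text.strip()
        for loc in root.iter("{%s}loc" % SITEMAP_NS)
        if loc.text
    ]


# Illustrative sample only; not the actual contents of the site's sitemap.
SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://docs.sendbird.com/platform/data_privacy</loc>
    <changefreq>daily</changefreq>
  </url>
</urlset>"""

TARGET = "https://docs.sendbird.com/platform/data_privacy"
print(TARGET in sitemap_urls(SAMPLE_SITEMAP))
```

To run it against a live sitemap, the XML text would have to be downloaded first; that step is left out here so the sketch stays self-contained.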
@s-pace Yes, I've checked just now using several sitemap checkers.
I will update the frequency to daily.
We update our documentation content at any time. So do I have to remove the version and the version facet filter?
No, you shouldn't.
I will give it a close look when I am back from PTO on Monday. You can add this URL to start_urls in the meantime.
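As a rough illustration of that workaround, the sketch below appends the page to the `start_urls` list of a DocSearch-style config. Only `index_name = sendbird` and the Data Privacy URL come from this thread; the other config values and the helper `add_start_url` are assumptions, and the real config may use richer `start_urls` entries than plain strings.

```python
import json


def add_start_url(config: dict, url: str) -> dict:
    """Return a copy of the config with url in start_urls (hypothetical helper)."""
    updated = dict(config)
    start_urls = list(updated.get("start_urls", []))
    # Avoid listing the same page twice.
    if url not in start_urls:
        start_urls.append(url)
    updated["start_urls"] = start_urls
    return updated


# Assumed config shape; everything except index_name is a placeholder.
CONFIG = {
    "index_name": "sendbird",
    "start_urls": ["https://docs.sendbird.com/"],
    "sitemap_urls": ["https://docs.sendbird.com/sitemap.xml"],
}

DATA_PRIVACY_URL = "https://docs.sendbird.com/platform/data_privacy"

print(json.dumps(add_start_url(CONFIG, DATA_PRIVACY_URL), indent=2))
```

The helper returns a new dict instead of mutating the input, so the original config can still be compared against the updated one.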
@s-pace Any updates?
Sorry for the delay.
It seems that we are not able to parse the sitemap, and that this page is not referenced from any other page through an <a/> tag. I will need to dig more to understand why the sitemap is not handled correctly. Do you have any leads? Is this page unique in some way? Do you do a specific redirection on it?
Yes, it is unique. There is no explicit redirection on it (like an href URL).
Our site runs on Next.js and uses Link from Next.js. Could that be harmful for the scraper?
Or, we recently added an external link to a menu item (it is related to our DocSearch configs). Could that be harmful?
We added a specific href to the anchor component, and then the page was crawled successfully.
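For anyone hitting the same problem, here is a minimal sketch of why the href matters. It is a simplification and not the DocSearch crawler's actual code: it assumes a crawler discovers pages by reading the href attribute of anchor tags in the rendered HTML. The two markup snippets and the `go()` handler are invented for illustration.

```python
from html.parser import HTMLParser
from typing import List


class HrefCollector(HTMLParser):
    """Collect href values from <a> tags, as a link-following crawler might."""

    def __init__(self) -> None:
        super().__init__()
        self.hrefs: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            # An anchor without an href gives the crawler nothing to follow.
            if name == "href" and value:
                self.hrefs.append(value)


def find_hrefs(html: str) -> List[str]:
    collector = HrefCollector()
    collector.feed(html)
    return collector.hrefs


# Hypothetical markup: navigation handled only by JavaScript, no href.
WITHOUT_HREF = '<a onclick="go()">Data Privacy</a>'
# Hypothetical markup after adding an explicit href to the anchor.
WITH_HREF = '<a href="/platform/data_privacy">Data Privacy</a>'

print(find_hrefs(WITHOUT_HREF))
print(find_hrefs(WITH_HREF))
```

Under that assumption, the first snippet exposes no link at all, while the second exposes the path of the Data Privacy page.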
Thanks for your effort
Glad to hear, feel free to reopen if needed
@s-pace, after you updated our config, the Data Privacy page is not searchable again. Could you look into it?
Do you want to request a feature or report a bug?
Help wanted
If it is a DocSearch index issue, what is the related index_name?
index_name = sendbird
What is the current behaviour?
https://docs.sendbird.com/platform/data_privacy
The above page is not being crawled.
What is the expected behaviour?
To be crawled
What have you tried to solve it?
I added sitemap.xml and bumped the version from 1.0.1 to 1.0.2 after the page was added.
Any other feedback / questions ?
Should I remove the version tag (facet filter) and change only_content_level from true to false?
Is there any best practice for our documentation site?