yacy / yacy_search_server

Distributed Peer-to-Peer Web Search Engine and Intranet Search Appliance
http://yacy.net

Wix.com sites have # in Url and I am unable to crawl #19

Closed smokingwheels closed 9 years ago

smokingwheels commented 9 years ago

I think this was closed on the Mantis bug tracker some time ago. I would like to crawl www.justiceparty.com.au. All YaCy picks up is the main URL and that's it. http://www.justiceparty.com.au/#!issues/nxmta is one of the site's links that won't crawl. Thanks in advance.

reger24 commented 9 years ago

The Google proposal to use hash-bang URLs (as this site does) to allow g.xx..bot to crawl Ajax sites has been deprecated;
see https://developers.google.com/webmasters/ajax-crawling/
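
For context, the deprecated scheme mapped a hash-bang URL to a crawlable one by moving everything after `#!` into an `_escaped_fragment_` query parameter. Below is a minimal sketch of that mapping, not YaCy's code; the function name `to_escaped_fragment_url` is hypothetical, and the set of characters left unescaped is a conservative assumption.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def to_escaped_fragment_url(url: str) -> str:
    """Map a hash-bang URL to its _escaped_fragment_ form (hypothetical helper)."""
    parts = urlsplit(url)
    # Only fragments that start with "!" are hash-bang fragments.
    if not parts.fragment.startswith("!"):
        return url
    # Percent-encode the fragment; keeping "/" and "=" literal is an assumption.
    token = quote(parts.fragment[1:], safe="/=")
    # Append to an existing query string if there is one.
    separator = "&" if parts.query else ""
    query = f"{parts.query}{separator}_escaped_fragment_={token}"
    # Rebuild the URL without the fragment.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, ""))


if __name__ == "__main__":
    print(to_escaped_fragment_url("http://www.justiceparty.com.au/#!issues/nxmta"))
```

For the link from this issue, this yields http://www.justiceparty.com.au/?_escaped_fragment_=issues/nxmta. Requesting that URL only returns page content if the site itself still answers `_escaped_fragment_` requests.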

reger24 commented 9 years ago

Crawling of the main URL content was improved in commit 9252e36aeb3765ba06d4dcf5543ad2e64c70bd4e

P.S. The official bug tracker is still the Mantis site.