benkitzelman / google-ajax-crawler

Rack Middleware adhering to the Google Ajax Crawling Scheme, using a headless browser to render JS heavy pages and serve a dom snapshot of the rendered state to a requesting search engine.
MIT License
58 stars 13 forks source link

Crawl the same URI requested by Google, so it works with HTML5 pushState #5

Open vitorbaptista opened 11 years ago

vitorbaptista commented 11 years ago

Hi,

I'm not using #! hashes in my single-page app, but the usual URLs. As, when this project detects that Google is asking for a page, it tries to crawl the page as_uri_with_fragment, it doesn't work.

What I would like was to, whenever Google hits my-site.com/?_escapedfragment=/contact-us, this plugins would simply hit my-site.com/contact-us.

Does it sound useful for someone else?

benkitzelman commented 11 years ago

Yeah good call - I'll make hashbang / push state an option and modify Crawler.as_uri_with_fragment fn to use it appropriately

vitorbaptista commented 11 years ago

Great! The idea is to simply create a new option (maybe hashbang or something like that), and use that to define what you'll do, right? If you're out of time, I can do it and send you a pull request this weekend.

benkitzelman commented 11 years ago

Mate - any pull request you want to do would be greatly appreciated!