veliovgroup / jazeee-meteor-spiderable

Fork of Meteor Spiderable with longer timeout, caching, better server handling
https://atmospherejs.com/jazeee/spiderable-longer-timeout

Persist original request object in Spiderable variable #1

Closed: lucazulian closed this issue 9 years ago

lucazulian commented 9 years ago

I'm using Spiderable with PhantomJS, and I need the original request object in order to read its headers (in my case, the accept-language header). We could achieve this, and let Meteor code use that property, by modifying the spiderable_server.js file like this:

WebApp.connectHandlers.use(function (req, res, next) {
  // _escaped_fragment_ comes from Google's AJAX crawling spec:
  // https://developers.google.com/webmasters/ajax-crawling/docs/specification
  if (/\?.*_escaped_fragment_=/.test(req.url) ||
      _.any(Spiderable.userAgentRegExps, function (re) {
        return re.test(req.headers['user-agent']); })) {

    Spiderable.originalReq = req; // this is the new property

    var url = Spiderable._urlForPhantom(Meteor.absoluteUrl(), req.url);
    // ... the rest of the handler is unchanged

This way, I can use Spiderable.originalReq to read the original request headers from Meteor. Does that all make sense? :) Let me know if you want me to submit a pull request.

jazeee commented 9 years ago

Seems reasonable. I'll add the following line, as suggested: Spiderable.originalRequest = req;

Note: I prefer fully spelled-out variable names (originalRequest), so that's what I used. I also updated the Readme.md to match.

It is published to Meteor as v1.1.3. Update using: meteor update jazeee:spiderable-longer-timeout
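
For anyone landing here with the same use case, here is a minimal sketch of reading the accept-language header from the stored request. It assumes Spiderable.originalRequest is the Node-style request object assigned in the handler above, whose header names Node lowercases. The helper name preferredLanguage and the fallback value are hypothetical and not part of the package. The Spiderable object below is a stand-in so the snippet runs on its own.

```javascript
// Hypothetical helper (not part of the package): returns the language tag
// with the highest q-value from a request's accept-language header.
function preferredLanguage(request, fallback) {
  var header = request && request.headers && request.headers['accept-language'];
  if (!header) {
    return fallback;
  }
  var best = null;
  var bestQ = 0; // entries with q=0 are treated as "not acceptable"
  header.split(',').forEach(function (part) {
    var pieces = part.trim().split(';');
    var tag = pieces[0].trim();
    var q = 1; // per HTTP, a missing q parameter means q=1
    for (var i = 1; i < pieces.length; i++) {
      var param = pieces[i].trim();
      if (param.indexOf('q=') === 0) {
        q = parseFloat(param.slice(2));
      }
    }
    // Skip wildcards and malformed q-values; the first of equal q-values wins.
    if (tag && tag !== '*' && q > bestQ) {
      best = tag;
      bestQ = q;
    }
  });
  return best || fallback;
}

// Stand-in for the package global, for illustration only. In a Meteor app,
// the package itself sets Spiderable.originalRequest.
var Spiderable = {
  originalRequest: {
    headers: { 'accept-language': 'it-IT,it;q=0.9,en;q=0.8' }
  }
};

console.log(preferredLanguage(Spiderable.originalRequest, 'en'));
```

Since the property is reassigned on each crawler request the handler matches, it reflects the most recent such request, so read it while that request is being rendered.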