ekalinin / robots.js

Parser for robots.txt for node.js
MIT License

Sitemap #7

Closed · reezer closed this 11 years ago

reezer commented 11 years ago

What about also providing a function that returns the links to the sitemaps?

ekalinin commented 11 years ago

What do you mean?

reezer commented 11 years ago

Like this:

http://support.google.com/webmasters/bin/answer.py?hl=en&answer=183669

Once the robots.txt file has been fetched, it could, for example, provide an array of sitemap links.
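
For reference, the linked page describes declaring sitemaps in robots.txt with Sitemap: fields, and a single file may list several. A small illustrative sample (the URLs are made up):

User-agent: *
Disallow:

Sitemap: http://example.com/sitemap_1.xml
Sitemap: http://example.com/sitemap_2.xml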

ekalinin commented 11 years ago

Is that what you mean?

var robots = require('robots')
  , parser = new robots.RobotsParser();

// Fetch and parse robots.txt; the callback fires once it has been read.
parser.setUrl('http://nodeguide.ru/robots.txt', function (parser, success) {
  if (success) {
    parser.getSitemaps(function (sitemaps) {
        /*
         sitemaps is an array of links like:
          http://nodeguide.ru/sitemap_1.xml
          http://nodeguide.ru/sitemap_2.xml
        */
        sitemaps.forEach(function (item) {
            console.log(item);
        });
    });
  }
});
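
Under the hood this boils down to scanning the fetched robots.txt body for case-insensitive Sitemap: fields. A minimal standalone sketch of that technique (illustrative names only, not the actual robots.js code):

function extractSitemaps(robotsTxt) {
  var sitemaps = [];
  robotsTxt.split(/\r?\n/).forEach(function (line) {
    // The "Sitemap:" field name is case-insensitive per the sitemaps protocol.
    var match = /^\s*sitemap\s*:\s*(\S+)\s*$/i.exec(line);
    if (match) {
      sitemaps.push(match[1]);
    }
  });
  return sitemaps;
}

console.log(extractSitemaps('User-agent: *\nSitemap: http://example.com/sitemap_1.xml'));
// => [ 'http://example.com/sitemap_1.xml' ]
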
reezer commented 11 years ago

Right. I just think it would be amazing if there were a library that parsed all the robots.txt stuff in one go.

ekalinin commented 11 years ago

Yes, it's a good idea. I'll try to add this feature in the next couple of weeks.