-
https://en.wikipedia.org/wiki/Robots_exclusion_standard
Any request that isn't allowed by robots.txt should be reported as such.
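A minimal sketch of the reporting idea using Python's standard-library `urllib.robotparser` (the robots.txt content and URLs below are hypothetical, for illustration only):

```python
from urllib import robotparser

# Hypothetical robots.txt content for illustration.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

def report_disallowed(urls, agent="*"):
    """Return the subset of URLs that robots.txt does not allow for `agent`."""
    rp = robotparser.RobotFileParser()
    rp.parse(ROBOTS_TXT.splitlines())
    return [u for u in urls if not rp.can_fetch(agent, u)]

print(report_disallowed([
    "https://example.com/index.html",
    "https://example.com/private/data.html",
]))
# → ['https://example.com/private/data.html']
```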
-
Hi,
I'm using `pup v0.4.0`
I cannot select two different attributes using `attr{}`:
Selecting the `title` attribute of the `link[type="application/x-wiki"]` element:
```css
$ curl -qs htt…
sebma updated 2 years ago
-
https://en.wikipedia.org/wiki/Robots_exclusion_standard#Crawl-delay_directive
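The linked Crawl-delay directive can be read with Python's standard-library `urllib.robotparser` via `crawl_delay()` (the robots.txt content below is hypothetical, for illustration):

```python
from urllib import robotparser

# Hypothetical robots.txt declaring a 10-second crawl delay.
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 10
"""

rp = robotparser.RobotFileParser()
rp.parse(ROBOTS_TXT.splitlines())
delay = rp.crawl_delay("*")  # seconds a polite crawler should wait between requests
print(delay)
# → 10
```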
-
Example HTML:
```
Robots exclusion standard
date: xyz
```
Now it would be great if pup could be used in a way that lets you:
- iterate over all `h1`
- for each h1, print out `Robots exclusion standa…
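Since the pup feature request above is truncated, here is a rough Python sketch of the iterate-over-`h1` idea using the standard-library `html.parser` (the input HTML is a made-up example):

```python
from html.parser import HTMLParser

class H1Collector(HTMLParser):
    """Collect the text content of every <h1> element in a document."""
    def __init__(self):
        super().__init__()
        self.in_h1 = False
        self.headings = []

    def handle_starttag(self, tag, attrs):
        if tag == "h1":
            self.in_h1 = True
            self.headings.append("")

    def handle_endtag(self, tag):
        if tag == "h1":
            self.in_h1 = False

    def handle_data(self, data):
        if self.in_h1:
            self.headings[-1] += data

parser = H1Collector()
parser.feed("<h1>Robots exclusion standard</h1><p>date: xyz</p><h1>Second</h1>")
print(parser.headings)
# → ['Robots exclusion standard', 'Second']
```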
-
DSNP should support a flag as part of the user profile to indicate a specific field should not be indexed for the public, therefore not searchable.
This would be similar to https://en.wikipedia.org…
-
While translating [Decentralized Identifiers (DIDs) v1.0](https://www.w3.org/TR/did-core/) into Korean, @lukasjhan spotted several [broken links](https://validator.w3.org/checklink?uri=https%3A%2F%2Fw…
-
see
https://github.com/gjtorikian/robotstxt-parser
-
Based on politeness considerations, it should support robots.txt.
-
To control crawlers that are getting deeper into the database than desirable, including the curator activity data, which is a known bug.
-
Any plans for robots.txt compliance?