cisagov / crossfeed

External monitoring for organization assets
https://docs.crossfeed.cyber.dhs.gov
Creative Commons Zero v1.0 Universal

Update robots.txt #2429

Open cqueern opened 9 months ago

cqueern commented 9 months ago

Updated robots.txt so well-behaved search engine crawlers will not index Crossfeed instances, as I assume they're not meant to appear in search engine results due to the potentially sensitive information they collect.

🗣 Description

💭 Motivation and context

🧪 Testing

✅ Pre-approval checklist

✅ Pre-merge checklist

✅ Post-merge checklist

Matthew-Grayson commented 9 months ago

@cqueern Thanks for taking an interest in Crossfeed! You bring up an interesting point about whether or not we want to be indexed. I believe that allowing indexing was a deliberate decision, but I will look into it further.

cqueern commented 9 months ago

My pleasure! It's an important project.

Sounds good. If the decision is to exclude all robots from the entire server, I suggest that line 3 of the robots.txt file remain:

Disallow: /

If the decision is to allow all robots complete access, I suggest that line 3 of the robots.txt file be updated to:

Disallow:
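
For anyone who wants to check the difference between the two options, here is a minimal sketch using Python's standard-library `urllib.robotparser`. The two-line file contents, the `ExampleBot` user agent, the `crossfeed.example` URL, and the `is_allowed` helper are all assumptions made for illustration; they are not copied from the Crossfeed repository.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, url: str, agent: str = "ExampleBot") -> bool:
    """Return True if a crawler honoring robots_txt may fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Assumed file contents: a wildcard user agent followed by one Disallow rule.
BLOCK_ALL = "User-agent: *\nDisallow: /\n"  # exclude all robots from everything
ALLOW_ALL = "User-agent: *\nDisallow:\n"    # empty value: nothing is disallowed

# Hypothetical URL standing in for a page on a Crossfeed instance.
URL = "https://crossfeed.example/dashboard"

print("Disallow: /  ->", is_allowed(BLOCK_ALL, URL))
print("Disallow:    ->", is_allowed(ALLOW_ALL, URL))
```

Under these assumptions, the first call should report that fetching is not permitted and the second that it is. This only models crawlers that choose to honor robots.txt, which matches the "well-behaved" qualifier in the PR description.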