Open alanmels opened 4 years ago
Since Yahoo! is already history, I've replaced it with Bing in my PR. I also didn't have time to investigate why Drupal had no entry for the /sites/ directory, so I followed suit, but I introduced new disallow rules for /layouts/, /modules/ and /themes/.
I was also not sure whether the /files/ directory in Backdrop's root should be included in the disallowed section.
@alanmels Many thanks for providing a PR.
I do have concerns, though:
Disallow: /themes/
This means that, for instance, the logo provided by a (custom) theme is also forbidden. I'm pretty sure this is not as intended.
The same concern would apply to the files directory. Why prevent indexing of all images, pdf, ...?
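The concern above can be checked with a small sketch using Python's standard `urllib.robotparser`. The rule set and the three paths are hypothetical, chosen only to illustrate the point; the `/files/` line is the one the PR author was unsure about.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rule set modelled on the rules discussed in this thread.
ROBOTS_TXT = """\
User-agent: *
Disallow: /themes/
Disallow: /files/
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt content from a string instead of fetching a URL."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# Hypothetical paths, for illustration only.
for path in ("/themes/mytheme/logo.png", "/files/report.pdf", "/about"):
    verdict = "allowed" if parser.can_fetch("*", path) else "blocked"
    print(f"{path} -> {verdict}")
```

A crawler that honors these rules would skip both the theme logo and the uploaded PDF, while ordinary pages stay crawlable.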
/profiles/ still exists and is used in Backdrop if you create it:
Guys, thanks for the comments!
I do have concerns, though:
Disallow: /themes/
This means that, for instance, the logo provided by a (custom) theme is also forbidden. I'm pretty sure this is not as intended.
The same concern would apply to the files directory. Why prevent indexing of all images, pdf, ...?
@indigoxela, I am not sure why Drupal's robots.txt listed /themes/ among the disallowed directories. The rationale was probably that the /themes/ directory contains more or less static files rather than content. By the same logic, the /files/ directory was probably left out because it does contain dynamically changing content that can be crawled.
I'd like to hear more opinions on that. If the consensus is to follow Drupal's lead, I'll remove the /files/ directory from the PR and leave /themes/ in the disallow rules. What do you guys say?
/profiles/ still exists and is used in Backdrop if you create it:
@BWPanda, thanks for pointing this out. I'm ready to change the PR after hearing more opinions on why Drupal's robots.txt file puts the directory under a disallow rule while putting only files with certain extensions within that directory under allow rules:
Allow: /profiles/*.css$
Allow: /profiles/*.css?
Allow: /profiles/*.js$
Allow: /profiles/*.js?
Allow: /profiles/*.gif
Allow: /profiles/*.jpg
Allow: /profiles/*.jpeg
Allow: /profiles/*.png
# Directories
Disallow: /includes/
Disallow: /misc/
Disallow: /modules/
Disallow: /profiles/
Disallow: /scripts/
Disallow: /themes/
Should we do the same?
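To make the interplay of those rules concrete, here is a minimal sketch of how a crawler that follows the "longest matching pattern wins, Allow wins ties" convention (as described in RFC 9309) might evaluate them. It is not any particular crawler's implementation; the function names and test paths are hypothetical. Python's built-in `urllib.robotparser` is not used here because it does plain prefix matching and does not interpret `*` or `$`.

```python
import re

# The /profiles/ rules quoted above, as (kind, pattern) pairs.
RULES = [
    ("allow", "/profiles/*.css$"),
    ("allow", "/profiles/*.css?"),
    ("allow", "/profiles/*.js$"),
    ("allow", "/profiles/*.js?"),
    ("allow", "/profiles/*.gif"),
    ("allow", "/profiles/*.jpg"),
    ("allow", "/profiles/*.jpeg"),
    ("allow", "/profiles/*.png"),
    ("disallow", "/profiles/"),
]


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt pattern: '*' is a wildcard, trailing '$' anchors the end."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Everything except '*' is literal, including '?'.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(path: str, rules=RULES) -> bool:
    """Return the verdict of the longest matching rule; no match means allowed."""
    best_len, allowed = -1, True
    for kind, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            length = len(pattern)
            if length > best_len or (length == best_len and kind == "allow"):
                best_len, allowed = length, kind == "allow"
    return allowed


# Hypothetical paths, for illustration only.
for path in ("/profiles/standard/style.css",
             "/profiles/standard/style.css?v=1",
             "/profiles/standard/standard.install",
             "/node/1"):
    print(path, "->", "allowed" if is_allowed(path) else "blocked")
```

Under these assumptions, stylesheets, scripts and images inside /profiles/ remain fetchable (so pages render correctly for the crawler), while everything else in the directory is blocked.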
When comparing D7 and Backdrop, remember that Drupal's core files are in the root directory, while Backdrop's are in the /core directory.
So, for example, if Drupal is excluding /modules/ but not modules/ (i.e. only module directories in the root directory), then they're excluding core modules but not contributed/custom ones.
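Because a rule such as `Disallow: /modules/` is matched from the start of the URL path, it only covers the root-level directory. A small sketch with Python's standard `urllib.robotparser` shows what that would mean for Backdrop's layout; the two file paths are hypothetical examples.

```python
from urllib.robotparser import RobotFileParser

# A single root-anchored rule, as in Drupal 7's robots.txt.
rules = RobotFileParser()
rules.parse(["User-agent: *", "Disallow: /modules/"])


def crawlable(path: str) -> bool:
    """True if a generic crawler ('*') may fetch the given path."""
    return rules.can_fetch("*", path)


# Hypothetical paths: a contrib module in the root, and a core module under /core.
print("/modules/example/example.css ->", crawlable("/modules/example/example.css"))
print("/core/modules/node/node.css ->", crawlable("/core/modules/node/node.css"))
```

In Backdrop the same rule therefore hides contributed/custom modules in the root while leaving /core/modules/ crawlable, which is the reverse of its effect in Drupal 7.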
I just realized that, unless I'm missing something, our change records do not mention that (as in Drupal 8) all core files and folders have been moved under the /core directory, and that the top-level /modules, /themes and /layouts folders are to be used to hold custom and contrib projects instead of core.
Description of the bug
I believe that entries in the robots.txt file such as:
are Drupal-7 remnants (https://git.drupalcode.org/project/drupal/-/tree/7.73/profiles, https://git.drupalcode.org/project/drupal/-/raw/7.73/web.config) as they do not exist in Backdrop:
At the same time, the corresponding robots.txt file for Drupal 7 has disallow directives for:
whereas Backdrop's robots.txt file has only:
leaving several directories in the root directory unprotected.
Expected behavior
It doesn't hurt to leave those two entries in the robots.txt file; however, I believe Backdrop's code base should eventually be cleaned of Drupal 7 traces that will never be used. I also believe the robots.txt file needs to disallow crawling of Backdrop-specific directories such as files, layouts, modules, sites and themes.