danpros / htmly

Simple and fast databaseless PHP blogging platform, and Flat-File CMS
https://www.htmly.com
GNU General Public License v2.0
1.07k stars 263 forks

Idea: Cleanup robots.txt & htaccess - Your thoughts please #193

Closed camya closed 9 months ago

camya commented 9 years ago

Hi

This is only a DRAFT... Any suggestions and helping hands are welcome!

At the moment the robots.txt for search engines contains entries for paths that are now also restricted by .htaccess rules. (See 1: Cleanup robots.txt) In my opinion we can clean up the robots.txt.

We should also use .htaccess to restrict access to some files, such as composer files, readmes, ... (See 2: Add FilesMatch to main .htaccess) - I created a first version of a FilesMatch.

Also we can remove some unneeded .htaccess files (See 3: Current Deny/Allow structure in htmly via .htaccess)

Please add your suggestions and thoughts.

1: Cleanup robots.txt

This is only for search engines. Remove all entries from robots.txt except the User-agent: *

User-agent: *

Search engines can then index all files that are not restricted by any .htaccess rule.

2: Add FilesMatch to main .htaccess (in the main directory only)

This is for security. Deny access to various files on the web server. These FilesMatch rules are also applied to all subfolders.

There are more candidates for these matches. Feel free to add your ideas.

# deny file access: generic files (?i: ignores the case)
<FilesMatch (?i:(composer\.(json|lock|phar)|(readme|license|copyright)\.(md|txt)))>
    Deny from all
</FilesMatch>

# deny file access: htmly framework files
<FilesMatch (?i:(humans\.txt|\.updateignore))>
    Deny from all
</FilesMatch>

The FilesMatch blocks above deny access to the files named in their patterns. They also ignore the case of the names, e.g. CopyRIGHT.TxT will match too.
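For reference, here is a sketch of how the first rule could look on servers running Apache 2.4, where `Require all denied` (mod_authz_core) replaces the 2.2-style `Deny from all`. The `<IfModule>` switch, the quotes and the `^...$` anchors are additions to the draft above. The anchors assume that only exact file names should be blocked (the unanchored draft also matches names that merely contain e.g. `readme.txt`). This is untested against an htmly installation.

```apache
# deny file access: generic files (?i: ignores the case)
# Quoted and anchored so that only whole file names match.
<FilesMatch "(?i:^(composer\.(json|lock|phar)|(readme|license|copyright)\.(md|txt))$)">
    <IfModule mod_authz_core.c>
        # Apache 2.4 and later
        Require all denied
    </IfModule>
    <IfModule !mod_authz_core.c>
        # Apache 2.2
        Order allow,deny
        Deny from all
    </IfModule>
</FilesMatch>
```

The same wrapper would apply to the htmly-specific FilesMatch block.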

Idea: Maybe it's a good idea to remove the humans.txt and add the content to the README file.

3: Current Deny/Allow structure in htmly via .htaccess

The following .htaccess files can be removed in my opinion, because /system/ already sets the "Deny all".

Please add your suggestions and thoughts.

This is a followup to the pull request: https://github.com/danpros/htmly/pull/192

Kanti commented 9 years ago

The humans.txt is meant to be publicly reachable.

Read more here: http://humanstxt.org/

We should not remove humans.txt

danpros commented 9 years ago

I'm just adding a category feature to the core. Please have a look or test it: #194

camya commented 9 years ago

@Kanti - I'll then remove the humans.txt from the FilesMatch.

The FilesMatch now looks like this.

# deny file access: htmly framework files
<FilesMatch (?i:(\.updateignore))>
    Deny from all
</FilesMatch>

camya commented 9 years ago

I found 2 candidates for the robots.txt. Login and API are publicly accessible, but shouldn't be indexed by any search engine.

User-agent: *

Disallow: /login
Disallow: /api

danpros commented 9 years ago

Using .htaccess to deny access to almost all of the folders will lead to trouble. I just upgraded an old installation and its resources were blocked because of the .htaccess placed inside the themes folder.

danpros commented 9 years ago

It seems we must revert it first, before we test it properly in different server environments.

camya commented 9 years ago

Hi Dan.

I just took a look at the contents of the theme folders shipped with htmly-2.6.1 by default.

To me it looks like the theme folders only contain PHP files that are not meant to be accessed directly but are used by the framework (like post.html.php or main.html.php, for example). The web server should not serve these files directly.

The only files the web server needs to access directly are located inside the theme's css, fonts, images (img) and js folders. These are allowed by the .htaccess files.

I guess the problem is that users added additional scripts to the theme folder. Then the .htaccess does indeed block access to them too.

It's indeed better to remove the .htaccess files from the themes folder to avoid problems.

Instead we then need to add some kind of if ("Framework loaded") condition to all framework files within the themes folder. (WordPress does it the same way.)

Each (framework) php file within the theme folder will start with this condition:

<?php if (!defined('HTMLY')) exit(); ?>

In the index.php in our doc root we add the line:

<?php define('HTMLY', true); ?> 

What do you think?

camya commented 9 years ago

@danpros Did your old installation use the default theme? It looks like I forgot to add an .htaccess file inside the css folder of /theme/default/. Maybe this was the problem?

danpros commented 9 years ago

@greenphp many users create their own themes, so we should not put an .htaccess file inside the theme folder.

Sorry, I am currently working on the new release version and will release it very soon, perhaps in a few minutes, so my current goal is to make sure the content migration works as expected. After this release we can improve htmly without any limits again (in terms of content creation).

camya commented 9 years ago

Fine. I'll wait for the new release then.

danpros commented 9 years ago

@greenphp released! :smile:

Please make sure to use PHP 5.3 compatible syntax (the short array syntax requires PHP 5.4), e.g.:

$posts = array();

Instead of

$posts = [];

camya commented 9 years ago

Great, I've already updated to the new version. Congratulations.

I will avoid the [] syntax in my future commits.

Kanti commented 9 years ago

@danpros Should we really support PHP 5.3? It no longer gets security patches. The last security patch was 1 year ago.

danpros commented 9 years ago

@greenphp thanks, the only change for old themes is the call for the related posts.

@Kanti we should not drop PHP 5.3 yet, since many popular OS versions still ship it in their default repos, e.g. CentOS 6 as far as I know.

camya commented 9 years ago

I've added a pull request https://github.com/danpros/htmly/pull/196 for the defined('HTMLY') conditions within the theme template files. Any feedback is welcome.

About the htaccess structure:

We should also rethink the .htaccess structure and test it.

The folders listed below are the candidates. All of them are "framework" folders. Users normally shouldn't put scripts or assets inside these framework folders. Am I right? Also, nobody should directly access the files inside these folders using a web browser. (Except /system/admin/editor/ and /system/admin/resources/)

The theme folder won't contain an .htaccess, to avoid problems with scripts added by users. #196 adds a condition to the framework PHP files within the theme folders to prevent direct access to them.

The only problem could be the "/system/plugin" folder. Are there already plugins for HTMLY?

Should I commit the above listed .htaccess files again? Then we can test them.

danpros commented 9 months ago

This issue is too old, so I will close it. Please create a new issue for possible improvements. Thanks.