picocms / Pico

Pico is a stupidly simple, blazing fast, flat file CMS.
http://picocms.org/
MIT License

All external Links in Pico are provided with NoFollow #394

Closed ghost closed 7 years ago

ghost commented 7 years ago

Hello Developers,

How can I add rel="nofollow" to all external links on a Pico website? What do I need to edit, and how does this work with Markdown syntax?

Thanks in Advance.

PhrozenByte commented 7 years ago

First of all: I don't think that this is a good idea. Nevertheless, Pico is open source and has a powerful plugin system that allows you to achieve this on your own. You'll have to extend Parsedown, the Markdown parser Pico uses - there's an article about extending the parser in the Parsedown wiki. Starting with Pico 2.0 (#334) you can replace Pico's Parsedown parser object using the onParsedownRegistered event.

https://github.com/picocms/Pico/blob/fc76d37dbca70c7e6aefa3807f005860d73bbb0b/plugins/DummyPlugin.php#L426-L436

ghost commented 7 years ago

OK, thanks for the tips. I am not concerned about links to colleagues' websites, only about links to my own other projects, which I link to from their own Pico blog.

So I wanted to have rel="nofollow" and solved it with:

Robots: index,nofollow

That should work. It only applies to my own links, though. Foreign links may remain dofollow 😸

marcus-at-localhost commented 7 years ago

I would use DOMDocument for this. Create a plugin file named TweakOutput.php and put this content in it:

<?php

final class TweakOutput extends AbstractPicoPlugin
{
    /**
     * This plugin is enabled by default?
     *
     * @see AbstractPicoPlugin::$enabled
     * @var boolean
     */
    protected $enabled = true;

    /**
     * Triggered after Pico has rendered the page
     *
     * @param  string &$output contents which will be sent to the user
     * @return void
     */
    public function onPageRendered(&$output)
    {
        // skip XML documents (e.g. feeds): send them as XML and leave the output untouched
        if (strpos($output, '<?xml') !== false) {
            header('Content-Type: text/xml; charset=utf-8');
            return;
        }

        $output = mb_convert_encoding($output, 'HTML-ENTITIES', 'UTF-8');

        $doc = new DOMDocument();
        $doc->preserveWhiteSpace = false;

        $doc->loadHTML($output);

        // add target="_blank" to external links
        $links = $doc->getElementsByTagName('a');
        foreach ($links as $item) {
            $href = parse_url($item->getAttribute('href'), PHP_URL_HOST);
            if (!empty($href) && $href != $_SERVER['HTTP_HOST']){
                $item->setAttribute('target','_blank');
                //$item->setAttribute('rel','nofollow');
            }
        }
        $output = $doc->saveHTML();
    }
}

This script goes through all <a> elements and compares the host of each href value with the current $_SERVER['HTTP_HOST']. If it is an external link, it adds target="_blank" so the page opens in a new browser window. Uncomment $item->setAttribute('rel','nofollow'); and all external links also get a rel="nofollow" attribute. Tweak the if condition to exclude certain hosts from this treatment, for example:

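// both comparisons must hold, so they are joined with && (with OR the condition would be true for every host)
if (!empty($href) && $href != $_SERVER['HTTP_HOST'] && $href != 'myownproject.com') {
    $item->setAttribute('rel', 'nofollow');
}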

Depending on how valid your input HTML is, DOMDocument may choke and mess things up, but that's a different story :)
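The whitelist idea above can be sketched as a standalone function, independent of Pico. This is only a minimal sketch under some assumptions: the function name addNoFollowToExternalLinks and the $ownHosts parameter are made up for illustration, the input is a complete HTML document, and character encoding handling is left out. It uses libxml_use_internal_errors() so that DOMDocument collects parser complaints about imperfect HTML instead of printing warnings.

```php
<?php
/**
 * Hypothetical helper (not part of Pico): adds rel="nofollow" to every link
 * whose host is not listed in $ownHosts. Relative links have no host and are
 * therefore left alone.
 */
function addNoFollowToExternalLinks(string $html, array $ownHosts): string
{
    // host names are case-insensitive, so compare them in lower case
    $ownHosts = array_map('strtolower', $ownHosts);

    $doc = new DOMDocument();

    // collect libxml parse errors internally instead of emitting warnings
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($doc->getElementsByTagName('a') as $link) {
        // parse_url() yields null when the href has no host (relative URL, mailto:, ...)
        $host = parse_url($link->getAttribute('href'), PHP_URL_HOST);
        if (is_string($host) && $host !== '' && !in_array(strtolower($host), $ownHosts, true)) {
            $link->setAttribute('rel', 'nofollow');
        }
    }

    return $doc->saveHTML();
}

// Usage: treat the current host and one additional project as "own" hosts.
$page = '<html><body><a href="https://example.org/">external</a> '
      . '<a href="https://myownproject.com/">own</a></body></html>';
echo addNoFollowToExternalLinks($page, ['blog.example', 'myownproject.com']);
```

Inside the plugin, the same loop would run in onPageRendered() with $_SERVER['HTTP_HOST'] as one of the own hosts.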

ghost commented 7 years ago

Thanks for this work, I have just tried it.

When I open the homepage of my Pico blog I get this error:

Parse error: syntax error, unexpected '') === false) ){' (T_CONSTANT_ENCAPSED_STRING) in 
/xxxxxxxxxxx/picocms.wpzweinull.ch/plugins/TweakOutput.php on line 21

Does this mean that my HTML is not valid? I write my posts with MarkdownPad 2.

marcus-at-localhost commented 7 years ago

Something is wrong with this line:

if( ! (strpos($output,'<?xml') === false) ){

I don't know what exactly, but you can comment out the whole if block to get past this and see if the script works. Also google the error message and see if you can find what causes it. (Did you start that plugin file with <?php ?)
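One possible explanation, which is not confirmed anywhere in this thread: if the file does not begin with <?php and the server has short_open_tag enabled, PHP treats everything as plain text until it reaches the first "<?", which here sits inside the '<?xml' string, and starts parsing from there. A minimal sketch of the same check that avoids this character sequence in the source follows. The function name looksLikeXml is made up for illustration.

```php
<?php
/**
 * Hypothetical helper (not part of Pico): reports whether the rendered
 * output contains an XML declaration.
 */
function looksLikeXml(string $output): bool
{
    // build the marker from two pieces so the source file never contains
    // the literal "<" followed by "?"
    $marker = '<' . '?xml';

    return strpos($output, $marker) !== false;
}
```

In the plugin, the if condition would then read if (looksLikeXml($output)). Starting the file with <?php remains necessary either way.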