spatie / laravel-link-checker

Check all links in a Laravel application
https://murze.be/2015/11/a-package-to-check-all-links-in-a-laravel-app/
MIT License

THIS PACKAGE IS NOT MAINTAINED ANYMORE

Check all links in a Laravel app


This package provides a command that can check all links in your Laravel app. By default, it logs every link that does not return a status code in the 200 or 300 range. There's also an option to mail broken links.

If you like this package, take a look at the other ones we have made.

Install

You can install the package via composer:

composer require spatie/laravel-link-checker

On Laravel versions that support package auto-discovery (5.5 and later), the service provider will be registered automatically. On older versions, you must register it yourself:

// config/app.php
'providers' => [
    ...
    Spatie\LinkChecker\LinkCheckerServiceProvider::class,
];

Optionally, you can publish the config file with:

php artisan vendor:publish --provider="Spatie\LinkChecker\LinkCheckerServiceProvider" --tag="config"

These are the contents of the published config file:

return [

    /*
     * The base url of your app. Leave this empty to use
     * the url configured in config/app.php
     */
    'url' => '',

    /*
     * The profile determines which links need to be checked.
     */
    'default_profile' => Spatie\LinkChecker\CheckAllLinks::class,

    /*
     * The reporter determines what needs to be done when
     * the crawler has visited a link.
     */
    'default_reporter' => Spatie\LinkChecker\Reporters\LogBrokenLinks::class,

    /*
     * To speed up the checking process we'll fire off requests concurrently.
     * Here you can change the amount of concurrent requests.
     */
    'concurrency' => 10,

    /*
     * Here you can specify configuration for the reporters.
     */
    'reporters' => [

        'mail' => [

            /*
             * The `from` address to be used by the mail reporter.
             */
            'from_address' => '',

            /*
             * The `to` address to be used by the mail reporter.
             */
            'to_address' => '',

            /*
             * The subject line to be used by the mail reporter.
             */
            'subject' => '',
        ],

        /*
         * If you wish to exclude status codes from the reporters,
         * you can select the status codes that you wish to
         * exclude in the array below like: [200, 302]
         */
        'exclude_status_codes' => [],
    ],
];
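
To illustrate how the default rule (status codes outside the 200 and 300 ranges count as broken) could combine with the exclude_status_codes option, here is a minimal sketch. The function name shouldReportStatus is hypothetical, and this is an assumption about the behaviour described above, not the package's actual implementation.

```php
<?php

/**
 * Hypothetical helper (not part of the package): decide whether a
 * status code should be reported as a broken link.
 *
 * @param int   $statusCode          HTTP status code returned for the link
 * @param int[] $excludedStatusCodes codes to ignore, as in 'exclude_status_codes'
 */
function shouldReportStatus(int $statusCode, array $excludedStatusCodes = []): bool
{
    // Explicitly excluded codes are never reported.
    if (in_array($statusCode, $excludedStatusCodes, true)) {
        return false;
    }

    // Anything outside the 200 and 300 ranges counts as broken.
    return $statusCode < 200 || $statusCode >= 400;
}
```

Under this sketch, shouldReportStatus(404) is true, shouldReportStatus(302) is false, and shouldReportStatus(404, [404]) is false because the code was excluded.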

Usage

You can start checking all links by issuing this command:

php artisan link-checker:run

Want to run the crawler on a different URL? No problem!

php artisan link-checker:run --url=https://laravel.com

Schedule the command

To check all links regularly, you can schedule the command:

// app/Console/Kernel.php

protected function schedule(Schedule $schedule)
{
    ...
    $schedule->command('link-checker:run')->sundays()->daily();
}

Mail broken links

By default, the package logs all broken links. To have them mailed instead, set the default_reporter option in the config file to Spatie\LinkChecker\Reporters\MailBrokenLinks. The addresses and subject used by the mail reporter come from the reporters.mail section of the same file.
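
As an illustration, an excerpt of the published config file with the mail reporter enabled might look like the sketch below. The keys are the ones shown in the config file above; the addresses and subject are placeholder values, not defaults shipped with the package.

```php
<?php

// Excerpt of the published config file; only the relevant keys are shown.
return [
    // Mail broken links instead of logging them.
    'default_reporter' => Spatie\LinkChecker\Reporters\MailBrokenLinks::class,

    'reporters' => [
        'mail' => [
            // Placeholder values: replace with your own addresses and subject.
            'from_address' => 'noreply@example.com',
            'to_address' => 'webmaster@example.com',
            'subject' => 'Broken links found',
        ],
    ],
];
```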

Creating your own crawl profile

A crawl profile determines which links need to be crawled. By default Spatie\LinkChecker\CheckAllLinks is used, which checks every link it finds. This behaviour can be customized by specifying a class in the default_profile option in the config file. The class must extend the abstract class Spatie\Crawler\CrawlProfile:

abstract class CrawlProfile
{
    /**
     * Determine if the given url should be crawled.
     *
     * @param \Psr\Http\Message\UriInterface $url
     *
     * @return bool
     */
    abstract public function shouldCrawl(UriInterface $url): bool;
}
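
The decision a custom profile makes can be quite small. The sketch below shows one possible rule, restricting the crawl to a single host, written as a dependency-free class so it can be run on its own. SameHostRule and its allows() method are hypothetical names and are not part of this package or of spatie/crawler.

```php
<?php

/**
 * Hypothetical rule (not part of the package): only allow URLs whose
 * host matches the host of a given base URL.
 */
final class SameHostRule
{
    /** @var string lower-cased host of the base URL, or '' if it has none */
    private $host;

    public function __construct(string $baseUrl)
    {
        // parse_url() yields null (or false) when no host can be extracted;
        // the cast turns both into an empty string.
        $this->host = strtolower((string) parse_url($baseUrl, PHP_URL_HOST));
    }

    public function allows(string $url): bool
    {
        $host = parse_url($url, PHP_URL_HOST);

        // Host names are case-insensitive, so compare them lower-cased.
        return is_string($host)
            && $this->host !== ''
            && strtolower($host) === $this->host;
    }
}
```

In a class extending Spatie\Crawler\CrawlProfile, shouldCrawl() could then return $rule->allows((string) $url), since a PSR-7 UriInterface can be cast to a string.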

Creating your own reporter

A reporter determines what should be done when a link is crawled and when the crawling process is finished. This package provides two reporters: Spatie\LinkChecker\Reporters\LogBrokenLinks and Spatie\LinkChecker\Reporters\MailBrokenLinks. You can create your own behaviour by making a class extend the abstract class Spatie\Crawler\CrawlObserver:

abstract class CrawlObserver
{
    /**
     * Called when the crawler will crawl the url.
     *
     * @param \Psr\Http\Message\UriInterface $url
     */
    public function willCrawl(UriInterface $url)
    {
    }

    /**
     * Called when the crawler has crawled the given url successfully.
     *
     * @param \Psr\Http\Message\UriInterface $url
     * @param \Psr\Http\Message\ResponseInterface $response
     * @param \Psr\Http\Message\UriInterface|null $foundOnUrl
     */
    abstract public function crawled(
        UriInterface $url,
        ResponseInterface $response,
        ?UriInterface $foundOnUrl = null
    );

    /**
     * Called when the crawler had a problem crawling the given url.
     *
     * @param \Psr\Http\Message\UriInterface $url
     * @param \GuzzleHttp\Exception\RequestException $requestException
     * @param \Psr\Http\Message\UriInterface|null $foundOnUrl
     */
    abstract public function crawlFailed(
        UriInterface $url,
        RequestException $requestException,
        ?UriInterface $foundOnUrl = null
    );

    /**
     * Called when the crawl has ended.
     */
    public function finishedCrawling()
    {
    }
}

To make it easier to create a reporter, you can extend Spatie\LinkChecker\Reporters\BaseReporter which provides many useful methods.
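
As a rough sketch of the bookkeeping a custom reporter might do, the class below collects broken links and turns them into readable lines. It is deliberately dependency-free: BrokenLinkCollector, record() and lines() are hypothetical names, and nothing here reflects the methods that BaseReporter actually provides.

```php
<?php

/**
 * Hypothetical collector (not part of the package) that a custom
 * reporter could fill from crawled() and crawlFailed().
 */
final class BrokenLinkCollector
{
    /** @var array[] each entry holds the keys 'url', 'status' and 'foundOn' */
    private $brokenLinks = [];

    public function record(string $url, int $statusCode, ?string $foundOnUrl = null): void
    {
        // Status codes in the 200 and 300 ranges are treated as healthy.
        if ($statusCode >= 200 && $statusCode < 400) {
            return;
        }

        $this->brokenLinks[] = [
            'url' => $url,
            'status' => $statusCode,
            'foundOn' => $foundOnUrl,
        ];
    }

    /** @return string[] one human-readable line per broken link */
    public function lines(): array
    {
        return array_map(function (array $link): string {
            $origin = $link['foundOn'] !== null ? " (found on {$link['foundOn']})" : '';

            return "{$link['status']} {$link['url']}{$origin}";
        }, $this->brokenLinks);
    }
}
```

A class extending Spatie\Crawler\CrawlObserver could call record() from crawled() with $response->getStatusCode(), and write out the result of lines() in finishedCrawling().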

Changelog

Please see CHANGELOG for more information on what has changed recently.

Testing

First, start the test server in a separate terminal session:

cd tests/server
./start_server.sh

With the server running, you can execute the tests:

composer test

Contributing

Please see CONTRIBUTING for details.

Security

If you discover any security related issues, please email freek@spatie.be instead of using the issue tracker.

Postcardware

You're free to use this package (it's MIT-licensed), but if it makes it to your production environment we highly appreciate you sending us a postcard from your hometown, mentioning which of our package(s) you are using.

Our address is: Spatie, Samberstraat 69D, 2060 Antwerp, Belgium.

All postcards are published on our website.

Credits

Support us

Spatie is a web design agency based in Antwerp, Belgium. You'll find an overview of all our open source projects on our website.

Does your business depend on our contributions? Reach out and support us on Patreon. All pledges will be dedicated to allocating workforce on maintenance and new awesome stuff.

License

The MIT License (MIT). Please see License File for more information.