gregjacobs / Autolinker.js

Utility to Automatically Link URLs, Email Addresses, Phone Numbers, Twitter handles, and Hashtags in a given block of text/HTML
MIT License

Autolinker.js

Because I had so much trouble finding a good auto-linking implementation out in the wild, I decided to roll my own. It seemed that everything I found out there was either an implementation that didn't cover every case, or was just limited in one way or another.

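So, this utility attempts to handle everything: URLs, email addresses, phone numbers, Twitter handles, and hashtags, whether they appear in plain text or in HTML.

I hope that this utility helps you as well!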

Full API Docs: http://gregjacobs.github.io/Autolinker.js/api/
Live Example: http://gregjacobs.github.io/Autolinker.js/examples/live-example/

v4.0 released September 2022

See Upgrading from v3.x -> v4.x (Breaking Changes) at the bottom of this readme.

Installation

Installing with the npm package manager:

npm install autolinker --save

Installing with the Yarn package manager:

yarn add autolinker

Installing with the Bower package manager:

bower install Autolinker.js --save

Direct download

Simply clone this repository or download a zip of the project, and link to either dist/Autolinker.js or dist/Autolinker.min.js with a script tag.

Importing Autolinker

ES6/TypeScript/Webpack:

import Autolinker from 'autolinker';

Node.js:

const Autolinker = require('autolinker');
// note: npm wants an all-lowercase package name, but the utility is a class and
// should be aliased with a capital letter

Browser

<!-- 'Autolinker.js' or 'Autolinker.min.js' - non-minified is better for 
     debugging, minified is better for users' download time -->
<script src="path/to/autolinker/dist/Autolinker.min.js"></script>

Usage

Using the static link() method:

const linkedText = Autolinker.link(textToAutolink[, options]);

Using as a class:

const autolinker = new Autolinker([ options ]);

const linkedText = autolinker.link(textToAutolink);

Note: if using the same options to autolink multiple pieces of html/text, it is slightly more efficient to create a single Autolinker instance, and run the link() method repeatedly (i.e. use the "class" form above).

Examples:

const linkedText = Autolinker.link("Check out google.com");
// Produces: "Check out <a href="http://google.com" target="_blank" rel="noopener noreferrer">google.com</a>"

const linkedText = Autolinker.link("Check out google.com", { 
    newWindow: false 
});
// Produces: "Check out <a href="http://google.com">google.com</a>"

Options

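Options for linking are specified by providing an Object as the second parameter to Autolinker.link(), or as the argument to the Autolinker constructor. See the Full API Docs for the complete list; the examples in this readme use newWindow, truncate, urls, email, and replaceFn.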

For example, if you wanted to disable links from opening in new windows, you could do:

const linkedText = Autolinker.link("Check out google.com", { 
    newWindow: false 
});
// Produces: "Check out <a href="http://google.com">google.com</a>"

And if you wanted to truncate the length of URLs (while also not opening in a new window), you could do:

const linkedText = Autolinker.link("http://www.yahoo.com/some/long/path/to/a/file", { 
    truncate: 25, 
    newWindow: false 
});
// Produces: "<a href="http://www.yahoo.com/some/long/path/to/a/file">yahoo.com/some/long/pat..</a>"

More Examples

To auto-link all of the unlinked text inside a DOM element, replace its innerHTML with the linked version:

const myTextEl = document.getElementById('text');
myTextEl.innerHTML = Autolinker.link(myTextEl.innerHTML);

Using the same pre-configured Autolinker instance in multiple locations of a codebase (usually by dependency injection):

const autolinker = new Autolinker({ newWindow: false, truncate: 25 });

//...

autolinker.link("Check out http://www.yahoo.com/some/long/path/to/a/file");
// Produces: "Check out <a href="http://www.yahoo.com/some/long/path/to/a/file">yahoo.com/some/long/pat..</a>"

//...

autolinker.link( "Go to www.google.com" );
// Produces: "Go to <a href="http://www.google.com">google.com</a>"
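
The dependency-injection arrangement mentioned above can be sketched roughly as follows. This is only an illustration: CommentRenderer and fakeLinker are hypothetical names, and the stand-in linker exists purely so the snippet runs without the library installed. The one assumption about Autolinker is that an instance exposes link(text), as shown in the Usage section.

```javascript
// Hypothetical consumer: it depends only on "something with a link(text) method",
// so a pre-configured Autolinker instance can be injected from one central place.
class CommentRenderer {
    constructor(linker) {
        this.linker = linker;
    }

    render(comment) {
        return '<p>' + this.linker.link(comment) + '</p>';
    }
}

// Stand-in linker (an assumption for this sketch, not part of Autolinker).
// In an application you would inject the shared instance instead, e.g.:
//   new CommentRenderer(new Autolinker({ newWindow: false, truncate: 25 }))
const fakeLinker = {
    link: (text) => text.replace('google.com', '<a href="http://google.com">google.com</a>'),
};

const renderer = new CommentRenderer(fakeLinker);
console.log(renderer.render('Check out google.com'));
// <p>Check out <a href="http://google.com">google.com</a></p>
```

Because the consumer never constructs its own Autolinker, the options live in one place and the same instance is reused for every call.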

Retrieving the List of Matches

If you're just interested in retrieving the list of Matches without producing a transformed string, you can use the parse() method.

For example:

const matches = Autolinker.parse("Hello google.com, I am asdf@asdf.com", {
    urls: true,
    email: true
});

console.log(matches.length);         // 2
console.log(matches[0].type);        // 'url'
console.log(matches[0].getUrl());    // 'http://google.com' (a missing protocol is assumed to be 'http://')
console.log(matches[1].type);        // 'email'
console.log(matches[1].getEmail());  // 'asdf@asdf.com'
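
Since each match carries a type property (as in the example above), the array returned by parse() can be summarized without generating any HTML. The following is a minimal sketch: countMatchesByType is a hypothetical helper, and the plain objects stand in for real Match instances so that the snippet runs on its own.

```javascript
// Hypothetical helper: tallies matches by their `type` property.
function countMatchesByType(matches) {
    const counts = {};
    for (const match of matches) {
        counts[match.type] = (counts[match.type] || 0) + 1;
    }
    return counts;
}

// Stand-ins for the objects returned by Autolinker.parse(); only `type` is read.
const sampleMatches = [{ type: 'url' }, { type: 'email' }, { type: 'url' }];

console.log(countMatchesByType(sampleMatches)); // { url: 2, email: 1 }
```

With the library installed, the same helper could be called as countMatchesByType(Autolinker.parse(text, options)).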

Custom Replacement Function

A custom replacement function (replaceFn) may be provided to replace url/email/phone/mention/hashtag matches on an individual basis, based on the return from this function.

Full example, for purposes of documenting the API:

const input = "...";  // string with URLs, Email Addresses, Mentions (Twitter, Instagram), and Hashtags

const linkedText = Autolinker.link(input, {
    replaceFn : function(match) {
        console.log("href = ", match.getAnchorHref());
        console.log("text = ", match.getAnchorText());

        switch(match.type) {
            case 'url':
                console.log("url: ", match.getUrl());

                return true;  // let Autolinker perform its normal anchor tag replacement

            case 'email':
                const email = match.getEmail();
                console.log("email: ", email);

                if(email === "my@own.address") {
                    return false;  // don't auto-link this particular email address; leave as-is
                } else {
                    return;  // no return value will have Autolinker perform its normal anchor tag replacement (same as returning `true`)
                }

            case 'phone':
                console.log("Phone Number: ", match.getPhoneNumber());

                return '<a href="http://newplace.to.link.phone.numbers.to/">' + match.getPhoneNumber() + '</a>';

            case 'mention':
                console.log("Mention: ", match.getMention());
                console.log("Mention Service Name: ", match.getServiceName());

                return '<a href="http://newplace.to.link.mention.handles.to/">' + match.getMention() + '</a>';

            case 'hashtag':
                console.log("Hashtag: ", match.getHashtag());

                return '<a href="http://newplace.to.link.hashtag.handles.to/">' + match.getHashtag() + '</a>';
        }
    }
} );

Modifying the default generated anchor tag

const input = "...";  // string with URLs, Email Addresses, Mentions (Twitter, Instagram), and Hashtags

const linkedText = Autolinker.link( input, {
    replaceFn : function( match ) {
        console.log("href = ", match.getAnchorHref());
        console.log("text = ", match.getAnchorText());

        const tag = match.buildTag();       // returns an `Autolinker.HtmlTag` instance for an <a> tag
        tag.setAttr('rel', 'nofollow');   // adds a 'rel' attribute
        tag.addClass('external-link');    // adds a CSS class
        tag.setInnerHtml('Click here!');  // sets the inner html for the anchor tag

        return tag;
    }
} );

The replaceFn is provided one argument:

  1. An Autolinker.match.Match object which details the match that is to be replaced.

A replacement of the match is made based on the return value of the function. The following return values may be provided:

  1. No return value (undefined), or true (boolean): Delegate back to Autolinker to replace the match as it normally would.
  2. false (boolean): Do not replace the current match at all - leave as-is.
  3. Any string: If a string is returned from the function, the string will be used directly as the replacement HTML for the match.
  4. An Autolinker.HtmlTag instance, which can be used to build/modify an HTML tag before writing out its HTML text.
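
A compact sketch of the first three return values listed above follows. exampleReplaceFn, the '@internal.example' domain, and the stand-in match objects are all hypothetical; only the type property and the getEmail() / getPhoneNumber() methods come from the examples earlier in this readme.

```javascript
// Hypothetical replaceFn exercising three of the documented return values.
function exampleReplaceFn(match) {
    if (match.type === 'email' && match.getEmail().endsWith('@internal.example')) {
        return false; // leave this match as plain text
    }
    if (match.type === 'phone') {
        // a returned string is used directly as the replacement HTML
        return '<span class="phone">' + match.getPhoneNumber() + '</span>';
    }
    // falling through returns undefined: Autolinker links the match as usual
}

// Stand-in match objects so the sketch runs without the library installed.
const internalEmail = { type: 'email', getEmail: () => 'ops@internal.example' };
const phone = { type: 'phone', getPhoneNumber: () => '5551234567' };
const url = { type: 'url', getUrl: () => 'http://google.com' };

console.log(exampleReplaceFn(internalEmail)); // false
console.log(exampleReplaceFn(phone));         // <span class="phone">5551234567</span>
console.log(exampleReplaceFn(url));           // undefined
```

In real use the function would be passed as an option, for example Autolinker.link(input, { replaceFn: exampleReplaceFn }).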

Full API Docs

The full API docs for Autolinker may be referenced at: http://gregjacobs.github.io/Autolinker.js/api/

Live Example

http://gregjacobs.github.io/Autolinker.js/examples/

Upgrading from v3.x -> v4.x (Breaking Changes)

  1. Internet Explorer support has been removed since its official demise in June 2022.
  2. The urls.wwwMatches config has been removed. A www. prefix is now treated like any other subdomain of a top level domain (TLD) match (such as 'subdomain.google.com').
  3. Match.getType() should be replaced with Match.type. This allows for TypeScript type narrowing of Match objects returned by the parse() method or inside the replaceFn.
  4. The Matcher classes have been removed in favor of a single finite state machine parser, greatly improving the performance of Autolinker but removing some of the customizability of the old regular expressions. This customizability will be addressed in a future release.
  5. Autolinker.AnchorTagBuilder, Autolinker.HtmlTag, and Autolinker.match.* references have been removed. These shouldn't be needed as public APIs, but please raise a GitHub issue if these are for some reason needed.

Upgrading from v2.x -> v3.x (Breaking Changes)

  1. If you are still on v1.x, first follow the instructions in the Upgrading from v1.x -> v2.x section below.
  2. The HtmlParser class has been removed in favor of an internal parseHtml() function which replaces the old regexp-based implementation with a state machine parser that is guaranteed to run in linear time. If you were using the HtmlParser class directly, I recommend switching to htmlparser2, which implements the HTML semantics better. The internal parseHtml() function that Autolinker now uses is fairly geared towards Autolinker's purposes, and may not be useful in a general HTML parsing sense.

Upgrading from v1.x -> v2.x (Breaking Changes)

  1. If you are still on v0.x, first follow the instructions in the Upgrading from v0.x -> v1.x section below.

  2. The codebase has been converted to TypeScript, and uses ES6 exports. You can now use the import statement to pull in the Autolinker class and related entities such as Match:

    // ES6/TypeScript/Webpack
    import Autolinker, { Match } from 'autolinker';

    The require() interface is still supported as well for Node.js:

    // Node.js
    const Autolinker = require('autolinker');

  3. You will no longer need the @types/autolinker package, as this package now exports its own types.

  4. You will no longer be able to override the regular expressions in the Matcher classes by assigning to the prototype (for instance, something like PhoneMatcher.prototype.regex = ...). This is due to how TypeScript creates properties for class instances in the constructor rather than on prototypes.

    The idea of providing your own regular expression for these classes is a brittle notion anyway, as the Matcher classes rely on capturing groups in the RegExp being in the right place, or even multiple capturing groups for the same piece of information to support a different format. These capturing groups and associated code are subject to change as the regular expression needs to be updated, and will not involve a major version release of Autolinker.

    In the future you will be able to override the default Matcher classes entirely to provide your own implementation, but please raise an issue (or +1 an issue) if you think the library should support a currently-unsupported format.

Upgrading from v0.x -> v1.x (Breaking Changes)

  1. twitter option removed, replaced with mention (which accepts 'twitter', 'instagram' and 'soundcloud' values)
  2. Matching mentions (previously the twitter option) now defaults to being turned off. Previously, Twitter handle matching was on by default.
  3. replaceFn option now called with just one argument: the Match object (previously was called with two arguments: autolinker and match)
  4. (Used inside the replaceFn) TwitterMatch replaced with MentionMatch, and MentionMatch.getType() now returns 'mention' instead of 'twitter'
  5. (Used inside the replaceFn) TwitterMatch.getTwitterHandle() -> MentionMatch.getMention()
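
For the change in the replaceFn signature (item 3), one possible stopgap while migrating is a small adapter that supplies the old first argument. This is a sketch under assumptions: adaptLegacyReplaceFn, legacyReplaceFn, and the stub objects are hypothetical, and the body of a real legacy callback would still need the renames from items 4 and 5.

```javascript
// Hypothetical helper: wraps a v0.x-style replaceFn(autolinker, match) so it
// can be supplied where a one-argument replaceFn(match) is expected.
function adaptLegacyReplaceFn(legacyReplaceFn, autolinker) {
    return function (match) {
        return legacyReplaceFn(autolinker, match);
    };
}

// Example callback written against the old two-argument signature.
// It leaves every email address unlinked and defers everything else.
function legacyReplaceFn(autolinker, match) {
    return match.getType() === 'email' ? false : true;
}

// Stubs so the sketch runs on its own; a real Autolinker instance and real
// Match objects would take their place in an application.
const stubAutolinker = {};
const replaceFn = adaptLegacyReplaceFn(legacyReplaceFn, stubAutolinker);

console.log(replaceFn({ getType: () => 'email' })); // false
console.log(replaceFn({ getType: () => 'url' }));   // true
```

Note that getType() is the accessor from the versions this section covers; as described in the v3.x -> v4.x notes, it is replaced by the type property in v4.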

Developing / Contributing

Pull requests are definitely welcome. To set up the project, make sure you have Node.js installed. Then open up a command prompt and type the following:

cd Autolinker.js     # where you cloned the project
npm install

To run the tests:

npm run test

Building the Project Fully

For this you will need Ruby installed (note: Ruby comes pre-installed on macOS), with the JSDuck gem.

See https://github.com/senchalabs/jsduck#getting-it for installation instructions on Windows/Mac/Linux

JSDuck is used to build the project's API/documentation site. See the Documentation Generator Notes section below for more info.

Running the Live Example Page Locally

Run:

yarn serve

Then open your browser to: http://localhost:8080/docs/examples/index.html

You should be able to make a change to source files, and refresh the page to see the changes.

Note: If anyone wants to submit a PR converting gulp watch to webpack with the live development server, that would be much appreciated :)

Documentation Generator Notes

This project uses JSDuck for its documentation generation, which produces the page at http://gregjacobs.github.io/Autolinker.js.

Unfortunately, JSDuck is a very old project that is no longer maintained. As such, it doesn't support TypeScript or anything from ES6 (the class keyword, arrow functions, etc). However, I have yet to find a better documentation generator that creates such a useful API site. (Suggestions for a new one are welcome though - please raise an issue.)

Since ES6 is not supported, we must generate the documentation from the ES5 output. As such, a few precautions must be taken to make sure the documentation comes out right:

  1. @cfg documentation tags must exist above a class property that has a default value, or else it won't end up in the ES5 output. For example:

    // Will correctly end up in the ES5 output

    /**
     * @cfg {String} title
     */
    readonly title: string = '';

    // Will *not* end up in ES5 output, and thus, won't end up in the generated
    // documentation

    /**
     * @cfg {String} title
     */
    readonly title: string;

  2. The @constructor tag must be replaced with @method constructor.

Changelog

See Releases