LeaVerou / brep

Write batch find & replace scripts that transform files with a simple human-readable syntax
13 stars 0 forks source link

brep (Batch REPlace) npm

For versions < 0.0.8 see bafr

Online playground

Ever written some complex find & replace operations in a text editor, sed or whatever, and wished you could save them somewhere and re-run them in the future? This is exactly what brep (Batch Replace) does!

You write a brep script (see syntax below), and then you apply it from the command-line like:

brep myscript.brep.toml src/**/*.html

This will apply the script myscript.brep.toml to all HTML files in the src folder and its subfolders. You don’t need to specify the file paths multiple times if they don’t change, you can include them in your script as defaults (and still override them if needed).

Contents

  1. Installation
  2. Syntax
  3. Syntax reference
  4. CLI
  5. JS API
  6. Future plans

Installation

You will need to have Node.js installed. Then, to install brep, run:

npm install -g brep

Syntax

There are three main syntaxes, each more appropriate for different use cases:

  1. TOML when your strings are multiline or have weird characters and you want their boundaries to be very explicit
  2. YAML when you want a more concise syntax for simple replacements
  3. JSON is also supported. It’s not recommended for writing by hand but can be convenient as the output from other tools.

Of all three, YAML is the most concise and human-readable, but can behave unpredictably or be confusing in edge cases (special symbols, multiline strings). TOML supports a very precise syntax, and multiline strings, but can look rather awkward. Lastly, JSON is very fragile and verbose, but has the best compatibility with other tools.

TOML YAML JSON
Readability ★★★☆☆ ★★★★☆ ★★☆☆☆
Conciseness ★★☆☆☆ ★★★★★ ★★☆☆☆
Robustness[^robustness] ★★★☆☆ ★★☆☆☆ ★☆☆☆☆
Compatibility ★☆☆☆☆ ★★☆☆☆ ★★★★★
Supports comments? 🚫
Multiline strings? ✅ but very complex 🚫[^crlf]

[^robustness]: Opposite of error-proneness. How hard is it to make mistakes? This includes both syntax errors or writing syntax that behaves differently than what you expect.

[^crlf]: This refers to support for strings that can spread across multiple lines in your brep script. You can always include line breaks by using \n to represent them.

The docs below will show TOML and YAML, and it’s up to you what you prefer.

Replacing text with different text

The most basic brep script is a single replacement consisting of a single static from declaration and a single to replacement.

As an example, here is how you can replace all instances of <br> with a line break character:

from = "<br>"
to = "\n"
from: <br>
to: "\n"
{
    "from": "<br>",
    "to": "\n"
}

Note that the YAML syntax allows you to not quote strings in many cases, which can be quite convenient but can also create errors if you’re not careful.

Multiline strings

This also works, and shows how you can do multiline strings:

from = "<br>"
to = """
"""
from: <br>
to: >+

I do not recommend using YAML for multiline strings but you can read more about the many different ways to do it here.

Regular expressions

Replacing fixed strings with other fixed strings is useful, but not very powerful. The real power of brep comes from its ability to use regular expressions. For example, here is how you’d strip all <blink> tags:

regexp = true
from = "<blink>([\S\s]+?)</blink>"
to = "<span class=blink>$1</span>" # $1 will match the content of the tag
regexp: true
from: <blink>([\S\s]+?)</blink>
to: <span class=blink>$1</span> # $1 will match the content of the tag
{
    "regexp": true,
    "from": "<blink>([\\S\\s]+?)</blink>",
    "to": "<span class=blink>$1</span>"
}

brep uses the JS dialect for regular expressions (cheatsheet) with the following flags:

Multiple find & replace operations

So far our script has only been specifying a single find & replace operation. That’s not very powerful. The real power of Brep is that a single script can specify multiple find & replace operations, executed in order, with each operating on the result of the previous one. We will refer to each of these as a replacement in the rest of the docs.

Multiple replacements in TOML

To specify multiple find & replace operations, you simply add [[ replace ]] sections:

[[ replace ]]
from = "<blink>"
to = "<span class=blink>"

[[ replace ]]
from = "</blink>"
to = "</span>"

You can also do it like this:

replace = [
    { from = "<blink>", to = "<span class=blink>" },
    { from = "</blink>", to = "</span>" },
]

Multiple replacements in YAML

To specify multiple declarations, you need to enclose them in { }:

replace:
- { from: <blink>, to: '<span class="blink">' }
- { from: </blink>, to: "</span>" }

Multiple replacements in JSON

"replace": [
    {"from": "<blink>", "to": "<span class=blink>"},
    {"from": "</blink>", "to": "</span>"}
]

Nested replacements

In some cases it’s more convenient to match a larger part of the text and then do more specific replacements inside just those matches. In a way, that is similar to a text editor’s "find in selection" feature, except on steroids.

# Match sequences of single-line JS comments
regexp: true
from: "(^//[^\n\r]*$)+"
to: "/* $& */" # Convert to block comments
replace:
# Strip comment character
- { regexp: true, from: "^//", to: "" }
- { regexp: true, from: "^/* //", to: "/*" }

If you specify a to, it will be applied before the child replacements.

Refer to the matched string

You can use $& to refer to the matched string (even when not in regexp mode). For example, to wrap every instance of "brep" with an <abbr> tag you can do:

from = "brep"
to = '<abbr title="Batch REPlace">$&</abbr>'
from: brep
to: '<abbr title="Batch REPlace">$&</abbr>'

Beyond $& there is a bunch of other special replacements, all starting with a dollar sign ($). To disable these special replacements, use literal = true / literal: true.

Append/prepend

While $& can be convenient, it’s also a little cryptic. To make it easier to append or prepend matches with a string, brep also supports before, after, and insert properties.

For example, this will insert "Bar" before every instance of "Foo":

before = "Foo"
insert = "Bar"
before: Foo
insert: Bar

after is also supported and works as you might expect.

[!NOTE] insert is literally just a an alias of to, it just reads nicer in these cases.

You can also combine these with from to add additional criteria. For example this script:

from = "brep"
after = "using"
to = "awesome brep"

Will convert "I am using brep" to "I am using awesome brep".

[from, to] shortcut syntax for many simple replacements

There are many cases where you want to make many replacements, all with the same settings (specified on their parent) and just different from/to values. Brep supports a shortcut for this.

Instead of declarations, you can specify from/to pairs directly by enclosing them in brackets, separated by a comma. This can be combined with regular replacements, though far more easily in YAML:

replace:
- [foo, bar]
- [baz, quux]
- {from: yolo, to: hello} # regular replacement

In TOML, it cannot be combined with regular [[ replace ]] blocks, so all replacements need to be specified in a different way:

replace = [
    ["foo", "bar"],
    ["baz", "quux"],
    # cannot be combined with [[ replace ]] blocks
    { from = "yolo", to = "hello", ignore_case = true },
]

Syntax reference

Replacements

Key Type Default Description
from String or array of strings (Mandatory) The string to search for. If an array of strings, they are considered alternatives.
to String (matched string) The string to replace the from string with.
before String or array of strings - Match only strings before this one. Will be interpreted as a regular expression in regexp mode.
after String or array of strings - Match only strings after this one. Will be interpreted as a regular expression in regexp mode.
regexp Boolean false Whether the from field should be treated as a regular expression.
ignore_case Boolean false Set to true to make the search should case-insensitive.
whole_word Boolean false Match only matches either beginning/ending in non-word characters or preceded/followed by non-word characters. Unicode aware.
recursive Boolean false Whether the replacement should be run recursively on its own output until it stops changing the output.
files String or array of strings - Partial paths to filter against. This is an additional filter over the files being processed, to apply specific replacements only to some of the files.

Global settings

Key Type Default Description
files String or array of strings - A glob pattern to match files to process.
suffix String "" Instead of overwriting the original file, append this suffix to its filename
extension String - Instead of overwriting the original file, change its extension to this value. Can start with a . but doesn’t need to.
path String - Allows the new file to be in a different directory. Both absolute and relative paths are supported. If relative, it's resolved based on the original file's location. For example, .. will write a file one directory level up.

CLI

To apply a brep script to the files specified in the script, simply run:

brep script.brep.toml

Where script.brep.toml is your brep script (and could be a .yaml or .json file).

To override the files specified in the script, specify them after the script file name, like so:

brep script.brep.toml src/*.md

The syntax (TOML, YAML, JSON) is inferred from the file extension. To override that (or to use an arbitrary file extension) you can use --format:

brep script.brep --format=toml

[!NOTE] You can name your script however you want, however ending in .brep.ext can more clearly communicate that this is a brep script.

Supported flags

Any root-level setting can also be specified as a flag, e.g. --suffix=-edited or --extension=txt.

JS API

You can access all of brep’s functionality via JS, and some of it even works client-side, in the browser!

Accessing the CLI via JS (Node.js only)

There is also the JS version of the CLI you can access as:

import brep from "brep/cli";
await brep("script.yaml");
// Do stuff after script runs

Replacer class (Browser-compatible)

This is the core of brep and takes care of applying the replacements on strings of text.

import { Replacer } from "brep/replacer";

or, in the browser:

import { Replacer } from "node_modules/brep/src/replacer.js";

Instance methods:

Brep class (Node.js-only)

This takes care of reading script files, parsing them, creating Replacer instances, and applying the brep script to files.

import Brep from "brep";

Instance methods:

Future plans

I/O

CLI