R integration with Screaming Frog CLI
Screaming Frog SEO Spider is a website crawler for Windows, macOS and Ubuntu, designed for technical SEO audits. It can be used for free (for websites of up to 500 URLs) or without that limit after purchasing a license.
Version 10.0 introduced a command-line interface (CLI) that enables programmatic crawling and scheduling. This package is an R wrapper for that CLI.
This package requires version 10.0+ of Screaming Frog SEO Spider.
Screaming Frog SEO Spider can be downloaded here via the 'Download' button. If you are installing it on a server without a GUI, remember to accept the EULA.
Please read the official documentation: installation on Windows
Please read the official documentation: installation on macOS
Please read the official documentation: installation on Ubuntu
For more information about the CLI, please read: link
Create a file in your .ScreamingFrogSEOSpider directory called licence.txt. Enter (copy and paste to avoid typos) your license username on the first line and license key on the second line and save the file.
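The same file can also be written from R. This is a minimal sketch, not part of screamingFrogR: `write_licence_file()` is a hypothetical helper, the credentials are placeholders, and the example writes to a temporary directory because the real location of `.ScreamingFrogSEOSpider` depends on your system.

```r
# Hypothetical helper (not part of screamingFrogR): writes licence.txt with
# the license username on line 1 and the license key on line 2.
# `config_dir` should point at your .ScreamingFrogSEOSpider directory;
# its real location is system-dependent and is an assumption here.
write_licence_file <- function(config_dir, username, key) {
  # Create the directory if it does not exist yet
  dir.create(config_dir, recursive = TRUE, showWarnings = FALSE)
  licence_path <- file.path(config_dir, "licence.txt")
  # trimws() guards against stray whitespace picked up by copy and paste
  writeLines(c(trimws(username), trimws(key)), licence_path)
  invisible(licence_path)
}

# Example call with placeholder credentials, written to a temporary directory
demo_dir <- file.path(tempdir(), ".ScreamingFrogSEOSpider")
licence_path <- write_licence_file(demo_dir, " your-username ", "YOUR-LICENSE-KEY")
readLines(licence_path)
```

To use it for real, replace `demo_dir` with the path to your own `.ScreamingFrogSEOSpider` directory and pass your actual credentials.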
Create or edit the file spider.config in your .ScreamingFrogSEOSpider directory. Locate and edit or add the following line:
eula.accepted=8
Save the file and exit.
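The "locate and edit or add" step can be sketched in R as follows. `set_eula_accepted()` is a hypothetical helper, not part of screamingFrogR, and `some.other.setting` is a made-up placeholder used only to show that unrelated lines are left alone. The default value of 8 is taken from the line above; the example runs on a temporary file so that no real configuration is touched.

```r
# Hypothetical helper (not part of screamingFrogR): makes sure spider.config
# contains an eula.accepted line, replacing an existing one or appending it.
set_eula_accepted <- function(config_path, value = 8) {
  # Read the existing settings, or start from an empty file
  lines <- if (file.exists(config_path)) readLines(config_path) else character(0)
  new_line <- paste0("eula.accepted=", value)
  is_eula <- grepl("^eula\\.accepted=", lines)
  if (any(is_eula)) {
    lines[is_eula] <- new_line   # edit the existing line in place
  } else {
    lines <- c(lines, new_line)  # add the line when it is missing
  }
  writeLines(lines, config_path)
  invisible(lines)
}

# Example on a temporary file; "some.other.setting" is a made-up placeholder
demo_config <- tempfile(fileext = ".config")
writeLines(c("some.other.setting=true", "eula.accepted=0"), demo_config)
set_eula_accepted(demo_config)
readLines(demo_config)
```

Pointing `config_path` at the `spider.config` file in your own `.ScreamingFrogSEOSpider` directory would apply the same change there.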
install.packages("devtools")
library(devtools)
devtools::install_github("Leszek-Sieminski/screamingFrogR")
library("screamingFrogR")
Use the sfr_setup_windows() function to set up Screaming Frog SEO Spider. You must provide the path to the installation directory, and that directory MUST contain the 'ScreamingFrogSEOSpiderCli.exe' file, otherwise the setup will not work:
sfr_setup_windows(path = "C:/Program Files/Screaming Frog SEO Spider/")
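Before calling the setup function, it may help to confirm that the executable is really in the directory. The check below is a minimal sketch using base R only; `has_sf_cli()` is a hypothetical helper, not part of screamingFrogR, and the installation path is the one from the example above.

```r
# Hypothetical pre-flight check (not part of screamingFrogR): returns TRUE
# when the given directory contains ScreamingFrogSEOSpiderCli.exe.
has_sf_cli <- function(path) {
  # Drop trailing slashes or backslashes so the joined path stays clean
  path <- sub("[/\\\\]+$", "", path)
  file.exists(file.path(path, "ScreamingFrogSEOSpiderCli.exe"))
}

install_dir <- "C:/Program Files/Screaming Frog SEO Spider/"
if (has_sf_cli(install_dir)) {
  message("CLI executable found; this path can be passed to sfr_setup_windows().")
} else {
  message("ScreamingFrogSEOSpiderCli.exe not found in: ", install_dir)
}
```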
# installation ----------------------------------------------------------------
install.packages("devtools")
devtools::install_github("Leszek-Sieminski/screamingFrogR")
library("screamingFrogR")
# setup (Windows only) --------------------------------------------------------
screamingFrogR::sfr_setup_windows(path = "C:/Program Files/Screaming Frog SEO Spider/")
# running a crawl -------------------------------------------------------------
screamingFrogR::sfr_crawl(
  url = "https://julialang.org/learning/",
  export_tabs = c("Internal:All", "External:All"),
  timestamped_output = TRUE
)