Crawls a Multi-Page Application (MPA) into a zip file and serves it from that zip file. An MPA archiver. Could be used as a site generator.
npm install -g mpa-archive
mpa http://example.net
Will crawl the URL recursively and save it in example.net.zip. Once done, it will display a report and can serve the files from the zip.
The original idea is to save the HTML generated by JavaScript, so that search engines can index the content of a website that relies on JavaScript. This has the undesired side effect that some applications, especially SPAs, may not work. To save the original HTML instead of the rendered HTML, use the --spa option, which saves the original HTML and avoids rewriting links.
mpa https://example.net --spa
mpa
Will create a server for each zip file in the current directory. Each server binds to 0.0.0.0 (so it can be opened on localhost). The port is random but seeded with the zip file name, so it stays the same across runs.
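As an illustration of what "random but seeded with the zip file name" can mean, here is a minimal sketch that hashes a file name into a stable port number. The function names `fnv1a` and `portForZip`, the FNV-1a hash, and the port range are all assumptions made for this example; the README does not say how mpa-archive actually derives the port.

```javascript
// Hypothetical sketch, not mpa-archive's actual code.

// 32-bit FNV-1a hash over the UTF-16 code units of a string.
function fnv1a(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    // Multiply by the FNV prime and keep the result as an unsigned 32-bit integer.
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash >>> 0;
}

// Map a zip file name to a port in [min, max]. The range is an assumption.
function portForZip(zipName, min = 1024, max = 65535) {
  const span = max - min + 1;
  return min + (fnv1a(zipName) % span);
}

// The same name always yields the same port.
console.log(portForZip("example.net.zip") === portForZip("example.net.zip"));
```

Because the port depends only on the file name, bookmarks to a served archive keep working between runs. Two different names can still collide on one port, so a real server would need a fallback for that case.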
Crawls http://example.net with cpu count / 2 threads
Uses urls.txt, sitemap.txt and sitemap.xml as a seed point
fetch external resources
mpa/sitemap.txt and mpa/sitemap.xml
fetch requests

url has been opened in a tab for crawling
a link on page has been focused (for when js modules are preloaded/loaded on focus)
document.documentElement.outerHTML has been saved
fetch request has been saved, or the response body of a request made by the tab has been saved
fetch request has been fired
fetch request)
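
The seed files mentioned above (urls.txt, sitemap.txt and sitemap.xml) could be turned into an initial crawl queue roughly as follows. This is a sketch under assumptions: the helper names are hypothetical, urls.txt is assumed to hold one URL per line like sitemap.txt, and the same-origin filter is a guess at sensible behavior rather than something this README states.

```javascript
// Hypothetical helpers; mpa-archive's real parsing code is not shown in this README.

// Text seed files (assumed format): one URL per line, blank lines ignored.
function urlsFromText(text) {
  return text
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
}

// sitemap.xml: URLs are the contents of <loc>...</loc> elements.
// A regex is enough for a sketch; it does not decode XML entities such as &amp;.
function urlsFromSitemapXml(xml) {
  return [...xml.matchAll(/<loc>\s*([^<]+?)\s*<\/loc>/g)].map((m) => m[1]);
}

// Resolve candidates against the crawl root, keep same-origin ones, de-duplicate.
function seedUrls(root, candidates) {
  const origin = new URL(root).origin;
  const seeds = new Set();
  for (const candidate of candidates) {
    try {
      const url = new URL(candidate, root);
      if (url.origin === origin) seeds.add(url.href);
    } catch {
      // Skip entries that cannot be parsed as URLs.
    }
  }
  return [...seeds];
}

const xml = "<urlset><url><loc>http://example.net/a</loc></url></urlset>";
const txt = "http://example.net/b\n\nhttp://other.org/x\n";
console.log(seedUrls("http://example.net", [
  ...urlsFromSitemapXml(xml),
  ...urlsFromText(txt),
]));
```

Using a Set keeps the queue free of duplicates when the same page is listed in more than one seed file.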