@debpaul pointed out https://voyant-tools.org as a tool that the humanities use to analyze texts.
According to https://voyant-tools.org/docs/#!/guide/about-section-software-libraries, the web tool uses the following (Java and JavaScript) libraries:
Apache PDFBox for reading PDF documents
Apache POI for reading Microsoft Office documents
Apache Commons Math, Collections, File Upload, IO, Compress
CyberNeko HTML Parser for reading (less than valid) HTML
JAMA: Java Matrix Package for principal component and correspondence analysis in ScatterPlot
MAchine Learning for LanguagE Toolkit (MALLET), especially for topic clustering
Oracle Berkeley DB Java Edition for data storage
Stanford Core Natural Language Processing, especially for named entity recognition in RezoViz
XStream, used to produce XML or JSON results
Google Closure Compiler to compress JavaScript files
jQuery, another JavaScript framework used by some tools
Sencha Ext JS, the main JavaScript framework used
As far as I understand, Preston can be used in combination with the tools mentioned to reliably access data of known provenance (that is, data and their lineage).
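
On the Preston point about data of known provenance: one way to picture "reliable access" is to check a document's bytes against a recorded content hash before passing them to a text-analysis tool. The sketch below is only an illustration under stated assumptions: it assumes content is identified by a SHA-256 hash written as `hash://sha256/<hex>`, the function names `content_id` and `verify` are hypothetical, and it does not call Preston or Voyant themselves.

```python
import hashlib

# Assumed identifier scheme: hash://sha256/<hex digest of the content>
HASH_PREFIX = "hash://sha256/"


def content_id(data: bytes) -> str:
    """Return a content identifier derived from the SHA-256 hash of the bytes."""
    return HASH_PREFIX + hashlib.sha256(data).hexdigest()


def verify(data: bytes, expected_id: str) -> bytes:
    """Return the data unchanged if it matches expected_id, else raise ValueError."""
    actual = content_id(data)
    if actual != expected_id:
        raise ValueError(f"content mismatch: expected {expected_id}, got {actual}")
    return data


# Hypothetical usage: the identifier would normally come from a provenance record,
# here it is computed locally just to keep the example self-contained.
text = b"Call me Ishmael."
recorded_id = content_id(text)
verified = verify(text, recorded_id)  # safe to hand to an analysis tool
```

If the bytes change in any way, `verify` raises instead of returning them, so an analysis only runs on exactly the content that the provenance record refers to.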