For a given aspect and GICS industry sector, 1) take each company's sentence embeddings z_s from the ABAE step, and 2) average them to obtain one "company embedding" per company
Calculate pairwise cosine similarities between all company embeddings
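The averaging and similarity steps can be sketched as follows. This is a minimal illustration with NumPy, not the actual pipeline: the function names, the company labels, and the two-dimensional toy vectors are all hypothetical, and real z_s vectors would come from the trained ABAE model.

```python
import numpy as np


def company_embedding(sentence_embeddings):
    """Average a company's sentence embeddings z_s (n_sentences x dim) into one vector."""
    z = np.asarray(sentence_embeddings, dtype=float)
    return z.mean(axis=0)


def cosine_similarity_matrix(embeddings):
    """Pairwise cosine similarity between the rows of `embeddings`."""
    e = np.asarray(embeddings, dtype=float)
    norms = np.linalg.norm(e, axis=1, keepdims=True)
    unit = e / np.clip(norms, 1e-12, None)  # guard against zero-length vectors
    return unit @ unit.T


# Hypothetical toy data: sentence embeddings per company for one aspect and sector.
z_by_company = {
    "A": [[1.0, 0.0], [1.0, 0.0]],
    "B": [[0.0, 1.0], [0.0, 3.0]],
    "C": [[1.0, 1.0]],
}

names = sorted(z_by_company)
company_vecs = np.stack([company_embedding(z_by_company[n]) for n in names])
sim = cosine_similarity_matrix(company_vecs)  # sim[i, j] compares names[i] with names[j]
```

Normalizing each row to unit length first means the full similarity matrix is a single matrix product, which stays cheap even for a few hundred companies in a sector.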
To understand which words contributed the most to each company embedding, 1) build a vocabulary distribution from the attention weights, 2) build a vocabulary distribution from word frequencies, and 3) normalize both and add them up (needs validation)
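One possible reading of the word-contribution step is sketched below. It assumes that "normalize" means scaling each distribution to sum to 1 before adding, and that attention weights arrive as one list per sentence aligned with that sentence's tokens; both are assumptions, as are the function names and the toy sentences. Like the step itself, this scoring has not been validated.

```python
from collections import Counter, defaultdict


def normalize(scores):
    """Scale non-negative scores so they sum to 1 (all zeros if the total is 0)."""
    total = sum(scores.values())
    if total == 0:
        return {w: 0.0 for w in scores}
    return {w: v / total for w, v in scores.items()}


def word_contributions(sentences, attention):
    """Combine attention mass and raw frequency into one score per word.

    sentences: list of token lists for one company.
    attention: list of weight lists, aligned token by token with `sentences`.
    """
    attn_mass = defaultdict(float)
    freq = Counter()
    for tokens, weights in zip(sentences, attention):
        if len(tokens) != len(weights):
            raise ValueError("each sentence needs one attention weight per token")
        for tok, w in zip(tokens, weights):
            attn_mass[tok] += w  # 1) attention-based distribution (unnormalized)
            freq[tok] += 1       # 2) frequency-based distribution (unnormalized)
    attn_dist = normalize(attn_mass)
    freq_dist = normalize(freq)
    # 3) add the two normalized distributions (assumed equal weighting)
    return {w: attn_dist[w] + freq_dist[w] for w in freq}


# Hypothetical toy input: two tokenized sentences with per-token attention weights.
sentences = [["cloud", "revenue", "grew"], ["cloud", "costs"]]
attention = [[0.6, 0.3, 0.1], [0.5, 0.5]]

scores = word_contributions(sentences, attention)
top_words = sorted(scores, key=scores.get, reverse=True)
```

Because each distribution sums to 1, the combined scores sum to 2; dividing by 2 would turn them back into a distribution if one is needed for comparison across companies.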