So: corpus, Ubuntu, question comparison: vectorize (TF-IDF) the first message in each thread, then find the one most similar to the incoming question, and pull entities out of it: ln? apropos?
Maybe man pages could also serve as initial vectors to compare against; why not, they might have more relevant keywords...
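A rough sketch of the matching step, assuming nothing beyond the Python standard library: build TF-IDF vectors for the first message of each thread, vectorize the new question with the same weights, and pick the thread with the highest cosine similarity. The sample messages, the tokenizer, and the smoothed IDF formula are all assumptions made for illustration; man page text could be added to the same list of documents.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase and keep word-like tokens (letters, digits, "_" and "-").
    return re.findall(r"[a-z0-9_\-]+", text.lower())


def tfidf_vectors(docs):
    # Returns one sparse {term: weight} vector per document, plus the IDF table.
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    df = Counter(term for toks in tokenized for term in set(toks))
    # Smoothed IDF (an assumption; other variants would work too).
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vectors.append({t: count * idf[t] for t, count in tf.items()})
    return vectors, idf


def vectorize(text, idf):
    # Terms never seen in the corpus are dropped.
    tf = Counter(tokenize(text))
    return {t: count * idf[t] for t, count in tf.items() if t in idf}


def cosine(a, b):
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar_thread(question, first_messages):
    # Index and score of the first message closest to the question.
    vectors, idf = tfidf_vectors(first_messages)
    q = vectorize(question, idf)
    scores = [cosine(q, v) for v in vectors]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]


# Hypothetical first messages, standing in for real forum threads.
FIRST_MESSAGES = [
    "how do I create a symbolic link to a folder with ln",
    "wifi driver not working after upgrade",
    "how can I search man pages for a command, is apropos the tool",
]

if __name__ == "__main__":
    question = "make a symbolic link between two directories"
    print(most_similar_thread(question, FIRST_MESSAGES))
```

The matched thread is then the place to look for entities such as ln or apropos.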
So the Ubuntu forums dataset is out there, waiting to be vectorized.
https://radimrehurek.com/gensim/models/keyedvectors.html
Keyed vectors and cosine "most similar" lookup.
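To make the idea concrete, here is a plain-Python stand-in for a keyed-vector "most similar" lookup: a dict maps each key to a vector, and neighbours are ranked by cosine similarity. This is not gensim code; gensim's KeyedVectors offers a most_similar method for the same purpose. The three-dimensional vectors below are invented toy values, not trained embeddings.

```python
import math


def cosine(u, v):
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def most_similar(key, keyed_vectors, topn=2):
    # Rank every other key by cosine similarity to the vector stored under `key`.
    target = keyed_vectors[key]
    scored = [
        (other, cosine(target, vec))
        for other, vec in keyed_vectors.items()
        if other != key
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:topn]


# Toy vectors, made up for illustration only.
TOY_VECTORS = {
    "ln": [0.9, 0.1, 0.0],
    "symlink": [0.8, 0.2, 0.1],
    "apropos": [0.1, 0.9, 0.2],
    "man": [0.0, 0.8, 0.3],
}

if __name__ == "__main__":
    for name, score in most_similar("ln", TOY_VECTORS):
        print(name, round(score, 3))
```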
Entities could come from a program dictionary: ls /bin/
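A small sketch of the program-dictionary idea, assuming the dictionary is just the set of file names in /bin and that an entity is any token in a message that matches one of them. The function names and the tokenizing regex are hypothetical choices for illustration.

```python
import os
import re


def program_dictionary(directory="/bin"):
    # File names in the directory, roughly what `ls /bin/` prints.
    try:
        return set(os.listdir(directory))
    except OSError:
        # Directory missing or unreadable: fall back to an empty dictionary.
        return set()


def extract_programs(message, programs):
    # Tokens may contain internal dots (e.g. "python3.11") but not trailing ones.
    tokens = re.findall(r"[A-Za-z0-9_+\-]+(?:\.[A-Za-z0-9_+\-]+)*", message)
    found = []
    for tok in tokens:
        if tok in programs and tok not in found:
            found.append(tok)
    return found


if __name__ == "__main__":
    programs = program_dictionary()
    print(extract_programs("try ln -s target newname, or apropos symlink", programs))
```

Expect some noise: a few program names, such as link or test, are also ordinary English words, so matches may need filtering.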