LuminosoInsight / sales-engineering-code

Code for sales engineering, particularly code that will be given to customers
MIT License

A script for getting a Luminoso analysis of lots of documents #41

Closed rspeer closed 7 years ago

rspeer commented 7 years ago

I've needed this script many times and always worked around its failure to exist.

This script gets the Luminoso-analyzed version of a .jsons file of documents. Analyzed according to which project, you ask? Well, the project whose name you specify. The key is that if that project doesn't exist, the script creates it from a sample of the documents you're analyzing.

The fact that it's a sample is important. The procedure isn't just to upload all the docs to a new project, then download all the docs from that project. We've seen that that leads to sadness when the documents are, say, 7 million Amazon reviews. And you don't need to show Luminoso all 7 million documents at once to get good enough terms and vectors.
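To make the sampling idea concrete, here is a minimal sketch of drawing a fixed-size uniform sample from a large .jsons file in one pass, without holding every document in memory. It is an illustration only, not the script in this PR: the function name `sample_json_lines`, its parameters, and the choice of reservoir sampling are all assumptions, and it deliberately makes no Luminoso API calls.

```python
import json
import random


def sample_json_lines(lines, sample_size, seed=None):
    """Return up to `sample_size` parsed documents, drawn uniformly at
    random from an iterable of JSON lines (one document per line).

    Hypothetical helper: uses reservoir sampling (Algorithm R), so the
    input is read once and only `sample_size` documents are kept.
    """
    rng = random.Random(seed)
    reservoir = []
    seen = 0  # number of non-blank lines read so far
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the file
        seen += 1
        if len(reservoir) < sample_size:
            # Fill the reservoir with the first `sample_size` documents.
            reservoir.append(json.loads(line))
        else:
            # Replace a random slot with probability sample_size / seen.
            slot = rng.randrange(seen)
            if slot < sample_size:
                reservoir[slot] = json.loads(line)
    return reservoir
```

Because the function accepts any iterable of lines, an open file handle can be passed in directly, for example `sample_json_lines(open_file, 50000)`, where the sample size is a made-up number. The resulting sample is what would be uploaded to build the project; the full document set would then be analyzed against that project in batches.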