return_data parameter for produce_scattertext_explorer

Thanks for a great tool!

I noticed some unexpected output that I wanted to bring to your attention. Following along with your first example, add the return_data=True parameter to produce_scattertext_explorer:

scatter_text_dict = st.produce_scattertext_explorer(
    corpus,
    category='democrat',
    category_name='Democratic',
    not_category_name='Republican',
    metadata=convention_df['speaker'],
    return_data=True
)

scatter_text_dict is created with keys ['info','data','docs']

When examining scatter_text_dict['info'] the value is:

{'categories': ['democrat', 'republican'],
 'category_internal_name': 'democrat',
 'category_name': 'Democratic',
 'category_terms': ['government',
  'business',
  'better',
  'story',
  'paul',
  'success',
  'administration',
  'unemployment',
  'we need',
  'do better'],
 'extra_category_internal_names': [],
 'extra_category_name': 'Extra',
 'neutral_category_internal_names': [],
 'neutral_category_name': 'Neutral',
 'not_category_internal_names': ['republican'],
 'not_category_name': 'Republican',
 'not_category_terms': ['government',
  'business',
  'better',
  'story',
  'paul',
  'success',
  'administration',
  'unemployment',
  'we need',
  'do better']}

You can see that the category_terms and not_category_terms values are identical. Both lists contain the top terms for not_category_terms.
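
A quick way to confirm the duplication, as a minimal sketch: the dict below copies only the two term lists from the output above (the other keys are omitted), and term_lists_are_identical is a hypothetical helper written for this report, not part of scattertext.

```python
# Term lists copied from the scatter_text_dict['info'] output shown above.
# Other keys of the 'info' dict are omitted for brevity.
info = {
    'category_terms': ['government', 'business', 'better', 'story', 'paul',
                       'success', 'administration', 'unemployment',
                       'we need', 'do better'],
    'not_category_terms': ['government', 'business', 'better', 'story', 'paul',
                           'success', 'administration', 'unemployment',
                           'we need', 'do better'],
}


def term_lists_are_identical(info):
    """Hypothetical helper: True when both top-term lists match exactly."""
    return info['category_terms'] == info['not_category_terms']


# Terms that appear in both lists; for two distinct categories one would
# expect little or no overlap here.
overlap = set(info['category_terms']) & set(info['not_category_terms'])

print(term_lists_are_identical(info))
print(len(overlap), 'of', len(info['category_terms']), 'terms are shared')
```

With real output, the same check can be run by passing scatter_text_dict['info'] instead of the copied dict.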

The same behavior appears in other examples (e.g., empath_features).

Your Environment

Operating System: Ubuntu 18.04.3 LTS
Python Version Used: 3.6.9
Scattertext Version Used: 0.0.2.65
Environment Information: Google Colab
