JasonKessler / scattertext

Beautiful visualizations of how language differs among document types.
Apache License 2.0

Calculating F-Score not working #51

Closed: hdnl closed this issue 4 years ago

hdnl commented 4 years ago

I was following the README tutorial and had reached this step:

print(list(corpus.get_scaled_f_scores_vs_background().index[:10]))

when I received an error:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-262-2fb682de04c3> in <module>
----> 1 print(list(corpus.get_scaled_f_scores_vs_background().index[:10]))

/opt/anaconda3/envs/altlite/lib/python3.8/site-packages/scattertext/TermDocMatrixWithoutCategories.py in get_scaled_f_scores_vs_background(self, scaler_algo, beta)
    403         '''
    404         df = self.get_term_and_background_counts()
--> 405         df['Scaled f-score'] = ScaledFScore.get_scores_for_category(
    406             df['corpus'], df['background'], scaler_algo, beta
    407         )

/opt/anaconda3/envs/altlite/lib/python3.8/site-packages/scattertext/TermDocMatrixWithoutCategories.py in get_term_and_background_counts(self)
    144         corpus_unigram_freq = self._get_corpus_unigram_freq(corpus_freq_df)
    145         df = corpus_unigram_freq.join(background_df, how='outer').fillna(0)
--> 146         return df
    147 
    148     def get_term_count_df(self):

AttributeError: can't delete attribute

I generated corpus using the steps in the README tutorial. Most of Scattertext's functionality is unusable for me, which is odd: it worked back in December, but now it seems to break whenever F-scores are involved.


JasonKessler commented 4 years ago

Thanks for the bug report. Could you please include the exact code you used to generate corpus?

JasonKessler commented 4 years ago

Crap. This looks like it’s a pandas 1.0 issue. I’ll try to figure these out tonight and get a new version out.

JasonKessler commented 4 years ago

The latest version of Scattertext should resolve this.
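
For anyone hitting the same traceback, a quick way to check whether this applies is to look at the installed pandas and Scattertext versions. The snippet below is only a minimal sketch using the standard library; the helper names installed_version and is_pandas_1_or_later are made up for illustration and are not part of Scattertext or pandas.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def is_pandas_1_or_later(version_string):
    """Return True when the major component of the version is at least 1."""
    major = int(version_string.split(".")[0])
    return major >= 1


if __name__ == "__main__":
    # Report what is installed in the current environment.
    for package in ("pandas", "scattertext"):
        print(package, installed_version(package) or "not installed")

    pandas_version = installed_version("pandas")
    if pandas_version and is_pandas_1_or_later(pandas_version):
        # Assumption from this thread: older Scattertext releases break on pandas 1.0.
        print("pandas 1.0 or later detected; upgrading Scattertext may be needed.")
```

If pandas reports 1.0 or later and the error persists, upgrading Scattertext to the latest release is probably the first thing to try.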