phHartl / eu-judgement-analyse

Quantitative analysis of judgments of the European Court of Justice
MIT License

Bug: CorpusAnalysis get_n_grams not working #44

Closed by thomfischer 3 years ago

thomfischer commented 3 years ago

get_n_grams does not work for CorpusAnalysis objects, although it does work for Analysis objects. Error:

line 276, in get_n_grams
    return list(textacy.extract.ngrams(self.corpus, n, filter_stops=filter_stop_words, filter_punct=True, filter_nums=filter_nums, min_freq=min_freq))
  File "textacy\extract.py", line 155, in ngrams
    ngrams_ = list(ngrams_)
  File "textacy\extract.py", line 141, in <genexpr>
    ngrams_ = (ngram for ngram in ngrams_ if not any(w.like_num for w in ngram))
  File "textacy\extract.py", line 139, in <genexpr>
    ngrams_ = (ngram for ngram in ngrams_ if not any(w.is_punct for w in ngram))
  File "textacy\extract.py", line 136, in <genexpr>
    ngram for ngram in ngrams_ if not ngram[0].is_stop and not ngram[-1].is_stop
  File "textacy\extract.py", line 133, in <genexpr>
    ngrams_ = (ngram for ngram in ngrams_ if not any(w.is_space for w in ngram))
  File "textacy\extract.py", line 133, in <genexpr>
AttributeError: 'spacy.tokens.doc.Doc' object has no attribute 'is_space'
phHartl commented 3 years ago

This is a known issue and has already been fixed on the Analysis branch.
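
For reference, the traceback suggests the likely cause: self.corpus is passed where textacy.extract.ngrams expects a single document, so iterating over it yields Doc objects rather than tokens, and the w.is_space check fails. The following is a minimal sketch of one possible workaround, not necessarily the fix applied on the Analysis branch. The helper name ngrams_over_corpus and the toy extractor are hypothetical and only stand in for the real objects; the idea is to call the per-document extractor once per document and apply min_freq across the merged result.

```python
from collections import Counter
from itertools import chain


def ngrams_over_corpus(docs, extract_ngrams, n, min_freq=1, **kwargs):
    """Run a per-document n-gram extractor on every document and merge the results.

    docs           -- iterable of documents (assumption: a textacy Corpus yields spaCy Docs)
    extract_ngrams -- callable(doc, n, **kwargs) returning an iterable of n-grams
    min_freq       -- minimum count across the whole corpus, compared case-insensitively
    """
    all_ngrams = list(
        chain.from_iterable(extract_ngrams(doc, n, **kwargs) for doc in docs)
    )
    if min_freq > 1:
        # Count n-grams by their lower-cased string form over all documents.
        counts = Counter(str(ng).lower() for ng in all_ngrams)
        all_ngrams = [ng for ng in all_ngrams if counts[str(ng).lower()] >= min_freq]
    return all_ngrams


# Hypothetical stand-ins so the sketch runs without spaCy or textacy installed.
def toy_ngrams(doc, n):
    return [tuple(doc[i:i + n]) for i in range(len(doc) - n + 1)]


toy_corpus = [["the", "court", "ruled"], ["the", "court", "held"]]
bigrams = ngrams_over_corpus(toy_corpus, toy_ngrams, 2)
frequent = ngrams_over_corpus(toy_corpus, toy_ngrams, 2, min_freq=2)
```

In the project itself, one would presumably pass textacy.extract.ngrams as extract_ngrams, together with filter_stops, filter_punct and filter_nums as keyword arguments, while leaving the extractor's own min_freq at its default so that frequency is counted over the corpus rather than per document.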