pul-kit123 / spacy

[CLOSED] TypeError: unsupported operand type(s) for * #2

pul-kit123 closed this issue 8 months ago

pul-kit123 commented 8 months ago

Issue by jasonmhead Wednesday Feb 11, 2015 at 21:25 GMT Originally opened as https://github.com/explosion/spaCy/issues/25


I'm getting

TypeError: unsupported operand type(s) for *: 'spacy.lexeme.Lexeme' and 'spacy.tokens.Token'

I'm running the example code in an IPython notebook. I suspect it has something to do with a multiplication by 0/null (?) from the all-zero vector array, which should have values:

In [6]:
pleaded.repvec[:5]

Out[6]:
array([ 0.,  0.,  0.,  0.,  0.], dtype=float32)

Full code and errors:

In [1]:
import spacy.en
from spacy.parts_of_speech import ADV
nlp = spacy.en.English()
In [2]:
# Load the pipeline, and call it with some text.

s = "'Give it back,' he pleaded abjectly, 'it’s mine.'"
s1 = s.decode('utf-8')

probs = [lex.prob for lex in nlp.vocab]
probs.sort()
is_adverb = lambda tok: tok.pos == ADV and tok.prob < probs[-1000]
tokens = nlp(s1)
print(''.join(tok.string.upper() if is_adverb(tok) else tok.string for tok in tokens))
'Give it back,' he pleaded ABJECTLY, 'it’s mine.'

In [3]:
b = 'back'
s2 = b.decode('utf-8')
nlp.vocab[s2].prob
Out[3]:
-7.403977394104004
In [4]:
pleaded = tokens[8]
In [5]:
pleaded.repvec.shape
Out[5]:
(300,)
In [6]:
pleaded.repvec[:5]
Out[6]:
array([ 0.,  0.,  0.,  0.,  0.], dtype=float32)
In [8]:
from numpy import dot
from numpy.linalg import norm
cosine = lambda v1, v2: dot(v1, v2) / (norm(v1), norm(v2))
words = [w for w in nlp.vocab if w.lower]
words.sort(key=lambda w: cosine(w, pleaded))
words.reverse()

#print('1-20', ', '.join(w.orth_ for w in words[0:20]))
#print('50-60', ', '.join(w.orth_ for w in words[50:60]))
#print('100-110', ', '.join(w.orth_ for w in words[100:110]))
#print('1000-1010', ', '.join(w.orth_ for w in words[1000:1010]))
#print('50000-50010', ', '.join(w.orth_ for w in words[50000:50010]))
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-8-3dfcfec488f6> in <module>()
      3 cosine = lambda v1, v2: dot(v1, v2) / (norm(v1), norm(v2))
      4 words = [w for w in nlp.vocab if w.lower]
----> 5 words.sort(key=lambda w: cosine(w, pleaded))
      6 words.reverse()
      7 

<ipython-input-8-3dfcfec488f6> in <lambda>(w)
      3 cosine = lambda v1, v2: dot(v1, v2) / (norm(v1), norm(v2))
      4 words = [w for w in nlp.vocab if w.lower]
----> 5 words.sort(key=lambda w: cosine(w, pleaded))
      6 words.reverse()
      7 

<ipython-input-8-3dfcfec488f6> in <lambda>(v1, v2)
      1 from numpy import dot
      2 from numpy.linalg import norm
----> 3 cosine = lambda v1, v2: dot(v1, v2) / (norm(v1), norm(v2))
      4 words = [w for w in nlp.vocab if w.lower]
      5 words.sort(key=lambda w: cosine(w, pleaded))

TypeError: unsupported operand type(s) for *: 'spacy.lexeme.Lexeme' and 'spacy.tokens.Token'

pul-kit123 commented 8 months ago

Comment by honnibal Wednesday Feb 11, 2015 at 23:12 GMT


This was a bug in the docs code. In the cosine function,

(norm(v1), norm(v2))

Should be:

(norm(v1) * norm(v2))

With the comma, the denominator is a tuple of the two norms, so the division is broadcast over it and the function returns a two-element array instead of a float.

I've updated the docs.
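
The difference between the two denominators can be checked in isolation. This is a minimal sketch that assumes only NumPy; the vectors `v1` and `v2` are made-up values for illustration and are not taken from spaCy.

```python
import numpy as np
from numpy.linalg import norm

# Made-up vectors for illustration (not spaCy data).
v1 = np.array([1.0, 2.0, 2.0], dtype=np.float32)
v2 = np.array([2.0, 0.0, 0.0], dtype=np.float32)

# Buggy form: the comma builds a 2-tuple, and NumPy broadcasts the
# division across it, so the result is an array with two entries.
buggy = np.dot(v1, v2) / (norm(v1), norm(v2))

# Fixed form: the product of the two norms is a single scalar,
# so the result is a single cosine value.
fixed = np.dot(v1, v2) / (norm(v1) * norm(v2))

print(buggy.shape)
print(float(fixed))
```

Note that this only addresses the return value of cosine; it does not change what arguments the function is called with.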

pul-kit123 commented 8 months ago

Comment by jasonmhead Thursday Feb 12, 2015 at 00:21 GMT


Hm, it still doesn't like my code:

----> 3 cosine = lambda v1, v2: dot(v1, v2) / (norm(v1) * norm(v2))
      4 words = [w for w in nlp.vocab if w.lower]
      5 words.sort(key=lambda w: cosine(w, pleaded))

TypeError: unsupported operand type(s) for *: 'spacy.lexeme.Lexeme' and 'spacy.tokens.Token'

And, perhaps as a separate issue, it would be good to figure out why I have an all-zero vector array farther up:

In [4]:
pleaded = tokens[8]
In [5]:
pleaded.repvec.shape
Out[5]:
(300,)
In [6]:
pleaded.repvec[:5]
Out[6]:
array([ 0.,  0.,  0.,  0.,  0.], dtype=float32)

pul-kit123 commented 8 months ago

Comment by honnibal Thursday Feb 12, 2015 at 01:02 GMT


Try changing

words = [w for w in nlp.vocab if w.lower]

To

words = [w for w in nlp.vocab if w.has_repvec]

pul-kit123 commented 8 months ago

Comment by jasonmhead Thursday Feb 12, 2015 at 01:52 GMT


Changed and re-ran, but the error is thrown before that point:

<ipython-input-13-61632dc9dcea> in <lambda>(v1, v2)
      1 from numpy import dot
      2 from numpy.linalg import norm
----> 3 cosine = lambda v1, v2: dot(v1, v2) / (norm(v1) * norm(v2))
      4 words = [w for w in nlp.vocab if w.has_repvec]
      5 words.sort(key=lambda w: cosine(w, pleaded))

TypeError: unsupported operand type(s) for *: 'spacy.lexeme.Lexeme' and 'spacy.tokens.Token'

Environment details, if it helps: Mac OS X 10.9.5, Python 2.7.6 :: Anaconda 2.0.0 (x86_64)

pul-kit123 commented 8 months ago

Comment by honnibal Thursday Feb 12, 2015 at 02:03 GMT


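Well, the error is thrown when cosine is invoked, i.e., during the sort (cosine is called from the sort key function). The key passes the Lexeme and Token objects themselves, so dot ends up trying to multiply the two objects. Pass their vectors instead: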

words.sort(key=lambda w: cosine(w.repvec, pleaded.repvec))
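
The failure and the fix can be reproduced without spaCy. The sketch below is an illustration under stated assumptions: `FakeWord` is a hypothetical stand-in that models only the `orth_` and `repvec` attributes, the vectors are invented, and the `repvec.any()` filter is an assumed substitute for the `has_repvec` check suggested earlier in the thread.

```python
import numpy as np
from numpy.linalg import norm


class FakeWord:
    """Hypothetical stand-in for a spaCy Lexeme or Token (assumption)."""

    def __init__(self, orth, repvec):
        self.orth_ = orth
        self.repvec = np.asarray(repvec, dtype=np.float32)


def cosine(v1, v2):
    # Cosine similarity; expects two vectors, not word objects.
    return np.dot(v1, v2) / (norm(v1) * norm(v2))


pleaded = FakeWord("pleaded", [1.0, 0.0, 1.0])
vocab = [
    FakeWord("table", [0.0, 1.0, 0.0]),
    FakeWord("begged", [0.9, 0.1, 0.8]),
    FakeWord("zzz", [0.0, 0.0, 0.0]),  # all-zero vector, filtered out below
]

# Passing the objects themselves: np.dot treats each one as a scalar
# and tries to multiply them, which raises the TypeError about `*`.
try:
    cosine(vocab[0], pleaded)
    error = None
except TypeError as exc:
    error = str(exc)
print(error)

# Passing the vectors instead, after skipping all-zero vectors.
words = [w for w in vocab if w.repvec.any()]
words.sort(key=lambda w: cosine(w.repvec, pleaded.repvec), reverse=True)
print([w.orth_ for w in words])
```

One caveat: if the query vector itself is all zeros, as `pleaded.repvec[:5]` suggests in the session above, its norm is zero and the cosine is undefined, so every key would come out as `nan` and the ordering would be meaningless.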

(Btw: the larger point is, why have I had so many bugs in the documentation code --- why can't I just use doctest, so that the code is run automatically, during the build?

The problem is that I can't make doctest work with unicode in the strings, and I want to keep the smart-quotes in the example, to illustrate that spaCy works with unicode source text. So, it's hard to ensure that the documentation examples have no errors.)
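
For comparison, here is a minimal sketch of a doctest that contains smart quotes, assuming Python 3 (where `str` is unicode and source files default to UTF-8) rather than the Python 2.7 environment used in this thread. The `shout` function is hypothetical and is not part of spaCy or its docs.

```python
import doctest


def shout(text):
    """Upper-case text, leaving curly quotes untouched.

    >>> shout('‘it’s mine.’')
    '‘IT’S MINE.’'
    """
    return text.upper()


# Parse and run the docstring example explicitly, so the outcome is
# available as a value instead of only being printed.
parser = doctest.DocTestParser()
test = parser.get_doctest(shout.__doc__, {"shout": shout}, "shout", None, 0)
runner = doctest.DocTestRunner(verbose=False)
results = runner.run(test)
print(results)
```

Whether this carries over to the real documentation examples depends on the Python version the docs are built with, so treat it as a starting point only.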

pul-kit123 commented 8 months ago

Comment by lock[bot] Wednesday May 09, 2018 at 18:32 GMT


This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.