IndicoDataSolutions / Passage

A little library for text analysis with RNNs.
MIT License

TypeError: not all arguments converted during string formatting #15

Closed ranti-iitg closed 9 years ago

ranti-iitg commented 9 years ago

Hi, getting this error.

Traceback (most recent call last):
  File "dummy.py", line 15, in <module>
    model = RNN(layers=layers, cost='BinaryCrossEntropy')
  File "c:\users\user1\documents\github\passage\passage\models.py", line 44, in __init__
    self.y_tr = self.layers[-1].output(dropout_active=True)
  File "c:\users\user1\documents\github\passage\passage\layers.py", line 275, in output
    X = self.l_in.output(dropout_active=dropout_active)
  File "c:\users\user1\documents\github\passage\passage\layers.py", line 239, in output
    outputs_info=[repeat(self.h0, x_h.shape[1], axis=0)],
  File "C:\Users\user1\AppData\Local\Enthought\Canopy\User\lib\site-packages\theano\tensor\extra_ops.py", line 360, in repeat
    return RepeatOp(axis=axis)(x, repeats)
  File "C:\Users\user1\AppData\Local\Enthought\Canopy\User\lib\site-packages\theano\gof\op.py", line 399, in __call__
    node = self.make_node(*inputs, **kwargs)
  File "C:\Users\user1\AppData\Local\Enthought\Canopy\User\lib\site-packages\theano\tensor\extra_ops.py", line 259, in make_node
    % numpy_unsupported_dtypes), repeats.dtype)
TypeError: not all arguments converted during string formatting

Contents of file dummy.py

from passage.preprocessing import Tokenizer
from passage.layers import Embedding, GatedRecurrent, Dense
from passage.models import RNN
from passage.utils import save, load

train_text = ['hello world', 'foo bar']
train_labels = [0, 1]
tokenizer = Tokenizer()
train_tokens = tokenizer.fit_transform(train_text)

layers = [
    Embedding(size=128, n_features=tokenizer.n_features),
    GatedRecurrent(size=128),
    Dense(size=1, activation='sigmoid')
]

model = RNN(layers=layers, cost='BinaryCrossEntropy')
model.fit(train_tokens, train_labels)

model.predict(tokenizer.transform(test_text))
save(model, 'save_test.pkl')
model = load('save_test.pkl')

ranti-iitg commented 9 years ago

pip freeze dump

-registry-path==1.0
appinst==2.1.2
apptools==4.2.1
backports.ssl-match-hostname==3.2a3
BeautifulSoup==3.2.1
beautifulsoup4==4.3.2
Canopy==1.5.1.dev7367
canopydebugger==0.1.1.dev0
canopydebugger-addon==0.1.1.dev0
casuarius==1.1
chaco==4.5.0
cloud==2.4.6
configobj==5.0.6
coverage==3.7.1
Cython==0.22
decorator==3.4.0
docopt==0.6.2
docutils==0.12
enable==4.4.1
enaml==0.9.8
enclosure==0.2
encore==0.6.0
enstaller==4.8.1
envisage==4.4.0
esky==0.9.8.dev0
etsproxy==0.1.2
Examples==7.3
faulthandler==2.3
feedparser==5.1.3
futures==2.2.0
gdbn==0.1
gensim==0.8.3
gnumpy==0.2
grits-client==0.1
html5lib==0.999
idle==2.7.3
ipython==3.0.0
Jinja2==2.7.3
joblib==0.8.4
jsonpickle==0.4.0
jsonschema==2.4.0
kernmagic==0.2.0
keyring==4.0
libpython==1.3.0
lxml==3.4.2
MarkupSafe==0.23
matplotlib==1.4.3
mechanize==0.2.5
mingw==4.8.1
mistune==0.5
MKL==10.3
mock==1.0.1
nltk==3.0.1
nolearn==0.4
nose==1.3.4
numpy==1.8.1
pandas==0.15.2
-e passage==0.2.4
pbr==0.10.7
PIL==1.1.7
ply==3.4
psutil==2.1.1
ptvs==2.0.0
pyaudio==0.2.4
pycrypto==2.6.1
pyface==4.4.0
pyflakes==0.4.5.dev80
pyglet==1.1.4
Pygments==2.0.2
-e pylearn2==0.1.dev0
pyparsing==2.0.3
pyreadline==2.0.0
Pyro4==4.11
PySide==1.2.2
python-dateutil==2.2.0
PythonDoc==2.7.3
pytz==2014.9.0
pywin32==219.0.0
PyYAML==3.11
pyzmq==14.5.0
requests==2.5.3
scikit-learn==0.15.2
scikits.learn==0.8
scipy==0.15.1
six==1.9.0
SQLAlchemy==0.9.8
sqlalchemy-migrate==0.9.4
sqlparse==0.1.14
ssl-match-hostname==3.4.0.2
supplement===0.5dev.dev202
sympy==0.7.6
Tempita==0.5.2
textblob==0.9.0
Theano==0.6.0
tornado==4.1
traits==4.5.0
traitsui==4.4.0
wxPython==2.8.10.1

ranti-iitg commented 9 years ago

Updating Theano to 0.7 solves the issue. The problem is that repeat uses python_int_bitwidth where it should use local_bitwidth. The two are the same on Unix systems but differ on Windows, and this was corrected in the 0.7 update.
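
For anyone landing here later, a minimal sketch of the two things going on, using only the standard library. The helper names (c_long_bitwidth, pointer_bitwidth, format_unsupported), the message text, and the dtype tuples are illustrative assumptions, not Theano's actual code. The first part compares the width of a C long with the width of a pointer, which is the kind of difference described above. The second part shows why the traceback ends in a string formatting error: a multi-element tuple on the right of `%` is unpacked into several arguments, which a single `%s` cannot consume.

```python
import struct


def c_long_bitwidth():
    # Width of a C long in bits ('l' in struct notation).
    return struct.calcsize('l') * 8


def pointer_bitwidth():
    # Width of a memory pointer in bits ('P' is void* in struct notation).
    return struct.calcsize('P') * 8


def format_unsupported(dtypes):
    # Hypothetical message with one %s placeholder. Passing a tuple directly
    # to % unpacks it, so a tuple with more than one element raises TypeError.
    return "dtypes %s are not supported" % dtypes


# On 64-bit Windows a C long is 32 bits while a pointer is 64 bits;
# on typical 64-bit Unix systems both are 64 bits.
print("C long: %d bits, pointer: %d bits"
      % (c_long_bitwidth(), pointer_bitwidth()))

# A one-element tuple formats without error.
print(format_unsupported(('uint64',)))

# A three-element tuple triggers the error seen in the traceback.
try:
    format_unsupported(('int64', 'uint32', 'uint64'))
except TypeError as exc:
    print("TypeError: %s" % exc)
```

If this reading is right, the formatting error masks the message the code was trying to raise, which is why the traceback mentions string formatting rather than dtypes.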