attardi / deepnl

Deep Learning for Natural Language Processing
GNU General Public License v3.0

dl-sentiwords throws error "IndexError: index 1 is out of bounds for axis 0 with size 1" #50

Open AzizCode92 opened 7 years ago

AzizCode92 commented 7 years ago

Hi all. When I run dl-sentiwords.py trained1.tsv --vocab words.txt --vectors vectors.txt, I get this error:

    Saving vocabulary in words.txt
    Creating new network...
    ... with the following parameters:

    Input layer size: 550
    Hidden layer size: 200
    Output size: 2
    Starting training

    Traceback (most recent call last):
      File "/usr/local/bin/dl-sentiwords.py", line 4, in <module>
        __import__('pkg_resources').run_script('deepnl==1.3.18', 'dl-sentiwords.py')
      File "/usr/lib/python2.7/dist-packages/pkg_resources/__init__.py", line 742, in run_script
        self.require(requires)[0].run_script(script_name, ns)
      File "/usr/lib/python2.7/dist-packages/pkg_resources/__init__.py", line 1510, in run_script
        exec(script_code, namespace, namespace)
      File "/usr/local/lib/python2.7/dist-packages/deepnl-1.3.18-py2.7-linux-x86_64.egg/EGG-INFO/scripts/dl-sentiwords.py", line 218, in

      File "deepnl/sentiwords.pyx", line 53, in itertrie
      File "deepnl/sentiwords.pyx", line 126, in deepnl.sentiwords.SentimentTrainer._train_pair_s
      File "deepnl/extractors.pyx", line 153, in deepnl.extractors.Converter.lookup
      File "deepnl/extractors.pyx", line 236, in deepnl.extractors.Extractor.__getitem__
    IndexError: index 1 is out of bounds for axis 0 with size 1

trained1.tsv is a file with the following format:

  <SID><tab><UID><tab><positive|negative|neutral|objective><tab><TWITTER_MESSAGE>

I obtained the TSV file by converting a large dataset of tweets to TSV and transforming the columns to match the format above. For further details, my conversion code is here: https://github.com/AzizCode92/text_mining_project/blob/master/csv_tsv.py
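
For anyone debugging a similar input file, here is a minimal sketch that checks each row against the four-column format above. It is not part of deepnl, and the thread does not establish that malformed rows cause the IndexError; the helper name find_bad_rows, the sample rows, and the rule that an empty message counts as a problem are all assumptions made for illustration.

```python
# Hypothetical helper, not part of deepnl: flags rows that do not match
# <SID><tab><UID><tab><label><tab><TWITTER_MESSAGE>.
VALID_LABELS = {"positive", "negative", "neutral", "objective"}


def find_bad_rows(lines):
    """Yield (line_number, reason) for each row that breaks the 4-column format."""
    for number, line in enumerate(lines, start=1):
        fields = line.rstrip("\r\n").split("\t")
        if len(fields) != 4:
            yield number, "expected 4 tab-separated fields, got %d" % len(fields)
        elif fields[2] not in VALID_LABELS:
            yield number, "unknown label %r" % fields[2]
        elif not fields[3].strip():
            # Assumption: an empty message is treated as suspicious here.
            yield number, "empty message"


# Made-up sample rows, not taken from the real dataset.
sample = [
    "1\tu1\tpositive\tgreat day\n",
    "2\tu2\tbogus\tsome text\n",
    "3\tu3\tnegative\n",
    "4\tu4\tneutral\t   \n",
]
problems = list(find_bad_rows(sample))
for number, reason in problems:
    print("line %d: %s" % (number, reason))
```

To check a real file, pass an open file object instead of the sample list, for example find_bad_rows(open("trained1.tsv")).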

AzizCode92 commented 7 years ago

The issue is solved. I was working with a very large TSV file, so I split it into parts, and the error went away.
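
A rough sketch of one way to split a large TSV file into parts, assuming one record per line. This is not necessarily the script that was used here: the function name split_tsv, the part-NNNN.tsv naming, and the chunk size are hypothetical, and how the parts are then passed to dl-sentiwords.py is not covered.

```python
import os
import tempfile


def split_tsv(path, lines_per_part, out_dir):
    """Copy `path` into consecutive files of at most `lines_per_part` lines.

    Files are opened in binary mode so the bytes of each line are preserved.
    Returns the list of part paths in order.
    """
    part_paths = []
    out = None
    with open(path, "rb") as src:
        for index, line in enumerate(src):
            if index % lines_per_part == 0:
                # Start a new part every `lines_per_part` lines.
                if out is not None:
                    out.close()
                part_path = os.path.join(out_dir, "part-%04d.tsv" % len(part_paths))
                out = open(part_path, "wb")
                part_paths.append(part_path)
            out.write(line)
    if out is not None:
        out.close()
    return part_paths


# Small demonstration on a made-up 5-line file in a temporary directory.
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "trained1.tsv")
with open(demo_path, "wb") as f:
    for i in range(5):
        f.write(("%d\tu%d\tpositive\tmessage %d\n" % (i, i, i)).encode("utf-8"))

parts = split_tsv(demo_path, 2, demo_dir)
print([os.path.basename(p) for p in parts])
```

Splitting by line count keeps every record whole as long as no tweet contains an embedded newline; if some do, the rows would need to be parsed before splitting.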

gabrer commented 6 years ago

Hi! Sorry to bother you, but I see that you have used this library recently.

I am training the sentiment-specific embeddings. At the end of each epoch, I get a message like this:

23 epochs Examples: 7818897 Error: 326588146461.176880 Accuracy: 0.000000 23589 corrections skipped

The accuracy always remains zero, no matter the number of epochs. Is that OK? Did you get the same accuracy?

Thank you! :)

AzizCode92 commented 6 years ago

Hi! I saw Mr. Attardi's comment about the meaning of the accuracy and error values, and he said:

Don't worry about those numbers. You should get useable embeddings anyway.

gabrer commented 6 years ago

Thank you! I have just found the comment you referred to: https://github.com/attardi/deepnl/issues/32

waniss commented 6 years ago

@AzizCode92 I got the same issue, and I'm also using a big file. Sorry, but dl-sentiwords.py needs a single file (according to the example). How did you manage to input several files? Thanks.

1eclipse commented 4 years ago

Hello! Have you successfully installed deepnl? Where did you install it: Windows, Linux, or Mac? I have some problems on Windows. Can you help me?