chop-dbhi / twitter-adr-blstm

A model for finding mentions of adverse drug reactions in Twitter posts
GNU General Public License v3.0

NoneType not iterable: An error occurred #2

Closed o0windseed0o closed 6 years ago

o0windseed0o commented 6 years ago

When running the last step pre_train_test.sh, there is an error:

Loading embeddings...
Traceback (most recent call last):
  File "./adr_label.py", line 437, in <module>
    train_set, valid_set, test_set, dic = prep.load_adefull(opts.picklefile)
TypeError: 'NoneType' object is not iterable

Any clues why this happened? It seems that I cannot load the picklefile, but how do I get this file?

acocos commented 6 years ago

Can you provide the full stack trace please?

o0windseed0o commented 6 years ago

@acocos thanks for your quick reply! See below:

Traceback (most recent call last):
  File "./prep.py", line 192, in <module>
    sss = StratifiedShuffleSplit(t_class, 1, test_size=opts.validpct, random_state=0)
  File "/home/user1/.local/lib/python2.7/site-packages/sklearn/cross_validation.py", line 1073, in __init__
    if np.min(np.bincount(self.y_indices)) < 2:
  File "/home/user1/.local/lib/python2.7/site-packages/numpy/core/fromnumeric.py", line 2442, in amin
    initial=initial)
  File "/home/user1/.local/lib/python2.7/site-packages/numpy/core/fromnumeric.py", line 83, in _wrapreduction
    return ufunc.reduce(obj, axis, dtype, out, **passkwargs)
ValueError: zero-size array to reduction operation minimum which has no identity

Loading embeddings...
Traceback (most recent call last):
  File "./adr_label.py", line 437, in <module>
    train_set, valid_set, test_set, dic = prep.load_adefull(opts.picklefile)
TypeError: 'NoneType' object is not iterable

I removed some runtime warnings emitted by theano and numpy.

acocos commented 6 years ago

@o0windseed0o it looks like one or more arrays in the pickle file you're trying to load are empty.
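
As a quick illustration (not code from this repo), the ValueError in your first traceback is what you get when the label array passed to StratifiedShuffleSplit is empty:

```python
import numpy as np

# Empty integer label array, as you would end up with if no tweets were downloaded.
t_class = np.array([], dtype=int)

# Mirrors the check inside the old sklearn.cross_validation.StratifiedShuffleSplit:
# np.min over an empty bincount raises
# "zero-size array to reduction operation minimum which has no identity".
np.min(np.bincount(t_class))
```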

Have you checked whether the processed train and test files are generated by the calls to create_adr_dataset (lines 161 and 162 of prep.py)? They should be written to the following files:

./data/seq_labeling/processed/train/asu_fullanno_train
./data/seq_labeling/processed/train/chop_fullanno_train
./data/seq_labeling/processed/test/asu_fullanno_test
./data/seq_labeling/processed/test/chop_fullanno_test

If any of these has zero lines, check whether the download_tweets.py script is working for you. Running it today, it returned 207 tweets in the test set and 594 tweets in the training set.
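
If it helps, here is a quick, repo-independent sketch for checking those counts (the paths are just the ones listed above):

```python
import os

# Count lines in each processed file to confirm they are non-empty.
paths = [
    './data/seq_labeling/processed/train/asu_fullanno_train',
    './data/seq_labeling/processed/train/chop_fullanno_train',
    './data/seq_labeling/processed/test/asu_fullanno_test',
    './data/seq_labeling/processed/test/chop_fullanno_test',
]

for p in paths:
    if not os.path.exists(p):
        print('%s: missing' % p)
        continue
    with open(p) as f:
        n = sum(1 for _ in f)
    print('%s: %d lines' % (p, n))
```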

o0windseed0o commented 6 years ago

@acocos Aha, I checked those files and found that all of them are empty. Maybe the download script doesn't work for me. Any idea how this happens? Perhaps a firewall?

acocos commented 6 years ago

Not sure. You can try stepping through the tweets by calling get_tweet_text(userid, tweetid) for each tweet in one of the input files. The errors returned should give you some more information about what's going on.
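
Something along these lines should work as a starting point (a sketch only: the import location, the input path, and the column order are assumptions, so adjust them to match download_tweets.py and your input files):

```python
# Sketch: assumes get_tweet_text lives in download_tweets.py and that the input
# file is tab-separated with a tweet id and user id in the first two columns.
from download_tweets import get_tweet_text

with open('./data/seq_labeling/test/asu_tweet_ids.tsv') as f:  # hypothetical path
    for line in f:
        fields = line.rstrip('\n').split('\t')
        tweetid, userid = fields[0], fields[1]  # column order is a guess
        try:
            text = get_tweet_text(userid, tweetid)
            print('%s: %s' % (tweetid, text))
        except Exception as e:
            print('%s failed: %r' % (tweetid, e))
```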