aritter / twitter_nlp

Twitter NLP Tools
GNU General Public License v3.0

IOError: [Errno 32] Broken pipe #14

Open sunningboy opened 8 years ago

sunningboy commented 8 years ago

My OS is Mac OS X (MacBook Air).

How can I solve this?

$ cat test.1k.txt | python python/ner/extractEntities2.py
/bin/sh: .//python/cap/cap_classify: cannot execute binary file
Traceback (most recent call last):
  File "python/ner/extractEntities2.py", line 131, in <module>
    goodCap = capClassifier.Classify(words) > 0.9
  File ".//python/cap/cap_classifier.py", line 33, in Classify
    self.capClassifier.stdin.write("%s\n" % self.fe.Extract(' '.join(words)))
IOError: [Errno 32] Broken pipe

mdtareque commented 8 years ago

The same happens on Ubuntu 15.10:

$ cat test.1k.txt | python python/ner/extractEntities2.py 
.//python/cap/cap_classify: 1: .//python/cap/cap_classify: Syntax error: ")" unexpected
Traceback (most recent call last):
  File "python/ner/extractEntities2.py", line 131, in <module>
    goodCap = capClassifier.Classify(words) > 0.9
  File ".//python/cap/cap_classifier.py", line 33, in Classify
    self.capClassifier.stdin.write("%s\n" % self.fe.Extract(' '.join(words)))
IOError: [Errno 32] Broken pipe
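
A note on reading these tracebacks: the `Broken pipe` is probably a symptom rather than the root cause. In each report the line just before the traceback shows that the shell could not start the helper program (`cap_classify` here), so the Python side ends up writing to a pipe that nobody is reading. The sketch below reproduces that mechanism in isolation, assuming a POSIX system. The child is a stand-in Python one-liner that exits immediately, not the real `cap_classify`, and `write_to_exited_child` is a hypothetical name used only for illustration.

```python
import errno
import subprocess
import sys


def write_to_exited_child():
    """Write to the stdin of a child that has already exited; return the errno."""
    # Stand-in for a helper binary that fails to start: this child exits at once
    # without ever reading its stdin.
    child = subprocess.Popen([sys.executable, "-c", "pass"], stdin=subprocess.PIPE)
    child.wait()  # after this, no process holds the read end of the pipe
    try:
        child.stdin.write(b"some features\n")
        child.stdin.flush()  # with a buffered pipe, the failure surfaces here
    except (IOError, OSError) as exc:
        return exc.errno
    finally:
        try:
            child.stdin.close()
        except (IOError, OSError):
            pass  # close() flushes again and may fail the same way
    return None


if __name__ == "__main__":
    code = write_to_exited_child()
    print("errno: %s (EPIPE is %d)" % (code, errno.EPIPE))
```

If this is what is happening, the message to act on is the one printed before the traceback (for example `cannot execute binary file`), not the `IOError` itself.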

sruteesh commented 8 years ago

Any solutions? I have the same issue.

aritter commented 8 years ago

Maybe try running build.sh

-Alan

napsternxg commented 8 years ago

@sunningboy, @mdtareque and @sruteesh, can you try the branch mentioned in pull request #16? It has a much simpler interface that takes an input file instead of piping the input from cat.

sruteesh commented 8 years ago

Hi @napsternxg and @aritter, thanks for the prompt response. The issue still seems to be there: I'm getting the same error. I also tried it on a friend's laptop, with the same result.

$ python python/ner/extractEntities.py test.1k.txt
Starting with the following configuration
Input file: test.1k.txt
Text Position: 0
Output file: None
Chunk: False
POS: False
Event: False
Classify: False
Mallet Memory: 256m
No output file given. Will write to STDOUT.
Error: Could not find or load main class cc.mallet.fst.SimpleTaggerStdin
/bin/sh: .//python/cap/cap_classify: cannot execute binary file: Exec format error
Finished loading all models. Now reading from test.1k.txt and writing to None
Traceback (most recent call last):
  File "python/ner/extractEntities.py", line 167, in <module>
    goodCap = capClassifier.Classify(words) > 0.9
  File ".//python/cap/cap_classifier.py", line 33, in Classify
    self.capClassifier.stdin.write("%s\n" % self.fe.Extract(' '.join(words)))
IOError: [Errno 32] Broken pipe

Please look into the matter.

chandra589 commented 8 years ago

@napsternxg and @aritter, I'm getting the same error as @sruteesh. Can you please look into it?

napsternxg commented 8 years ago

@sruteesh @chandra589 I tried it just now on my machine and it works fine. Can you give details about your system?

sruteesh commented 8 years ago

@napsternxg I am using Python 3.4 and running it in a Cygwin x64 terminal on Windows 10.

napsternxg commented 8 years ago

I think this code is compatible only with Python 2.7 because of the print statement syntax it uses. Try with Python 2.7.
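
If the Python 2.7 assumption above holds, a small guard can report the mismatch up front instead of failing later on a syntax error. This is only a sketch: `interpreter_problem` is a hypothetical helper, not something that exists in the repository, and the 2.7 requirement is taken from the comment above rather than verified against the code.

```python
import sys


def interpreter_problem(version_info=sys.version_info):
    """Return an error message if the interpreter is not Python 2.7, else None."""
    major, minor = version_info[0], version_info[1]
    if (major, minor) != (2, 7):
        # Assumption from this thread: the scripts only run under Python 2.7.
        return "twitter_nlp scripts expect Python 2.7, found %d.%d" % (major, minor)
    return None


if __name__ == "__main__":
    problem = interpreter_problem()
    if problem:
        sys.stderr.write(problem + "\n")
```

A wrapper script could call `sys.exit(1)` when a message is returned, before launching `python/ner/extractEntities.py`.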

sidewallme commented 8 years ago

I have the same issue. Any solutions?

joanaz commented 8 years ago

I'm having the same issue.

/bin/sh: .//python/cap/cap_classify: No such file or directory
Finished loading all models. Now reading from test.1k.txt and writing to output.txt
Traceback (most recent call last):
  File "python/ner/extractEntities.py", line 167, in <module>
    goodCap = capClassifier.Classify(words) > 0.9
  File ".//python/cap/cap_classifier.py", line 33, in Classify
    self.capClassifier.stdin.write("%s\n" % self.fe.Extract(' '.join(words)))
IOError: [Errno 32] Broken pipe

I also got an error when running build.sh:

warning: optimization level '-O9' is not supported; using '-O3' instead
In file included from param.cpp:28:
In file included from ./common.h:76:
./getopt.h:131:12: error: conflicting types for 'getopt'
extern int getopt ();
           ^
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.11.sdk/usr/include/unistd.h:508:6: note: 
      previous declaration is here
int      getopt(int, char * const [], const char *) __DARWIN_ALIAS(getopt);
         ^
param.cpp:217:17: warning: conversion from string literal to 'char *' is
      deprecated [-Wc++11-compat-deprecated-writable-strings]
    char *tmp = "TinySVM::Param::set";
                ^
2 warnings and 1 error generated.
make[2]: *** [param.lo] Error 1
make[1]: *** [all-recursive] Error 1
make: *** [all-recursive-am] Error 2
ld: library not found for -lcrt0.o
clang: error: linker command failed with exit code 1 (use -v to see invocation)

I'm using Python 2.7.

rksaxena commented 8 years ago

Has anybody found a solution? I'm facing the same issue.

vikotse commented 8 years ago

I have the same issue.

Starting with the following configuration
Input file: test.1k.txt
Text Position: 0
Output file: output.txt
Chunk: False
POS: False
Event: False
Classify: False
Mallet Memory: 256m
/bin/sh: 1: java: not found
Finished loading all models. Now reading from test.1k.txt and writing to output.txt
Traceback (most recent call last):
  File "python/ner/extractEntities.py", line 197, in <module>
    ner.stdin.write(("\t".join(seq_features) + "\n").encode('utf8'))
IOError: [Errno 32] Broken pipe

The issue persists after I run build.sh.

My Python version: 2.7.12
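
The logs in this thread point at two external programs the scripts launch: `java` (here `java: not found`) and `python/cap/cap_classify` (earlier reports). A rough preflight check along the following lines could surface a missing prerequisite before the `Broken pipe` appears. It is a sketch under assumptions: `find_on_path` and `missing_prerequisites` are hypothetical helpers, the `cap_classify` location is inferred from the paths in the error messages, and the check cannot detect a binary built for the wrong platform (the `Exec format error` case).

```python
import os


def find_on_path(name, search_path=None):
    """Return the first executable called `name` on the search path, or None."""
    if search_path is None:
        search_path = os.environ.get("PATH", "")
    for directory in search_path.split(os.pathsep):
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


def missing_prerequisites(root, search_path=None):
    """List problems that would keep the helper processes from starting."""
    problems = []
    if find_on_path("java", search_path) is None:
        problems.append("java not found on PATH")
    # Location inferred from ".//python/cap/cap_classify" in the logs above.
    cap = os.path.join(root, "python", "cap", "cap_classify")
    if not os.path.isfile(cap):
        problems.append("missing " + cap)
    elif not os.access(cap, os.X_OK):
        problems.append("not executable: " + cap)
    return problems


if __name__ == "__main__":
    # Assumes TWITTER_NLP points at the repository root, as in the run commands.
    for problem in missing_prerequisites(os.environ.get("TWITTER_NLP", ".")):
        print(problem)
```

An empty result only means both programs are present and executable. It does not guarantee that Mallet's classpath is complete or that `build.sh` produced a working binary.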

DerekZH commented 8 years ago

Same issue here; my Python version is 2.7.10. It would be great to see a fix. @napsternxg @aritter

DaehanKim commented 7 years ago

I have a similar issue.

Anonymous:~/twitter_nlp$ python python/ner/extractEntities.py test.1k.txt -o output.txt
Starting with the following configuration
----------------------------------------
Input file: test.1k.txt
Text Position: 0
Output file: output.txt
Chunk: False
POS: False
Event: False
Classify: False
Mallet Memory: 256m
----------------------------------------
/bin/sh: 1: java: not found
Finished loading all models. Now reading from test.1k.txt and writing to output.txt
Traceback (most recent call last):
  File "python/ner/extractEntities.py", line 197, in <module>
    ner.stdin.write(("\t".join(seq_features) + "\n").encode('utf8'))
IOError: [Errno 32] Broken pipe

I also found that running ./build.sh produces compiler warnings such as:

...

rm -f .libs/param.lo
c++ -DHAVE_CONFIG_H -I. -I. -I.. -Wall -O9 -funroll-all-loops -finline -ffast-math -c param.cpp  -fPIC -DPIC -o .libs/param.lo
param.cpp: In member function 'int TinySVM::Param::set(const char*)':
param.cpp:217:17: warning: deprecated conversion from string constant to 'char*' [-Wwrite-strings]
     char *tmp = "TinySVM::Param::set";
                 ^
c++ -DHAVE_CONFIG_H -I. -I. -I.. -Wall -O9 -funroll-all-loops -finline -ffast-math -c param.cpp -o param.o >/dev/null 2>&1
mv -f .libs/param.lo param.lo

...

rm -f .libs/model.lo
c++ -DHAVE_CONFIG_H -I. -I. -I.. -Wall -O9 -funroll-all-loops -finline -ffast-math -c model.cpp  -fPIC -DPIC -o .libs/model.lo
model.cpp: In member function 'virtual int TinySVM::Model::read(const char*, const char*, int)':
model.cpp:230:57: warning: ignoring return value of 'int fscanf(FILE*, const char*, ...)', declared with attribute warn_unused_result [-Wunused-result]
   fscanf (fp, "%s Version %s%*[^\n]\n", tmpbuf, version);
                                                         ^
model.cpp:231:50: warning: ignoring return value of 'int fscanf(FILE*, const char*, ...)', declared with attribute warn_unused_result [-Wunused-result]
   fscanf (fp, "%d%*[^\n]\n",  &param.kernel_type);
                                                  ^
model.cpp:232:45: warning: ignoring return value of 'int fscanf(FILE*, const char*, ...)', declared with attribute warn_unused_result [-Wunused-result]
   fscanf (fp, "%d%*[^\n]\n",  &param.degree);
                                             ^
model.cpp:233:46: warning: ignoring return value of 'int fscanf(FILE*, const char*, ...)', declared with attribute warn_unused_result [-Wunused-result]
   fscanf (fp, "%lf%*[^\n]\n", &param.param_g);
                                              ^
model.cpp:234:46: warning: ignoring return value of 'int fscanf(FILE*, const char*, ...)', declared with attribute warn_unused_result [-Wunused-result]
   fscanf (fp, "%lf%*[^\n]\n", &param.param_s);
                                              ^
model.cpp:235:46: warning: ignoring return value of 'int fscanf(FILE*, const char*, ...)', declared with attribute warn_unused_result [-Wunused-result]
   fscanf (fp, "%lf%*[^\n]\n", &param.param_r);
                                              ^
model.cpp:236:38: warning: ignoring return value of 'int fscanf(FILE*, const char*, ...)', declared with attribute warn_unused_result [-Wunused-result]
   fscanf (fp, "%s%*[^\n]\n",  tmpbuf);
                                      ^
model.cpp:249:34: warning: ignoring return value of 'int fscanf(FILE*, const char*, ...)', declared with attribute warn_unused_result [-Wunused-result]
   fscanf (fp, "%lf%*[^\n]\n", &b);
                                  ^

jamestch commented 7 years ago

I have run into a similar problem. Has anyone found a solution? When I run "python python/ner/extractEntities.py test.1k.txt -o output.txt", I get the exception below:

Starting with the following configuration
Input file: test.1k.txt
Text Position: 0
Output file: output.txt
Chunk: False
POS: False
Event: False
Classify: False
Mallet Memory: 256m
Exception in thread "main" java.lang.NoClassDefFoundError: bsh/Interpreter
    at java.lang.ClassLoader.defineClass1(Native Method)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:760)
    at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
    at java.net.URLClassLoader.defineClass(URLClassLoader.java:467)
    at java.net.URLClassLoader.access$100(URLClassLoader.java:73)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:368)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:362)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:361)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    at cc.mallet.util.CommandOption.(CommandOption.java:62)
    at cc.mallet.util.CommandOption$Double.(CommandOption.java:483)
    at cc.mallet.fst.SimpleTaggerStdin.(SimpleTaggerStdin.java:173)
Caused by: java.lang.ClassNotFoundException: bsh.Interpreter
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    ... 15 more
Finished loading all models. Now reading from test.1k.txt and writing to output.txt
Traceback (most recent call last):
  File "python/ner/extractEntities.py", line 197, in <module>
    ner.stdin.write(("\t".join(seq_features) + "\n").encode('utf8'))
IOError: [Errno 32] Broken pipe

Johnsonxiong commented 6 years ago

Hey, guys, good news for Mac users!

I ended up being able to build on Mac (10.12.6) by doing the following:

Download TinySVM and unzip it into the twitter_nlp-master folder: https://github.com/shogo82148/TinySVM

Change build.sh in twitter_nlp-master as below:

cd hbc/models
gcc -O3 labels.c stats.c samplib.c LabeledLDA_infer_stdin.c -o LabeledLDA_infer_stdin.out -lm
cd ../../TinySVM
./configure --prefix=pwd/../ && make && make install
cd ../python/cap
./build.sh

Then change python/cap/build.sh to the following:

c++ -o cap_classify cap_classify.cpp -ltinysvm

Then run:

$ ./build.sh
$ export TWITTER_NLP=./
$ cat test.1k.txt | python2 python/ner/extractEntities2.py

Don't forget to install tinysvm anyway, using:

$ brew install tinysvm

This works! Below is part of the result:

@Jessica_Chobot/O did/O you/O see/O the/O yakuza/B-ENTITY vs/O zombies/O ..../O smh/O but/O cool/O at/O the/O same/O time/O
RT/O @daviddesrosiers/O :/O Happy/O birthday/O @chuckcomeau/O !/O have/O fun/O in/O vancouver/B-ENTITY tonight/O !/O
Spotted/O :/O Kanye/B-ENTITY West/I-ENTITY Celebrates/O LAMB/B-ENTITY With/O Gwen/B-ENTITY Stefani/I-ENTITY :/O New/B-ENTITY York/I-ENTITY Fashion/I-ENTITY Week/I-ENTITY is/O coming/O to/O a/O close/O ,/O but/O not/O before/O .../O http://bit.ly/cSyZUi/O
@zeeDOTi/O i/O might/O join/O in/O if/O I/O make/O it/O home/O in/O time/O ./O :)/O

Hope this helps.

vrnmthr commented 5 years ago

@Johnsonxiong, that worked for me, thanks! Maybe someone can create a PR for this.

The only thing I had to change was the $PWD reference in ./build.sh:

cd hbc/models
gcc -O3 labels.c stats.c samplib.c LabeledLDA_infer_stdin.c -o LabeledLDA_infer_stdin.out -lm
cd ../../TinySVM
./configure --prefix=$PWD/../ && make && make install
cd ../python/cap
./build.sh

chandrasg commented 4 years ago

It seems TinySVM no longer has a configure file: https://github.com/shogo82148/TinySVM