Open · hoogang opened this issue 6 years ago
I followed the setup instructions: "install CoreNLP with the Chinese package according to the official CoreNLP guide; you may specify the classpath in an environment variable or in the file drqa/tokenizers/Zh_tokenizer.py. Then you may download vectors and training sets to start your work." But it doesn't work.
I tried:
from drqa.tokenizers import CoreNLPTokenizer
tok = CoreNLPTokenizer()
and ran this script:
python scripts/reader/preprocess.py data/datasets data/datasets --split SQuAD-v1.1-train --tokenizer corenlp
This all works. I also tested CoreNLPTokenizer on Chinese word segmentation:
>>> from drqa.tokenizers import CoreNLPTokenizer
>>> tok = CoreNLPTokenizer()
[init tokenizer done]
>>> tok.tokenize('hello world 湖北省武汉市公共交通系统').words()
['hello', 'world', '湖北省', '武汉市', '公共', '交通', '系统']
The command-line invocation is also OK, as follows:
hugang@server-white:~$ java -mx3g -cp "/home/hugang/DrQA/data/corenlp/*" edu.stanford.nlp.pipeline.StanfordCoreNLP -annotators tokenize,ssplit,pos,lemma,ner -props StanfordCoreNLP-chinese.properties
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator tokenize
[main] INFO edu.stanford.nlp.wordseg.ChineseDictionary - Loading Chinese dictionaries from 1 file:
[main] INFO edu.stanford.nlp.wordseg.ChineseDictionary - edu/stanford/nlp/models/segmenter/chinese/dict-chris6.ser.gz
[main] INFO edu.stanford.nlp.wordseg.ChineseDictionary - Done. Unique words in ChineseDictionary is: 423200.
[main] INFO edu.stanford.nlp.ie.AbstractSequenceClassifier - Loading classifier from edu/stanford/nlp/models/segmenter/chinese/ctb.gz ... done [12.3 sec].
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator ssplit
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator pos
[main] INFO edu.stanford.nlp.tagger.maxent.MaxentTagger - Loading POS tagger from edu/stanford/nlp/models/pos-tagger/chinese-distsim/chinese-distsim.tagger ... done [2.9 sec].
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator lemma
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator ner
[main] INFO edu.stanford.nlp.ie.AbstractSequenceClassifier - Loading classifier from edu/stanford/nlp/models/ner/chinese.misc.distsim.crf.ser.gz ... done [3.8 sec].
Entering interactive shell. Type q RETURN or EOF to quit.
NLP> 湖北省武安市 今天天气很不错 可以出去郊游
Sentence #1 (9 tokens):
湖北省武安市 今天天气很不错 可以出去郊游
[Text=湖北省 CharacterOffsetBegin=0 CharacterOffsetEnd=3 PartOfSpeech=NR Lemma=湖北省 NamedEntityTag=GPE]
[Text=武安市 CharacterOffsetBegin=3 CharacterOffsetEnd=6 PartOfSpeech=NR Lemma=武安市 NamedEntityTag=GPE]
[Text=今天 CharacterOffsetBegin=7 CharacterOffsetEnd=9 PartOfSpeech=NT Lemma=今天 NamedEntityTag=DATE NormalizedNamedEntityTag=XXXX-XX-XX]
[Text=天气 CharacterOffsetBegin=9 CharacterOffsetEnd=11 PartOfSpeech=NN Lemma=天气 NamedEntityTag=O]
[Text=很 CharacterOffsetBegin=11 CharacterOffsetEnd=12 PartOfSpeech=AD Lemma=很 NamedEntityTag=O]
[Text=不错 CharacterOffsetBegin=12 CharacterOffsetEnd=14 PartOfSpeech=VA Lemma=不错 NamedEntityTag=O]
[Text=可以 CharacterOffsetBegin=15 CharacterOffsetEnd=17 PartOfSpeech=VV Lemma=可以 NamedEntityTag=O]
[Text=出去 CharacterOffsetBegin=17 CharacterOffsetEnd=19 PartOfSpeech=VV Lemma=出去 NamedEntityTag=O]
[Text=郊游 CharacterOffsetBegin=19 CharacterOffsetEnd=21 PartOfSpeech=VV Lemma=郊游 NamedEntityTag=O]
But when I run this script:
python scripts/reader/preprocess.py data/datasets data/datasets --split webqa-test --tokenizer corenlp
("webqa-test" is a test set for Chinese reading comprehension), I get:
Traceback (most recent call last):
File "/home/hugang/.pyenv/versions/3.6.0/lib/python3.6/multiprocessing/process.py", line 249, in _bootstrap
self.run()
File "/home/hugang/.pyenv/versions/3.6.0/lib/python3.6/multiprocessing/process.py", line 93, in run
self._target(*self._args, **self._kwargs)
File "/home/hugang/.pyenv/versions/3.6.0/lib/python3.6/multiprocessing/pool.py", line 103, in worker
initializer(*initargs)
File "scripts/reader/preprocess.py", line 29, in init
TOK = tokenizer_class(**options)
File "/home/hugang/DrQA/drqa/tokenizers/corenlp_tokenizer.py", line 37, in __init__
self._launch()
File "/home/hugang/DrQA/drqa/tokenizers/corenlp_tokenizer.py", line 68, in _launch
self.corenlp.expect_exact('NLP>', searchwindowsize=100)
File "/home/hugang/.pyenv/versions/3.6.0/lib/python3.6/site-packages/pexpect/spawnbase.py", line 390, in expect_exact
return exp.expect_loop(timeout)
File "/home/hugang/.pyenv/versions/3.6.0/lib/python3.6/site-packages/pexpect/expect.py", line 107, in expect_loop
return self.timeout(e)
File "/home/hugang/.pyenv/versions/3.6.0/lib/python3.6/site-packages/pexpect/expect.py", line 70, in timeout
raise TIMEOUT(msg)
pexpect.exceptions.TIMEOUT: Timeout exceeded.
<pexpect.pty_spawn.spawn object at 0x7f19c6d072b0>
command: /bin/bash
args: ['/bin/bash']
buffer (last 100 chars): b'er-white:~/DrQA$ [main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator tokenize\r\n'
before (last 100 chars): b'er-white:~/DrQA$ [main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator tokenize\r\n'
after: <class 'pexpect.exceptions.TIMEOUT'>
match: None
match_index: None
exitstatus: None
flag_eof: False
pid: 15458
child_fd: 21
closed: False
timeout: 60
delimiter: <class 'pexpect.exceptions.EOF'>
logfile: None
logfile_read: None
logfile_send: None
maxread: 100000
ignorecase: False
searchwindowsize: None
delaybeforesend: 0
delayafterclose: 0.1
delayafterterminate: 0.1
searcher: searcher_string:
0: "b'NLP>'"
The error is the same as last time. I have tried lots of methods and still can't solve this problem, which leaves me confused. Please help me.
First, you need to change the CoreNLP path directly in my code; you should find a FIXME in the ZhTokenizer class. Second, as an old issue reported, the pexpect package version 4.4 may have unwanted behavior, so make sure you have the latest build. If you still hit the problem, try printing the actual command directly and see if it works.
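To follow that last suggestion, a minimal sketch that rebuilds and prints the launch command so you can paste it into a shell (the classpath and properties file here are assumptions taken from the log output in this thread; adjust them to your install):

```python
# Rebuild the CoreNLP launch command for manual testing in a shell.
# Classpath and properties file are assumptions based on the logs above.
classpath = '/home/hugang/DrQA/data/corenlp/*'
annotators = 'tokenize,ssplit,pos,lemma,ner'
cmd = ['java', '-mx3g', '-cp', classpath,
       'edu.stanford.nlp.pipeline.StanfordCoreNLP',
       '-annotators', annotators,
       '-props', 'StanfordCoreNLP-chinese.properties']
# Paste the printed line into a terminal and check that a 'NLP>' prompt appears.
print(' '.join(cmd))
```

If the pasted command does not reach the `NLP>` prompt on its own, the problem is in the CoreNLP setup rather than in pexpect.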
Hi, I'm a colleague from 华师; I've run into this problem too and it has bothered me for a long time. How did you solve it? It's really frustrating.
> Hi, I'm a colleague from 华师; I've run into this problem too and it has bothered me for a long time. How did you solve it?

I solved the bug as follows: add the workers parameter (--workers 1):
python scripts/reader/preprocess.py data/datasets data/datasets --split webqa-test --tokenizer corenlp --workers 1
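The reason --workers 1 helps can be read off the traceback: preprocess.py builds one tokenizer per pool worker via a process initializer (`initializer(*initargs)` → `TOK = tokenizer_class(**options)`), so every extra worker launches another CoreNLP JVM, and the slow parallel startups can exceed pexpect's 60-second timeout. A minimal, runnable sketch of that initializer pattern, with the tokenizer stubbed out by a plain object:

```python
from multiprocessing import Pool

TOK = None

def init():
    # In preprocess.py this is CoreNLPTokenizer(**options): a slow JVM launch.
    # Each worker process runs init() once and gets its own TOK.
    global TOK
    TOK = object()

def work(x):
    # Workers can rely on TOK being initialized before any task runs.
    return (TOK is not None, x * 2)

if __name__ == '__main__':
    # processes=1 mirrors --workers 1: only one tokenizer (one JVM) starts.
    with Pool(processes=1, initializer=init) as pool:
        results = pool.map(work, [1, 2, 3])
    print(results)  # every task saw an initialized TOK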
But I met another problem: following DrQA, I used the processed Chinese Wikipedia data to build the TF-IDF model data:
hugang@server-white:~/DrQA$ python ./scripts/retriever/build_tfidf.py data/wikipedia/wiki_zhs.db data/wikipedia --ngram 4 --hash-size 2 --tokenizer corenlp
12/27/2018 09:07:12 PM: [ Counting words... ]
12/27/2018 09:07:14 PM: [ Mapping... ]
12/27/2018 09:07:14 PM: [ -------------------------Batch 1/11------------------------- ]
[init tokenizer done]
[init tokenizer done]
[init tokenizer done]
[init tokenizer done]
[init tokenizer done]
[init tokenizer done]
[init tokenizer done]
Process ForkPoolWorker-5:
Traceback (most recent call last):
File "/home/hugang/.pyenv/versions/3.6.0/lib/python3.6/site-packages/pexpect/expect.py", line 111, in expect_loop
incoming = spawn.read_nonblocking(spawn.maxread, timeout)
File "/home/hugang/.pyenv/versions/3.6.0/lib/python3.6/site-packages/pexpect/pty_spawn.py", line 482, in read_nonblocking
raise TIMEOUT('Timeout exceeded.')
pexpect.exceptions.TIMEOUT: Timeout exceeded.
Thanks for the explanation. Could you leave some contact information? I've been studying this recently too, and we could discuss any problems together.
Contact me on QQ: 349359883
> Hi, I'm a colleague from 华师 across the street; I've run into this problem too and it has bothered me for a long time. How did you solve it?

This bug is solved. On Linux, verified repeatedly: upgrade to Java 11 plus pexpect 4.6.0, with everything at the latest versions. No need to worry on macOS.
There can be many causes; analyze your specific error. First, I have received reports that some versions of pexpect are problematic, so consider switching to a different pexpect version. Second, these errors may come from the tokenization command itself failing; try printing the exact command being executed and check whether it runs on the command line. Finally, I suggest moving to the official Facebook repo; my code has not been maintained for a long time, though the datasets and so on can still serve as a reference.
Thanks, it's all solved today.
I ran into the same problem: the damn script added background-color strings to its output, which kept pexpect from matching the prompt.
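For reference, those background-color strings are ANSI escape sequences like `\x1b[42m...\x1b[0m`. One possible workaround (a sketch, not what DrQA itself does) is to strip them from the child's output before matching the prompt:

```python
import re

# CSI 'm' sequences carry colors and backgrounds, e.g. '\x1b[42m'
# (green background) and '\x1b[0m' (reset). Removing them leaves the
# plain text that expect_exact('NLP>') can actually match.
ANSI_COLOR_RE = re.compile(r'\x1b\[[0-9;]*m')

def strip_ansi(text: str) -> str:
    """Remove ANSI color escape sequences from terminal output."""
    return ANSI_COLOR_RE.sub('', text)

colored_prompt = '\x1b[42mNLP>\x1b[0m '
print(repr(strip_ansi(colored_prompt)))
```

Alternatively, forcing a dumb terminal (e.g. setting `TERM=dumb` in the spawned child's environment) often stops programs from emitting color codes in the first place.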