Open GoogleCodeExporter opened 9 years ago
This issue was closed by revision c7655591f003.
Original comment by Achi...@gmail.com
on 6 Sep 2013 at 4:56
Accidentally closed with check-in.
Original comment by Achi...@gmail.com
on 6 Sep 2013 at 4:58
Original comment by Achi...@gmail.com
on 6 Sep 2013 at 5:02
Original comment by Achi...@gmail.com
on 25 Sep 2013 at 7:38
The multithreaded version of tokenizer.perl works in batches: it reads several
lines into an array and then processes them in separate threads.
Our tokenize_str function in wrap_tokenizer.pm sends a single line and then
waits for the response (see lines 376-377). Because the multithreaded version
needs more lines before it returns the first result, the two sides block each
other. Fixing this requires a major change to the logic in wrap_tokenizer.pm.
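
The blocking described above can be reproduced in a small model. This is only a Python sketch, not the actual Perl code: batch_worker is a hypothetical stand-in for a batching tokenizer, BATCH_SIZE is an assumed value, and in-process queues replace the pipes to the tokenizer process. The first client sends one line and waits, so it never gets an answer; the second sends all lines plus an end marker before reading.

```python
import queue
import threading

BATCH_SIZE = 3  # assumed value; the real batch size is whatever tokenizer.perl uses


def batch_worker(inbox, outbox, batch_size=BATCH_SIZE):
    """Stand-in for a batching tokenizer: emits nothing until a batch is full."""
    batch = []
    while True:
        line = inbox.get()
        if line is None:  # end-of-input marker: flush the partial batch and stop
            for item in batch:
                outbox.put(item.split())
            return
        batch.append(line)
        if len(batch) == batch_size:
            for item in batch:
                outbox.put(item.split())
            batch = []


def start_worker():
    """Start a worker thread and return its input and output queues."""
    inbox, outbox = queue.Queue(), queue.Queue()
    threading.Thread(target=batch_worker, args=(inbox, outbox), daemon=True).start()
    return inbox, outbox


def tokenize_one_at_a_time(line, timeout=0.2):
    """Send one line, then wait for its result (the pattern that blocks)."""
    inbox, outbox = start_worker()
    inbox.put(line)
    try:
        return outbox.get(timeout=timeout)
    except queue.Empty:
        return None  # without the timeout this call would wait forever
    finally:
        inbox.put(None)  # let the worker thread exit


def tokenize_batch(lines, timeout=1.0):
    """Send every line and the end marker first, then collect the results."""
    inbox, outbox = start_worker()
    for line in lines:
        inbox.put(line)
    inbox.put(None)
    return [outbox.get(timeout=timeout) for _ in lines]
```

In this model tokenize_one_at_a_time gives up after the timeout because the worker is still waiting for the rest of its batch, while tokenize_batch gets one result per input line. Whether the real tokenizer can be flushed with an end marker in the same way is an assumption that would need checking.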
Original comment by toma...@moravia.com
on 14 Mar 2014 at 2:18
If major modifications to the logic of wrap_tokenizer.pm (and
wrap_detokenizer.pm) are being considered, please keep possible Windows
support in mind.
Possible ideas:
* pre-process and post-process around the tokenizer (e.g. markup placeholders)
* fix up markup after the tokenizer mangles it (similar to fix_markup_ws.pm)
* fork the tokenizer/detokenizer, or add an option to the main branch (see the
new ignore option in the tokenizer)
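
The placeholder idea in the first bullet could look roughly like the following. This is a Python sketch under stated assumptions, not code from wrap_tokenizer.pm: the names mask_markup and unmask_markup and the MARKUPnPH token format are hypothetical, and the regular expression only covers simple angle-bracket tags.

```python
import re

# Assumption: markup means simple <...> tags with no nested angle brackets.
TAG_RE = re.compile(r"<[^>]+>")
PLACEHOLDER_RE = re.compile(r"MARKUP(\d+)PH")


def mask_markup(text):
    """Replace each tag with an alphanumeric placeholder; return text and tags."""
    tags = []

    def _swap(match):
        tags.append(match.group(0))
        # Surround with spaces so the tokenizer sees the placeholder as one word.
        return f" MARKUP{len(tags) - 1}PH "

    return TAG_RE.sub(_swap, text), tags


def unmask_markup(text, tags):
    """Put the original tags back in place of their placeholders."""
    return PLACEHOLDER_RE.sub(lambda m: tags[int(m.group(1))], text)
```

The masked text would go through the tokenizer, and unmask_markup would run on its output. The sketch does not restore the original whitespace around tags, so a fix-up step like the one in the second bullet would still be needed.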
Original comment by Achi...@gmail.com
on 16 Mar 2014 at 4:24
Original issue reported on code.google.com by
xhu...@gmail.com
on 29 Apr 2013 at 3:41