Closed Gldkslfmsd closed 3 years ago
In the code, the word "blue" appears in several places. It's BLEU -- BiLingual Evaluation Understudy, not blue.
sacrebleu should be used rather than the nltk implementation of BLEU, because sacrebleu is now the standard (https://arxiv.org/abs/1804.08771).
Hi, now I know what mwerSegmenter is. However, does SLTev work when run in parallel in the same directory? mwerSegmenter writes its result as __segments in the current dir, that's horrible.
Sharing a binary of mwerSegmenter is horrible, but I saw that it's probably standard in the SLT field :( Even Jan Niehues does it, I googled it somewhere.
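One way around the shared __segments file would be to run each external call in its own temporary working directory. The sketch below is only an illustration, not SLTev's actual code: `run_isolated` is a hypothetical helper, and the only thing it assumes about the tool is that it writes a file named __segments into its current directory.

```python
import subprocess
import tempfile
from pathlib import Path


def run_isolated(cmd, output_name="__segments"):
    """Run `cmd` in a fresh temporary directory and return the text it wrote.

    Hypothetical helper: it assumes the command writes `output_name` into its
    current working directory. Every call gets a private directory, so
    parallel runs cannot overwrite each other's output.
    """
    with tempfile.TemporaryDirectory(prefix="sltev_") as workdir:
        # The child process starts in workdir, so any path inside cmd
        # (the binary, input files) should be absolute.
        subprocess.run(cmd, cwd=workdir, check=True)
        # Read the result before the temporary directory is deleted.
        return (Path(workdir) / output_name).read_text(encoding="utf-8")
```

A call would then look roughly like `run_isolated(["/abs/path/mwerSegmenter", ...])`, with the remaining arguments taken from mwerSegmenter's own usage message; they are left out here on purpose.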
Hi, Mohammad, Ebrahim, this issue is now pretty old and it also contains a number of separate issues, e.g. the comment from Feb 20, 2020 calls for a separate issue: "SLTev must be fully parallelizable", in other words, it must be possible to run it many times at once, in parallel.
Please go over this issue and create separate issues for everything Dominik mentions. The separate issues should refer to this one, simply by mentioning #1. Once you have broken up this issue, please close this one and keep the others open until resolved.
Most of them have been resolved and two new issues have been created for the rest (#13 and #14).
Hi, Ebrahim,
I have some feedback on this tool:
I need more explanations to be able to use it. What is mWER? Mean Word Error Rate? Zero_T? automatic segmentation mWER? Why is this measure important?
The README claims that the --align argument is optional, but I get an error when I omit it:
too many decimal numbers
what is mwerSegmenter?
if you want to publish this repo, you shouldn't include a precompiled binary of mwerSegmenter, but rather an installation script which does everything in one single step:
./install.sh
giza installation and usage is too complicated. There should be one ./install.sh script. Using giza as indicated in README, separately in so many steps, is too complicated. Why can't Python run it on its own, if it's using it?
rather than requiring nltk in the user's pip, it's better to require it in a virtual environment. What if someone needs a weird outdated version of nltk for a different project?
I hope it works for Python3
Best,
Dominik
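To illustrate the virtual environment point, here is a rough sketch using only Python's standard `venv` module. The helper name `create_env` and the directory name in the usage note are made up for this example; nothing here is taken from the SLTev repository.

```python
import os
import subprocess
import venv
from pathlib import Path


def create_env(env_dir, packages=(), with_pip=True):
    """Create a virtual environment and optionally install packages into it.

    Hypothetical helper: the packages end up inside env_dir only, so the
    user's own site-packages (and whatever nltk version lives there) stay
    untouched.
    """
    venv.create(env_dir, with_pip=with_pip)
    # The interpreter sits in Scripts\ on Windows and in bin/ elsewhere.
    if os.name == "nt":
        python = Path(env_dir) / "Scripts" / "python.exe"
    else:
        python = Path(env_dir) / "bin" / "python"
    if packages:
        # Call pip through the environment's own interpreter.
        subprocess.run([str(python), "-m", "pip", "install", *packages], check=True)
    return python
```

Something like `create_env(".sltev-env", packages=["nltk"])` would then give the tool its own nltk, independent of other projects.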