moses-smt / mosesdecoder

Moses, the machine translation system
http://www.statmt.org/moses
GNU Lesser General Public License v2.1
1.58k stars 775 forks source link

simulate-pe.cc out of sync #82

Closed mjdenkowski closed 9 years ago

mjdenkowski commented 9 years ago

It looks like moses-cmd/simulate-pe.cc needs to be updated with all the changes from the moses-cmd and IOWrapper merges. Right now moses doesn't compile with --with-mm.

hieuhoang commented 9 years ago

Uli - can we sort this out together next week? What's the issue with rolling the post-editing changes into the normal command line decoder?

There's gonna be more changes - it a MosesCore deliverable. I don't want anyone working in the moses lib to be lumbered with having to fix a hacky downstream program

ugermann commented 9 years ago

If Moses had a stable API and didn't require a whole graveyard full of global variables whose initialization makes Masonic rituals look trivial, we wouldn't need those "hacky downstream programs", would we?

Why don't you start with filling in the blanks:

class MosesDecoder { public: boost::program_options::program_options&

};

On Sat, Nov 15, 2014 at 7:43 PM, Hieu Hoang notifications@github.com wrote:

Uli - can we sort this out together next week? What's the issue with rolling the post-editing changes into the normal command line decoder?

There's gonna be more changes - it a MosesCore deliverable. I don't want anyone working in the moses lib to be lumbered with having to fix a hacky downstream program

— Reply to this email directly or view it on GitHub https://github.com/moses-smt/mosesdecoder/issues/82#issuecomment-63185741 .

Ulrich Germann Research Associate School of Informatics University of Edinburgh

ugermann commented 9 years ago

If Moses had a stable API and didn't require a whole graveyard full of global variables whose initialization makes Masonic rituals look trivial, we wouldn't need those "hacky downstream programs", would we?

Why don't you start with filling in the blanks:

class InputStructure; class OutputStructure; class MosesDecoder { public: class InputStructure; class OutputStructure;

boost::program_options::program_options& add_options(boost::program_options::program_options& options);

void init();

bool translate(InputStructure const& input, OutputStructure const& output);

};

On Sat, Nov 15, 2014 at 8:50 PM, Ulrich Germann ulrich.germann@gmail.com wrote:

If Moses had a stable API and didn't require a whole graveyard full of global variables whose initialization makes Masonic rituals look trivial, we wouldn't need those "hacky downstream programs", would we?

Why don't you start with filling in the blanks:

class MosesDecoder { public: boost::program_options::program_options&

};

On Sat, Nov 15, 2014 at 7:43 PM, Hieu Hoang notifications@github.com wrote:

Uli - can we sort this out together next week? What's the issue with rolling the post-editing changes into the normal command line decoder?

There's gonna be more changes - it a MosesCore deliverable. I don't want anyone working in the moses lib to be lumbered with having to fix a hacky downstream program

— Reply to this email directly or view it on GitHub https://github.com/moses-smt/mosesdecoder/issues/82#issuecomment-63185741 .

Ulrich Germann Research Associate School of Informatics University of Edinburgh

Ulrich Germann Research Associate School of Informatics University of Edinburgh

hieuhoang commented 9 years ago

I hear you. The Manager roughly correspond to this structure: MosesDecoder=Manager InputStructure=InputType OutputStructure=IOWrapper program_options=StaticData/Parameter It might be that we can add an extra layer so that it corresponds more to what you envisage. More suggestions welcomed. The API will change even more but thats OK. Can't make an omelete without breaking a few eggs.

The command line executables for phrase-based and SCFG have been merged. They only differed in minor but diverging ways and it was my fault they split in the 1st place. Since we've managed to reduce hacky downstream programs by 1, I don't want another 1 popping up. So add your stuff into moses-cmd if you can or we have to figure another way of supporting your exec

Which global variables are you talking about? As far as I know, there's only 3 global variables:

  1. StaticData
  2. Parameter
  3. FactorCollection
ugermann commented 9 years ago

Now supposed I want to write a C++ program that uses two Moses instances in parallel, for example because I want to pipeline two decoders: Language1->Language2 | Language2->Language3. How do I do that with the Moses "Library"?

Why is it up to Moses/Parameter to interpret command line arguments? How can I specify command line arguments for my own program without writing "hacky programs" that filter them out of the argument list before thy pass them on to Moses. How can I get these options included in the Moses help message?

Here's how I think it should be done.

  1. Have LoadParameter relinquish control of User interaction. In particular, don't do anything on -h or when the list of arguments it gets is empty. Instead, provide a list of <option name, help message> pairs (or a list of such pairs, if you want to allow option groups).
  2. LoadParameter should return options that it can't handle, instead of choking on them. That way, I can do the following in my program:

int unknown_args_count=0; char\ unkown_args = NULL; try { Moses::LoadParameter(argc, argv, unknown_args_count, unknown args); } except (...) { }

handle_my_own_args(unknown_args_count, unknown_args);

  1. Ideally, the library would also provide a list of read-made boost program options descriptions (this can be offered in addition to LoadParameter), so that I can do the following:

namespace po=boost::program_options;

string config1,config2;

po::options_description my_options("Program options");

my_options.add_options() ("version","print program version") ("decoder1",po::value(&config1),"config file for decoder 1") ("decoder2",po::value(&config2),"config file for decoder 2");

vectorpo::options_description const& moses_options = Moses::GetProgramOptions();

BOOST_FOREACH(po::options_description const& o, moses_options) my_options.add(o);

// standard options handling from here

Moses::Decoder D1(config1); Moses::Decoder D2(config2); ...

On Sun, Nov 16, 2014 at 8:37 AM, Hieu Hoang notifications@github.com wrote:

I hear you. The Manager roughly correspond to how you want it structured: MosesDecoder=Manager InputStructure=InputType OutputStructure=IOWrapper program_options=StaticData/Parameter It might be that we should add an extra layer so that it corresponds more to what you envisage. More suggestions welcomed.

The command line executables for phrase-based and SCFG have been merged. They only differed in minor but diverging ways and it was my fault they split in the 1st place. However, since we've managed to reduce hacky downstream programs by 1, I don't want another 1 popping up. So get with the program.

Which global variables are you talking about? As far as I know, there's only 3 global variables:

  1. StaticData
  2. Parameter
  3. FactorCollection

— Reply to this email directly or view it on GitHub https://github.com/moses-smt/mosesdecoder/issues/82#issuecomment-63210318 .

Ulrich Germann Research Associate School of Informatics University of Edinburgh

hieuhoang commented 9 years ago

Allowing multiple decoder instances was how Moses was originally structured. However, that requires that all parameters required by every class be passed to it, ie. no global StaticData variable. Even though it's cleaner, it's a pain to program with since lots of classes need little bits of info so you end up passing your parameter class everywhere. The decoder was specifically changed to avoid this so we have a global StaticData.

It may be we have to revisit this topic again if we want to push Moses into more settings where it's used as a library inside a long end-to-end system. We shouldn't decide this on a whim since it does put some additional burden on future developers.

In the meantime, why does there have to be a separate executeable for the post-editing?

ugermann commented 9 years ago

Just in case anyone is still following this thread: issue fixed and checked in.

Hieu, this is your chance to integrate simulated post-editing into the moses executable with mininal effort.

On Fri, Nov 14, 2014 at 8:25 PM, Michael Denkowski <notifications@github.com

wrote:

It looks like moses-cmd/simulate-pe.cc needs to be updated with all the changes from the moses-cmd and IOWrapper merges. Right now moses doesn't compile with --with-mm.

— Reply to this email directly or view it on GitHub https://github.com/moses-smt/mosesdecoder/issues/82.

Ulrich Germann Research Associate School of Informatics University of Edinburgh

hieuhoang commented 9 years ago

cheers. I'll do it when i'm back in the burgh. I can't see what it's supposed to do. I'll ask u then

ugermann commented 9 years ago

"lots of classes need little bits of info so you end up passing your parameter class everywhere"

That's exactly what I call a whole graveyard full of global variables. These classes should be instantiated with a pointer to who they belong to, and then just ask mommy or daddy. And certain bits of info should be local not global. For example, the maximum phrase length should be determined by the phrase table, not externally. Also instead of keeping track of the number of feature values a feature function provides OUTSIDE the feature function is not as good as just ASKING the feature function how many features it provides.

As for why simulate-pe is a separate executable, it's called R&D. You have an idea, you are dealing with a bizarre process of actually getting a decoder to run (including classes being defined in Main.cpp), you don't want to break things, so you take clone things and take them to the lab and try them out there first. If they work, you integrate them into the running system. Other people may use branch and merge for that, but every time I set up a development branch and tell people to use that instead of the master branch if they want to play with it, I am told by the moses code guru that everything should be put into the master branch at once. You can't have both, a single, stable master branch and ongoing shared code development.

On Sun, Nov 16, 2014 at 4:42 PM, Hieu Hoang notifications@github.com wrote:

Allowing multiple decoder instances was how Moses was originally structured. However, that requires that all parameters required by every class be passed to it, ie. no global StaticData variable. Even though it's cleaner, it's a pain to program with since lots of classes need little bits of info so you end up passing your parameter class everywhere. The decoder was specifically changed to avoid this.

It may be we have to revisit this topic again if we want to push Moses into more settings where it's used as a library inside a long end-to-end system. We shouldn't decide this on a whim since it does put some additional burden on future developers.

In the meantime, why does there have to be a separate executeable for the post-editing?

— Reply to this email directly or view it on GitHub https://github.com/moses-smt/mosesdecoder/issues/82#issuecomment-63226544 .

Ulrich Germann Research Associate School of Informatics University of Edinburgh

ugermann commented 9 years ago

Repeat until done:

  1. Read input sentence.
  2. Translate input sentence.
  3. Output translation.
  4. Update phrase table with reference translation and word alignment.

On Sun, Nov 16, 2014 at 4:46 PM, Hieu Hoang notifications@github.com wrote:

cheers. I'll do it when i'm back in the burgh. I can't see what it's supposed to do. I'll ask u then

— Reply to this email directly or view it on GitHub https://github.com/moses-smt/mosesdecoder/issues/82#issuecomment-63226734 .

Ulrich Germann Research Associate School of Informatics University of Edinburgh

hieuhoang commented 9 years ago

simulated-pe has been rolled into the normal moses-cmd. Give it a whirl.

If no-one complains, i'll delete simulated-pe in a few days

hieuhoang commented 9 years ago

simulated-pe is gonna be deleted in the next push. if there are no problems using the spe args in moses-cmd, i'm gonna close this thread

hieuhoang commented 9 years ago

closing. simulated-pe n'exist plus