LanguageMachines / frog

Frog is an integration of memory-based natural language processing (NLP) modules developed for Dutch. All NLP modules are based on Timbl, the Tilburg memory-based learning software package.
https://languagemachines.github.io/frog
GNU General Public License v3.0

MWU output when no Parser is selected #89

Closed: kosloot closed this issue 4 years ago

kosloot commented 4 years ago

@Irishx suggested:

The default setting of Frog is to place an MWU (multi-word unit) on one line as one token, even if you use the skip option to exclude parsing, while this is actually only needed for the parser. Perhaps we should change this default setting?

@Irishx do you intend to disable MWU detection too, when the Parser is skipped?

This is easy to implement, but might change outcomes of older scripts. I am not sure if that would be a problem.

proycon commented 4 years ago

In principle I agree that having MWU disabled by default is probably best (unless the parser is needed). But I'm also not sure whether this is smart for backward-compatibility now.

Irishx commented 4 years ago

To what extent is it still backward compatible on other points? Or is that already broken, so that it does not matter?

proycon commented 4 years ago

It should be pretty backward compatible on most other points. The columned output format changed a bit over time (extra columns), and the API probably changed somewhat as well (but hardly anyone except python-frog uses that directly). I guess the command line interface is still quite backward compatible.

kosloot commented 4 years ago

Well, using frog with --skip=p and not --skip=mp wasn't a good idea in the first place, but we advised people to use it like that if they didn't need the parser. I don't think it is a big issue to make --skip=p imply --skip=m too, but as we do not have a --enable=... option or such, it would be difficult for the odd user to get the old behaviour then.

So in hindsight it is maybe better not to include this in the upcoming release and to add an --enable option in the next release. That directly raises questions like 'if I use --enable=m, does that mean skip all the rest?', which of course is not feasible for a lot of choices.

ARGL

kosloot commented 4 years ago

Or we just accept that there is some user somewhere who gets an anticipointment.

proycon commented 4 years ago

"I don't think it is a big issue to make --skip=p imply --skip=m too"

"Or we just accept that there is some user somewhere who gets an anticipointment."

Yeah, perhaps we should just do it.

kosloot commented 4 years ago

Ok, I implemented this, but also added an --OLDMWU option to get the previous behaviour.
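
For readers following the thread, the resulting option semantics can be sketched roughly as below. This is only an illustration of the behaviour described above, not Frog's actual code (Frog itself is written in C++); the function name effective_skip and the old_mwu parameter are hypothetical, and the only facts taken from the discussion are that --skip=p now implies --skip=m and that --OLDMWU restores the previous behaviour.

```python
from typing import Set


def effective_skip(skip: str, old_mwu: bool = False) -> Set[str]:
    """Return the module letters that end up skipped.

    skip    -- the letters given to --skip, e.g. "p" or "mp"
    old_mwu -- hypothetical stand-in for the --OLDMWU option
    """
    skipped = set(skip)
    # Skipping the parser ('p') now also skips MWU detection ('m'),
    # unless the previous behaviour is requested explicitly.
    if "p" in skipped and not old_mwu:
        skipped.add("m")
    return skipped
```

Under these assumptions, effective_skip("p") yields both 'p' and 'm', while effective_skip("p", old_mwu=True) yields only 'p', which matches what older scripts relied on.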