Open bittlingmayer opened 2 years ago
@zouharvi @randyscansani
Technically, PBMT and NMT are both SMT, as opposed to RBMT. When people talk about SMT, they usually (citation needed) mean PBMT. That said, I took a class called SMT that covered PBMT and NMT equally.
I think it's still interesting (especially because it gave us word alignment, etc.), but I don't know whether it is still relevant in the industry. Maybe something with phrase-level translation memories?
The Google Translate API for Trados 2019 allows you to choose the system you want to use: PBMT or NMT.
As @zouharvi mentions, this can still be found in the translation industry to this day. For our audience of translators, I think it may be worth describing PBMT further. Maybe not in an article of its own (perhaps a redirect is possible?), but inside the SMT article.
I agree that PBMT can still be relevant. It's probably not used extensively in the industry, but neither are some other topics in the roadmap (like unsupervised MT). In general, anyone who wants to learn about MT and has already read articles or manuals on the topic has probably stumbled upon the PBMT acronym, so in my opinion it might be worth adding a few words about it.
Google Cloud just uses "PBMT" to mean the last generation of SMT, e.g. https://stackoverflow.com/a/63787121/4486860.
Seeing this discussion, maybe this is a call for us to have a disambiguating paragraph on RBMT, SMT, PBMT & NMT.
@bittlingmayer @cefoo @randyscansani Was this issue resolved?
@zouharvi We have a very short list describing the different approaches to SMT in the SMT article. Perhaps it's a good basis for expanding on the topic.
My concern is that people searching for PBMT will not find our resources, even though we cover it.
I think we can help a bit by modifying the Approaches text, from
By the 2010s, the top systems, like Google Translate, used statistical machine translation. By the 2020s, the top systems used neural machine translation.
to
By the 2010s, the top systems, like Google Translate, used statistical machine translation (SMT), specifically phrase-based machine translation (PBMT).
By the 2020s, the top systems used neural machine translation (NMT).
If we really want to be correct about NMT being a type of SMT, we can do something like this, which doesn't contradict that:
By the 2010s, the top systems, like Google Translate, used statistical machine translation (SMT).
By the 2020s, the top systems moved from phrase-based machine translation (PBMT) to neural machine translation (NMT).
These are often used synonymously, because PBMT was the last major wave of SMT.
Does it deserve an article? Or a section in SMT? Or to be treated as a synonym?