browsermt / bergamot-translator

Cross-platform C++ library focused on optimized machine translation on consumer-grade devices.
http://browser.mt
Mozilla Public License 2.0

Terminology #448

Open XapaJIaMnu opened 1 year ago

XapaJIaMnu commented 1 year ago

Basic Python interface + installation instructions

jelmervdl commented 1 year ago

Wouldn't it be much simpler to make the terminology map a member of the ResponseOptions class, and map it to a dict? Then you can change it per request, and you need less plumbing and no file IO for passing on that info.

I don't know if ResponseOptions is an immutable struct at this point. I don't think we have an async API on the Python side, so there's no risk of Python changing the terminology during a translation. But if you do want to implement that, it's something to take into account.
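
A rough sketch of what that could look like, assuming a plain map member on ResponseOptions that pybind11 converts from a Python dict; the member name terminology and the neighbouring fields are illustrative, not the actual API:

// Hypothetical sketch, not the current bergamot-translator API.
#include <string>
#include <unordered_map>

struct ResponseOptions {
  bool qualityScores{false};
  bool alignment{false};
  bool HTML{false};
  // Proposed: per-request terminology, source term -> target term,
  // replacing the file-based plumbing.
  std::unordered_map<std::string, std::string> terminology;
};

// In the bindings, exposing the field is enough for pybind11 to accept a dict
// (with <pybind11/stl.h> included):
//   py::class_<ResponseOptions>(m, "ResponseOptions")
//       .def(py::init<>())
//       .def_readwrite("terminology", &ResponseOptions::terminology);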

jelmervdl commented 1 year ago

Still on my wish list, but not a priority for a first release:

  1. Move terminology config to the ResponseOptions struct. This allows it to be configured per request, and matches other options like alignment and HTML. I still want to do the same with the Cache as well some day. I briefly looked into how hard this would be to do, but the way the C++ demo app parses command line options doesn't help.
  2. Do terminology replacement after HTML parsing. Right now it happens before, which could invalidate the HTML and will mess things up if HTML-like tags are used for terminology. I looked at moving it to after HTML parsing, but that needs a bit more effort. The HTML parsing tracks byte offsets of the text it outputs, and expects the text that is translated (and for which we get alignment info after translation) to be exactly the same as the text it produced. If any terminology replacements are added after HTML parsing but before translation, this doesn't hold. There is a data structure, AnnotatedText, that might be extended to do this task of tracking updated mappings. Using something like this was always the plan for HTML, giving a more extensible, more onion-like structure, but we never got around to refactoring it.
  3. Add pre/post-processing that makes sure the terminology formatting markers cannot occur in the input string, by replacing them with a placeholder or by doing something like the HTML transfer to copy them to the output (see the sketch after this list).
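
For item 3, a minimal self-contained sketch of the placeholder idea, assuming terminology markers of the __source__ ... __done__ form discussed further down; the function names and the private-use placeholders are assumptions, not the actual implementation:

// Hypothetical pre/post-processing sketch: hide terminology markers that already
// occur in the user's input so they cannot be mistaken for real annotations.
#include <string>

// Replace every occurrence of `needle` in `text` with `replacement`.
static void replaceAll(std::string &text, const std::string &needle,
                       const std::string &replacement) {
  for (size_t pos = text.find(needle); pos != std::string::npos;
       pos = text.find(needle, pos + replacement.size())) {
    text.replace(pos, needle.size(), replacement);
  }
}

// Before annotating: escape marker strings that appear in the raw input.
std::string escapeMarkers(std::string text) {
  replaceAll(text, "__source__", "\uE000");  // private-use placeholders (assumption)
  replaceAll(text, "__target__", "\uE001");
  replaceAll(text, "__done__", "\uE002");
  return text;
}

// After translation: restore the original marker strings in the output.
std::string restoreMarkers(std::string text) {
  replaceAll(text, "\uE000", "__source__");
  replaceAll(text, "\uE001", "__target__");
  replaceAll(text, "\uE002", "__done__");
  return text;
}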
XapaJIaMnu commented 1 year ago

Before we push to main I'd like to fix some of the CI (building the wheel is broken at the moment, onnxgemm requires some updating, etc.). @kpu how does the interface look to you?

jelmervdl commented 1 year ago

Todo: instead of a --terminology-form CLI option, it should go in the model YAML.

graemenail commented 1 year ago

> Todo: instead of a --terminology-form CLI option, it should go in the model YAML.

Should we nest application-specific arguments, e.g.

bergamot-translator-app:
  terminology: terms.tsv
  terminology-form: "__source__ {src} __target {trg} __done__"

to avoid any future name conflicts with upstream Marian?

jelmervdl commented 1 year ago

I don't know if Marian's YAML parser can handle that, but if we can do it, it would be good future-proofing.

Ideally there would be no collisions, because this would be the Marian implementation, but maybe that's slightly unrealistic.

graemenail commented 1 year ago

I would guess the parser library supports it, but from a quick look it's not wired up on the Marian side. Ideally, though, Marian wouldn't need to parse it, just ignore it, since terminology (for now?) is only implemented in bergamot-translator.

But on ignoring unknown args:

./marian -c tmp.yml 
[2023-07-10 17:59:32] Error: There are option(s) in a config file that are not expected: bergamot-translator-app

(and similarly directly on the CLI)

XapaJIaMnu commented 1 year ago

Now GPU support is working: reliably from the C++ test app, not so much from Python.

The Python interface randomly deadlocks, and I have no idea why. I didn't have this problem before, so I blame the ensemble change, but I haven't bisected or anything. FML.

The deadlocks happen even when CUDA is not compiled in. I guess I should be attaching GDB...

XapaJIaMnu commented 1 year ago

This is where we are at:

0x00007f2a3dd192ed in syscall () from /usr/lib/libc.so.6
(gdb) up
#1  0x00007f2a1a8de76a in std::__atomic_futex_unsigned_base::_M_futex_wait_until (this=<optimized out>, __addr=0x55d7e9141070, __val=2147483648, __has_timeout=<optimized out>, __s=..., __ns=...)
    at /usr/src/debug/gcc/gcc/libstdc++-v3/src/c++11/futex.cc:122
Downloading source file /usr/src/debug/gcc/gcc/libstdc++-v3/src/c++11/futex.cc
122     /usr/src/debug/gcc/gcc/libstdc++-v3/src/c++11/futex.cc: Directory not empty.                                                                                                                                    
(gdb) up
#2  0x00007f2a384e4b71 in std::__atomic_futex_unsigned<2147483648u>::_M_load_and_test_until (this=0x55d7e9141070, __assumed=0, __operand=1, __equal=true, __mo=std::memory_order_acquire, __has_timeout=false, 
    __s=std::chrono::duration = { 0s }, __ns=std::chrono::duration = { 0ns }) at /usr/lib/gcc/x86_64-pc-linux-gnu/12.3.0/include/c++/bits/atomic_futex.h:109
109               bool __ret = _M_futex_wait_until((unsigned*)(void*)&_M_data,
(gdb) up
#3  0x00007f2a384cb2ba in std::__atomic_futex_unsigned<2147483648u>::_M_load_and_test (this=0x55d7e9141070, __assumed=0, __operand=1, __equal=true, __mo=std::memory_order_acquire)
    at /usr/lib/gcc/x86_64-pc-linux-gnu/12.3.0/include/c++/bits/atomic_futex.h:158
158           return _M_load_and_test_until(__assumed, __operand, __equal, __mo,
(gdb) up
#4  0x00007f2a384a1ef8 in std::__atomic_futex_unsigned<2147483648u>::_M_load_when_equal (__mo=std::memory_order_acquire, __val=1, this=0x55d7e9141070)
    at /usr/lib/gcc/x86_64-pc-linux-gnu/12.3.0/include/c++/bits/atomic_futex.h:212
212           _M_load_and_test(__i, __val, true, __mo);
(gdb) up
#5  std::__future_base::_State_baseV2::wait (this=0x55d7e9141060) at /usr/lib/gcc/x86_64-pc-linux-gnu/12.3.0/include/c++/future:351
351             _M_status._M_load_when_equal(_Status::__ready, memory_order_acquire);
(gdb) up
#6  0x00007f2a384b4204 in std::__basic_future<marian::bergamot::Response>::wait (this=0x55d7e8790800) at /usr/lib/gcc/x86_64-pc-linux-gnu/12.3.0/include/c++/future:714
714             _M_state->wait();
(gdb) up
#7  0x00007f2a384a31b6 in ServicePyAdapter::translate (this=0x55d7e877a1e0, model=std::shared_ptr<marian::bergamot::TranslationModel> (use count 5, weak count 0) = {...}, 
    inputs=std::vector of length 10, capacity 10 = {...}, options=...) at /home/dheart/uni_stuff/postdoc/bergamot-translator/bindings/python/bergamot.cpp:82
82            futures[i].wait();

The code in question is here: https://github.com/browsermt/bergamot-translator/blob/1e80e799fd91573f6dfa99524ff7f2c065294a58/bindings/python/bergamot.cpp#L116

What's happening? For some reason a future just takes forever while no computation is happening...
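
For context, a small self-contained illustration of the failure mode the backtrace shows (this is not the bergamot-translator code itself): a std::future blocks forever in wait() when its promise is never fulfilled, which is what futures[i].wait() in the bindings looks like when a translation callback never fires.

// Minimal illustration of a blocking wait on a future that never becomes ready.
#include <chrono>
#include <future>
#include <iostream>
#include <thread>
#include <vector>

int main() {
  std::vector<std::promise<int>> promises(2);
  std::vector<std::future<int>> futures;
  for (auto &p : promises) futures.push_back(p.get_future());

  // Worker fulfils only the first promise; the second result is "lost",
  // e.g. because a request never leaves the batching queue.
  std::thread worker([&promises] { promises[0].set_value(42); });

  futures[0].wait();  // returns promptly once the worker sets the value
  std::cout << "first result: " << futures[0].get() << "\n";

  // futures[1].wait() would block forever here, like frame #7
  // (ServicePyAdapter::translate) in the backtrace. Use wait_for to observe it:
  if (futures[1].wait_for(std::chrono::seconds(1)) == std::future_status::timeout)
    std::cout << "second future never becomes ready -> blocking wait deadlocks\n";

  worker.join();
  return 0;
}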

XapaJIaMnu commented 1 year ago

Reverting (git revert 4b0da8d434e5a688139255873afd177f647ef777) fixes the issue. @graemenail not sure why. Blaming: 4b0da8d434e5a688139255873afd177f647ef777. On the bright side, Python GPU works on a 3090 with a modern CUDA.

XapaJIaMnu commented 1 year ago

How to compile the python interface with CUDA support:

export COMPILE_CUDA=ON                  # build the native extension with CUDA enabled
export CMAKE_BUILD_PARALLEL_LEVEL=16    # number of parallel CMake build jobs
pip install .

Invoke as:

bergamot-translator -c model.npz.best-bleu.npz.decoder.brg.yml -i dataset -n 0 -g 0 -l info