jorgtied opened this issue 2 years ago
The cryptographic hash not matching is likely #79, which is fixed but requires a redeploy of the online models. I will try to fix this today.
TranslateLocally can recognise models in directories in the same folder as the executable. Try manually extracting the model to see if it works.
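For a manual test, extracting the tarball into a folder next to the executable can be sketched like this (the target path is an assumption; translateLocally's actual model directory varies per OS):

```python
import tarfile
from pathlib import Path

def extract_model(archive: str, target: str) -> Path:
    """Extract a downloaded model tarball into a directory that
    translateLocally scans for models. The target location is an
    assumption; adjust it to wherever your executable lives."""
    target_dir = Path(target)
    target_dir.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive, "r:gz") as tar:
        tar.extractall(target_dir)
    return target_dir
```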
I just tested it, our models work when downloaded manually, the download issue is just due to #79. Would you like to send us a model that doesn't work?
I haven't tried this with your build yet, but on my fork the following model produces nonsense on Windows while working well with the macOS build (English to Finnish): https://object.pouta.csc.fi/OPUS-MT-models/app/models/eng-fin.tatoeba.tiny.tar.gz. The same happens with this one (Swedish to Finnish): https://object.pouta.csc.fi/OPUS-MT-models/app/models/swe-fin.transformer-tiny11.tar.gz
Another symptom, probably related: on macOS (and on Windows as well), when I switch from one of TranslateLocally's own models to the English->Finnish model, it crashes. The same happens when I switch from the English->Finnish model to another model.
Error when switching to the model:
* thread #34, stop reason = signal SIGABRT
* frame #0: 0x00007ff81e5cb112 libsystem_kernel.dylib`__pthread_kill + 10
frame #1: 0x00007ff81e601214 libsystem_pthread.dylib`pthread_kill + 263
frame #2: 0x00007ff81e54dd10 libsystem_c.dylib`abort + 123
frame #3: 0x00007ff81e5be0b2 libc++abi.dylib`abort_message + 241
frame #4: 0x00007ff81e5bd4fd libc++abi.dylib`std::__terminate(void (*)()) + 46
frame #5: 0x00007ff81e5bfd55 libc++abi.dylib`__cxxabiv1::failed_throw(__cxxabiv1::__cxa_exception*) + 27
frame #6: 0x00007ff81e5bfd1c libc++abi.dylib`__cxa_throw + 116
frame #7: 0x000000010038ce88 translateLocally`marian::cpu::integer::fetchAlphaFromModelNodeOp::forwardOps()::'lambda'()::operator()() const + 1768
frame #8: 0x00000001003c7a1f translateLocally`marian::rnn::GRUFastNodeOp::runBackward(std::__1::vector<std::__1::function<void ()>, std::__1::allocator<std::__1::function<void ()> > > const&) + 47
frame #9: 0x00000001003c5217 translateLocally`marian::Node::forward() + 71
frame #10: 0x00000001002d15d9 translateLocally`marian::ExpressionGraph::forward(std::__1::list<IntrusivePtr<marian::Chainable<IntrusivePtr<marian::TensorBase> > >, std::__1::allocator<IntrusivePtr<marian::Chainable<IntrusivePtr<marian::TensorBase> > > > >&, bool) + 201
frame #11: 0x00000001002d1495 translateLocally`marian::ExpressionGraph::forwardNext() + 997
frame #12: 0x00000001004ee32f translateLocally`marian::BeamSearch::search(std::__1::shared_ptr<marian::ExpressionGraph>, std::__1::shared_ptr<marian::data::CorpusBatch>) + 10959
frame #13: 0x0000000100103444 translateLocally`marian::bergamot::TranslationModel::translateBatch(unsigned long, marian::bergamot::Batch&) + 308
frame #14: 0x0000000100133323 translateLocally`void* std::__1::__thread_proxy<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, marian::bergamot::AsyncService::AsyncService(marian::bergamot::AsyncService::Config const&)::$_2> >(void*) + 115
frame #15: 0x00007ff81e6014f4 libsystem_pthread.dylib`_pthread_start + 125
frame #16: 0x00007ff81e5fd00f libsystem_pthread.dylib`thread_start + 15
Error when switching away from the model:
* thread #37, stop reason = signal SIGABRT
* frame #0: 0x00007ff81e5cb112 libsystem_kernel.dylib`__pthread_kill + 10
frame #1: 0x00007ff81e601214 libsystem_pthread.dylib`pthread_kill + 263
frame #2: 0x00007ff81e54dd10 libsystem_c.dylib`abort + 123
frame #3: 0x00007ff81e5be0b2 libc++abi.dylib`abort_message + 241
frame #4: 0x00007ff81e5bd4fd libc++abi.dylib`std::__terminate(void (*)()) + 46
frame #5: 0x00007ff81e5bfd55 libc++abi.dylib`__cxxabiv1::failed_throw(__cxxabiv1::__cxa_exception*) + 27
frame #6: 0x00007ff81e5bfd1c libc++abi.dylib`__cxa_throw + 116
frame #7: 0x0000000100394e05 translateLocally`marian::cpu::integer::PrepareBiasForBNodeOp::PrepareBiasForBNodeOp(IntrusivePtr<marian::Chainable<IntrusivePtr<marian::TensorBase> > >, IntrusivePtr<marian::Chainable<IntrusivePtr<marian::TensorBase> > >, IntrusivePtr<marian::Chainable<IntrusivePtr<marian::TensorBase> > >, IntrusivePtr<marian::Chainable<IntrusivePtr<marian::TensorBase> > >) + 2293
frame #8: 0x000000010038bd8b translateLocally`IntrusivePtr<marian::Chainable<IntrusivePtr<marian::TensorBase> > > marian::Expression<marian::cpu::integer::PrepareBiasForBNodeOp, IntrusivePtr<marian::Chainable<IntrusivePtr<marian::TensorBase> > >&, IntrusivePtr<marian::Chainable<IntrusivePtr<marian::TensorBase> > >&, IntrusivePtr<marian::Chainable<IntrusivePtr<marian::TensorBase> > >&, IntrusivePtr<marian::Chainable<IntrusivePtr<marian::TensorBase> > >&>(IntrusivePtr<marian::Chainable<IntrusivePtr<marian::TensorBase> > >&, IntrusivePtr<marian::Chainable<IntrusivePtr<marian::TensorBase> > >&, IntrusivePtr<marian::Chainable<IntrusivePtr<marian::TensorBase> > >&, IntrusivePtr<marian::Chainable<IntrusivePtr<marian::TensorBase> > >&) + 139
frame #9: 0x000000010030d8d8 translateLocally`IntrusivePtr<marian::Chainable<IntrusivePtr<marian::TensorBase> > > marian::cpu::integer::affine<(marian::Type)257>(IntrusivePtr<marian::Chainable<IntrusivePtr<marian::TensorBase> > >, IntrusivePtr<marian::Chainable<IntrusivePtr<marian::TensorBase> > >, IntrusivePtr<marian::Chainable<IntrusivePtr<marian::TensorBase> > >, bool, bool, float, float, bool) + 1016
frame #10: 0x000000010030be0e translateLocally`marian::affine(IntrusivePtr<marian::Chainable<IntrusivePtr<marian::TensorBase> > >, IntrusivePtr<marian::Chainable<IntrusivePtr<marian::TensorBase> > >, IntrusivePtr<marian::Chainable<IntrusivePtr<marian::TensorBase> > >, bool, bool, float) + 1038
frame #11: 0x0000000100403132 translateLocally`marian::mlp::Output::applyAsLogits(IntrusivePtr<marian::Chainable<IntrusivePtr<marian::TensorBase> > >)::$_0::operator()(IntrusivePtr<marian::Chainable<IntrusivePtr<marian::TensorBase> > >, IntrusivePtr<marian::Chainable<IntrusivePtr<marian::TensorBase> > >, IntrusivePtr<marian::Chainable<IntrusivePtr<marian::TensorBase> > >, bool, bool) const + 82
frame #12: 0x0000000100402995 translateLocally`marian::mlp::Output::applyAsLogits(IntrusivePtr<marian::Chainable<IntrusivePtr<marian::TensorBase> > >)::$_1::operator()(IntrusivePtr<marian::Chainable<IntrusivePtr<marian::TensorBase> > >, IntrusivePtr<marian::Chainable<IntrusivePtr<marian::TensorBase> > >, IntrusivePtr<marian::Chainable<IntrusivePtr<marian::TensorBase> > >, bool, bool) const + 261
frame #13: 0x00000001003fe625 translateLocally`marian::mlp::Output::applyAsLogits(IntrusivePtr<marian::Chainable<IntrusivePtr<marian::TensorBase> > >) + 14597
frame #14: 0x00000001004bc340 translateLocally`marian::DecoderTransformer::step(std::__1::shared_ptr<marian::DecoderState>) + 9424
frame #15: 0x00000001004b7c7d translateLocally`marian::DecoderTransformer::step(std::__1::shared_ptr<marian::ExpressionGraph>, std::__1::shared_ptr<marian::DecoderState>) + 109
frame #16: 0x00000001004e0e80 translateLocally`marian::EncoderDecoder::step(std::__1::shared_ptr<marian::ExpressionGraph>, std::__1::shared_ptr<marian::DecoderState>, std::__1::vector<unsigned int, std::__1::allocator<unsigned int> > const&, std::__1::vector<marian::Word, std::__1::allocator<marian::Word> > const&, std::__1::vector<unsigned int, std::__1::allocator<unsigned int> > const&, int) + 480
frame #17: 0x00000001004cee1d translateLocally`marian::models::Stepwise::step(std::__1::shared_ptr<marian::ExpressionGraph>, std::__1::shared_ptr<marian::DecoderState>, std::__1::vector<unsigned int, std::__1::allocator<unsigned int> > const&, std::__1::vector<marian::Word, std::__1::allocator<marian::Word> > const&, std::__1::vector<unsigned int, std::__1::allocator<unsigned int> > const&, int) + 109
frame #18: 0x0000000100507135 translateLocally`marian::ScorerWrapper::step(std::__1::shared_ptr<marian::ExpressionGraph>, std::__1::shared_ptr<marian::ScorerState>, std::__1::vector<unsigned int, std::__1::allocator<unsigned int> > const&, std::__1::vector<marian::Word, std::__1::allocator<marian::Word> > const&, std::__1::vector<unsigned int, std::__1::allocator<unsigned int> > const&, int) + 245
frame #19: 0x00000001004edbe8 translateLocally`marian::BeamSearch::search(std::__1::shared_ptr<marian::ExpressionGraph>, std::__1::shared_ptr<marian::data::CorpusBatch>) + 9096
frame #20: 0x0000000100103444 translateLocally`marian::bergamot::TranslationModel::translateBatch(unsigned long, marian::bergamot::Batch&) + 308
frame #21: 0x0000000100133323 translateLocally`void* std::__1::__thread_proxy<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, marian::bergamot::AsyncService::AsyncService(marian::bergamot::AsyncService::Config const&)::$_2> >(void*) + 115
frame #22: 0x00007ff81e6014f4 libsystem_pthread.dylib`_pthread_start + 125
frame #23: 0x00007ff81e5fd00f libsystem_pthread.dylib`thread_start + 15
The model also doesn't really work for me (Google Translate says the output means "my value is anything here"?).
I was just about to open a new bug report about this. Same behaviour on Linux: the model crashes. I suspect this is due to the dynamic model-swap framework, which doesn't allow models with different configurations to be swapped. An upstream bug, @jerinphilip? Even when deleting all other models so that English-Finnish is loaded first (and therefore there are no swapping issues), I still can't get it to translate anything correctly, even on Linux.
Looking at the model, it has two separate vocabularies. I guess the new bergamot-translator might not support separate vocabularies. @jerinphilip ?
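A quick way to check whether a model package uses separate vocabularies is to look at the `vocabs:` list in its decoder config. A rough sketch (the config file name and exact layout inside the package are assumptions):

```python
def uses_separate_vocabs(config_text: str) -> bool:
    """Return True if a marian-style decoder config lists two distinct
    files under 'vocabs:' (i.e. separate source/target vocabularies)."""
    vocabs, in_vocabs = [], False
    for line in config_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("vocabs:"):
            in_vocabs = True          # the file entries follow as "- ..." items
        elif in_vocabs and stripped.startswith("- "):
            vocabs.append(stripped[2:])
        elif in_vocabs:
            in_vocabs = False         # another top-level key ends the list
    return len(set(vocabs)) > 1
```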
As for the problem of downloading models: the Qt version used for the Windows build has a bug where it doesn't handle redirects well (e.g. http -> https). We worked around it by making sure there are no redirects when downloading the model. This will eventually be fixed properly once the Qt 6 Windows build starts working (vcpkg issues...).
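Separately, when debugging the "cryptographic hash does not match" report, the check itself is just a digest comparison against the published checksum. A minimal sketch (assuming SHA-256; the algorithm translateLocally actually uses is an assumption here):

```python
import hashlib

def file_sha256(path: str, chunk_size: int = 1 << 20) -> str:
    """Stream a downloaded archive through SHA-256 so large models
    don't have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            digest.update(chunk)
    return digest.hexdigest()
```

Comparing this against the value in the model repository distinguishes a truncated or redirected download from a stale checksum on the server side.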
Strange, I can translate perfectly well with that English-Finnish model on my Mac laptop using my own build from this fork: https://github.com/Helsinki-NLP/OPUS-MT-app/ Importing a manual download into translateLocally crashes, though...
> Even when deleting all other models so that the English-Finnish is loaded first (and therefore there's no swapping issues), I still can't get it to translate anything right, even on Linux.
Clean runs appear to work for me for Swedish-Finnish. I can't read Finnish, but the output looks weird, and potentially incorrect, for English-Finnish. Not so sure about the other one.
From this notebook.
I'll try to look into the other situation (crashes on swapping models) alongside multiple model improvements, which I expect to take on soon.
> when I switch from one of TranslateLocally's own models to the English->Finnish model, it crashes. Same when I switch from the English->Finnish model to another model.
Is translateLocally using the model swap provided by upstream bergamot-translator now? I was under the impression translateLocally restarts the Service as a whole. Does it work for swaps between browsermt-provided models?
ARVO.ARVO!ARVO!ARVO!
ARVO KEHEN KEHY JOKA KEH KEHY .ARVO JOKA .ARVO JOKA .ARVO JOKA .ARVO JOKA JOKA .ARVO
The symptoms above appear consistent with the wrong vocabulary being keyed in. Could accessing something out of bounds due to the missing vocabulary be causing the segfault?
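To illustrate that failure mode: if the wrong vocabulary is wired in, decoded token ids can exceed the embedding table's size, and in C++ that access is unchecked, so it can abort or corrupt memory rather than raise a clean error. A toy sanity check (pure Python, file layout is an assumption):

```python
def vocab_size_from_file(path: str) -> int:
    """Count entries in a plain-text vocab file (one token per line).
    SentencePiece .spm models store the size in the model proto instead."""
    with open(path, encoding="utf-8") as f:
        return sum(1 for _ in f)

def out_of_range_ids(token_ids, vocab_size):
    """Token ids that would index past the vocabulary table."""
    return [t for t in token_ids if t < 0 or t >= vocab_size]
```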
@jerinphilip It seems I was wrong; this is not a bergamot-translator issue, the model doesn't work even with browsermt/marian-dev.
@jorgtied could it be that this model was trained with a different marian fork?
@jorgtied it seems that changing `gemm-precision: int8shift` to `gemm-precision: int8` makes the model work. I will have to investigate, but this is a browsermt issue. Since your model doesn't have precomputed alphas, `int8` should give you about the same performance as `int8shift`.
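To try this on a local model package, one can patch the setting in the model's decoder config file. A sketch (the config file name, e.g. `config.intgemm8.yml`, varies per package and is an assumption here):

```python
from pathlib import Path

def set_gemm_precision(config_path: str, precision: str = "int8") -> None:
    """Rewrite the gemm-precision entry in a marian-style YAML config,
    leaving every other line untouched."""
    path = Path(config_path)
    lines = []
    for line in path.read_text().splitlines():
        if line.strip().startswith("gemm-precision:"):
            lines.append(f"gemm-precision: {precision}")
        else:
            lines.append(line)
    path.write_text("\n".join(lines) + "\n")
```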
OK - good to know. The models were trained with the original marian-dev but quantised with the browsermt branch of marian-dev. Is that a problem?
Training with original marian-dev and quantizing with browsermt should be fine and is the recommended path.
I'm working on it, hope to roll out a hotfix tonight.
Precomputing alphas only makes sense in connection with finetuned quantisation, right? Or is it also useful for non-tuned models?
> Precomputing alphas only makes sense in connection with finetuned quantisation, right? Or is it also useful for non-tuned models?
These are orthogonal.
Precomputing alphas just records the typical range of activation values and always uses that scaling factor, instead of setting the scaling factor on the fly. It will always damage quality somewhat, in return for not having to compute the scaling at runtime.
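A toy illustration of the trade-off (pure Python, nothing to do with marian's actual intgemm kernels): on-the-fly scaling recomputes the scale from each activation tensor, while a precomputed alpha reuses a fixed scale recorded offline, so activations outside the recorded range saturate.

```python
def quantize_int8(values, scale):
    """Map floats to int8 with saturation at +/-127."""
    return [max(-127, min(127, round(v / scale * 127))) for v in values]

def on_the_fly_scale(values):
    """Exact max-abs of the current tensor, computed at runtime."""
    return max(abs(v) for v in values)

activations = [0.5, -1.5, 3.0]
alpha = 2.0  # hypothetical precomputed 'typical' max; 3.0 falls outside it

q_dynamic = quantize_int8(activations, on_the_fly_scale(activations))
q_alpha = quantize_int8(activations, alpha)  # the out-of-range value saturates
```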
Finetuning mucks with the floats in an attempt to limit the damage from quantization, though, as you have observed, sometimes it makes things worse. The finetuning happens with an emulated quantization (i.e. it uses floats, just with limited values) that I think always determines the scaling factor on the fly.
OK - understood. So computing alphas will already help keep the model from crashing even if I don't do finetuning now, right? I'll keep that in mind... An additional question: does it make a difference whether alphas are extracted with or without lexical shortlists?
@jorgtied could you share the original model.npz and the training configuration? I tested our models and they work with those config options, whereas yours doesn't, and I don't know why ;/
I think everything you need should be in here (besides the data - would you need that as well?): https://object.pouta.csc.fi/OPUS-MT-models/swe-fin/opusTCv20210807+nopar+ft95-2022-01-19.zip (that's for the Swedish-Finnish model)
I can't access that link:
<Error>
<Code>NoSuchKey</Code>
<BucketName>OPUS-MT-models</BucketName>
<RequestId>tx00000000000000024e7c6-0061eea381-26ead993-allas-prod-kaj</RequestId>
<HostId>26ead993-allas-prod-kaj-cpouta-production</HostId>
</Error>
Sorry, this is the correct link: https://object.pouta.csc.fi/Tatoeba-MT-models/swe-fin/opusTCv20210807+nopar+ft95-2022-01-19.zip
Sorry, could you give me the English-Finnish one? It's easier for us to work with.
Also, the Swedish-Finnish model doesn't exhibit the issue (I tried several parameter combinations, such as `gemm-precision: int8/int8shift/int8shiftAll`, and it works with all of them). Only the English-Finnish one is broken with `int8shift`.
I messed up my experiments and could not really find the original model anymore. Instead I created a fresh version and maybe we can simply verify that this one works? It would be this quantised version: https://object.pouta.csc.fi/OPUS-MT-models/app/models/eng-fin.transformer-tiny11.tar.gz and the original one is in https://object.pouta.csc.fi/Tatoeba-MT-models/eng-fin/opusTCv20210807+nopar+ft95-sepvoc_transformer-tiny11-align_2022-01-25.zip
I can no longer reproduce the crash with your newer model. It works with `int8`, `int8shift` and `int8shiftAll` (I didn't test alphas, as the model has no alphas as far as I understand).
The issue with the previous model was with the two input and output embedding matrices. Disabling the shifted codepath for them made the model work, but I have no idea what was wrong with them. At any rate, the new model doesn't exhibit this issue. Does it work for you in Windows?
No, it doesn't have alphas, but it does have two different vocabularies for source and target. Maybe there was something wrong with my spm files for the old model, but the weird thing is that I could use it without problems on my local Mac laptop. I don't have Windows available, but I can ask my daughter later today to test on her machine again. Yesterday we tried the Swedish-Finnish model on her laptop with my fork of translateLocally and it still produced garbage (even when moving to int8). I'll try again and report later...
I just managed to get to a windows machine and tested your models. They both produce crap on Windows and I'm puzzled... Will update you later.
I created another model with a joint vocabulary. Does this have the same problems on Windows? https://object.pouta.csc.fi/OPUS-MT-models/app/models/eng-fin.transformer-tiny11-jointvocab.tar.gz
Still broken on Windows... Can you share the training data and training script so we can try to reproduce it? Something is very weird.
Do we need to go all the way back to the training data??
Here is all the training and validation data for the last model with joint vocabularies: https://object.pouta.csc.fi/Tatoeba-MT-models/engfin-jointvocab.tar The training command is also part of the tar file (in the logfile engfintrain-and-eval.out.960712). Something that might be non-standard in my setup is that I segment the data outside of Marian and use regular vocab files to invoke training (instead of using the built-in sentencepiece library). But that shouldn't really cause this strange behaviour, should it?
We did a quick test on a Windows 10 machine and got a problem with downloading models: it says that the cryptographic hash does not match. We tried various language pairs. I also tried my own fork and our Windows build; with that, downloading the models works, but the translations are nonsense, just random output that has nothing to do with the input. Has anyone seen similar behaviour? I also tested the macOS builds; they work without problems and the translations look reasonable.