Closed. XapaJIaMnu closed this 2 years ago.
Can you test it first?
Please treat my review as a no-objection. Here's the diff which gets this working locally on my WASM page.
diff --git a/src/tensors/cpu/wasm_intgemm_fallback.cpp b/src/tensors/cpu/wasm_intgemm_fallback.cpp
index ec7be368..ffc159f5 100644
--- a/src/tensors/cpu/wasm_intgemm_fallback.cpp
+++ b/src/tensors/cpu/wasm_intgemm_fallback.cpp
@@ -44,7 +44,12 @@ extern "C" void int8PrepareBFromTransposedFallback(const float* input_B_transpos
Index width,
Index cols_B,
int8_t* output) {
- ABORT("Unimplemented int8PrepareBFromTransposedFallback");
+ intgemm::Int8::PrepareBTransposed(input_B_transposed,
+ output,
+ scale,
+ width,
+ cols_B);
+
}
The argument order here appears to be (width, colsB) rather than (colsB, width), which is what I'm currently using at https://github.com/jerinphilip/arm-playground/pull/12, but that's on me to fix, I suppose...
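For anyone else puzzling over the ordering, here is a minimal standalone sketch of calling intgemm directly. The signature I'm assuming is PrepareBTransposed(input, output, quant_mult, inner, B_untransposed_cols); the sizes, fill value, quantization multiplier, and include paths are illustrative only and may need adjusting to however intgemm is vendored in your tree.

#include <cstdint>
#include "intgemm/intgemm.h"
#include "intgemm/aligned.h"

int main() {
  using intgemm::Index;
  const Index width = 64;   // inner (shared) dimension; 64 should satisfy the 8-bit size requirement
  const Index cols_B = 8;   // columns of the untransposed B; a multiple of 8
  // B is stored transposed: cols_B rows of width floats each.
  intgemm::AlignedVector<float> B_transposed(width * cols_B);
  intgemm::AlignedVector<int8_t> prepared(width * cols_B);
  for (float& v : B_transposed) v = 0.25f;
  const float quant_mult = 127.0f / 2.0f;  // illustrative: 127 / max absolute value of B

  // The inner dimension (width) comes before the untransposed column count (cols_B),
  // i.e. (width, colsB), matching the diff above.
  intgemm::Int8::PrepareBTransposed(B_transposed.begin(), prepared.begin(),
                                    quant_mult, width, cols_B);
  return 0;
}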
I think it should be working now.
@XapaJIaMnu I am testing it now. Btw, how can I make sure that it is the transposed code path that I am testing?
If you are using this branch, I've removed the hacks that call transpose and then prepareB when necessary. You can also check which function is being executed with a simple print statement if you doubt it.
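For example, a throwaway diagnostic like the following (purely illustrative, not part of the patch) at the top of int8PrepareBFromTransposedFallback in wasm_intgemm_fallback.cpp would make the code path obvious in the console; it only needs <cstdio>.

  // Temporary diagnostic: confirm the transposed code path is actually taken. Remove before merging.
  std::fprintf(stderr, "int8PrepareBFromTransposedFallback: width=%u cols_B=%u\n",
               (unsigned)width, (unsigned)cols_B);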
I see only the transposed code path being taken and only prepareBTransposed being called in the logs. The wasm test page works on my end.
Out of curiosity, I don't see prepareB being called at all now. Is it a dead path for most trained models? Do you know of any model that might invoke the prepareB path as well?
PrepareB is called during model loading time, not during translation.
But I don't see it being called at all. Model loading always invokes PrepareBQuantizedTransposed, and inference invokes only PrepareBTransposed (although just once) on my end.
I see. If you submit a .npz file and tell it to be converted to intgemm on the fly, PrepareB will be called, but I guess we don't do that since we only have binary formats.
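For the record, the three intgemm entry points mentioned in this thread differ roughly as follows; the parameter lists are how I read intgemm's header and should be treated as my understanding rather than anything authoritative.

  // Untransposed float B: quantize and rearrange (the .npz-converted-on-the-fly case).
  intgemm::Int8::PrepareB(B, prepared_B, quant_mult, width, cols_B);

  // Transposed float B: quantize and rearrange (the fallback this PR implements).
  intgemm::Int8::PrepareBTransposed(B_transposed, prepared_B, quant_mult, width, cols_B);

  // Transposed, already-quantized int8 B: rearrange only (binary model loading).
  intgemm::Int8::PrepareBQuantizedTransposed(B_quantized_transposed, prepared_B, width, cols_B);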
Apologies for the delay @jerinphilip, @abhi-agg. I have implemented the prepareBTransposed code path and tested it for the non-WASM code path. Could you test it for the WASM code path if you have it set up somewhere?
Thanks
Nick
Checklist