second-state / WasmEdge-WASINN-examples


Attention layer for transformer architectures? #19

Closed AnonymousAmalgrams closed 1 year ago

AnonymousAmalgrams commented 1 year ago

Hi, I was wondering if you could offer any suggestions on implementing transformer architectures. As far as I can tell, the wasi_nn proposal seems to have originally been intended for neural networks with a fairly fixed structure, whereas newer architectures like U-Nets have become widespread in image classification/segmentation, and transformers are now much more favored for text processing. I am currently trying to run a LLaMA transformer model through WasmEdge, and unfortunately it doesn't seem to be as simple as just loading the .bin model file, unless I have made an unrecognized but egregious error somewhere else. My thinking is that I may have to somehow split out the RNN stacks, run the attention layer separately for each, and then chain the results back in, but I would like to avoid this if possible: I have had no Rust programming experience prior to this project and don't really know the frameworks for doing so, and the model file itself is several GB with dozens of layers, so it could be quite difficult for me to deal with. If it isn't too much trouble, any advice on this would be greatly appreciated.
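For reference, here is a minimal sketch of how a llama-family model can be driven through wasi-nn from Rust when a GGML backend plugin is available in the runtime, following the pattern used by the later examples in this repo with the `wasmedge_wasi_nn` crate. The preload alias `default`, the model filename, the prompt, and the output buffer size are illustrative assumptions, not values from this issue; whether a particular `.bin`/`.gguf` file loads this way depends entirely on the backend installed.

```rust
// Minimal sketch (assumptions noted above): run a llama-family model via
// wasi-nn with the wasmedge_wasi_nn crate and a GGML backend.
// Assumes the model was preloaded by the runtime, e.g.:
//   wasmedge --dir .:. --nn-preload default:GGML:AUTO:llama-model.gguf app.wasm
use wasmedge_wasi_nn::{ExecutionTarget, GraphBuilder, GraphEncoding, TensorType};

fn main() {
    // Look up the preloaded graph by its alias instead of passing raw model bytes.
    let graph = GraphBuilder::new(GraphEncoding::Ggml, ExecutionTarget::AUTO)
        .build_from_cache("default")
        .expect("failed to load the preloaded model");

    let mut ctx = graph
        .init_execution_context()
        .expect("failed to create an execution context");

    // For the GGML backend, the input tensor is simply the UTF-8 prompt bytes.
    let prompt = "Once upon a time";
    ctx.set_input(0, TensorType::U8, &[1], prompt.as_bytes())
        .expect("failed to set the input tensor");

    ctx.compute().expect("inference failed");

    // The generated text comes back as bytes in output tensor 0.
    let mut out = vec![0u8; 4096];
    let n = ctx.get_output(0, &mut out).expect("failed to read the output");
    println!("{}", String::from_utf8_lossy(&out[..n]));
}
```

With this kind of setup the backend runs the entire transformer internally, so the guest code never needs to split out attention layers itself; it only passes prompt bytes in and reads generated bytes out.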

AnonymousAmalgrams commented 1 year ago

Never mind, the issue turned out to be something different.