Even after calling both "with torch.no_grad()" and "model.eval()", my model (fast-transformers) can produce different outputs for the same batch input. It is something like "y = model(x), then z = model(x)", and I find that y and z are not the same. If I want to get identical results in my inference code, what other fast-transformers API should I call prior to "y = model(x)"? Thanks.
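To make the problem concrete, here is a minimal sketch of the check I am running ("check_determinism" is just a helper name I made up, and the toy nn.Sequential at the bottom stands in for my actual fast-transformers encoder and batch input):

```python
import torch
import torch.nn as nn

def check_determinism(model, x):
    """Run the same input through the model twice in inference mode
    and report whether the two outputs match."""
    model.eval()  # switch dropout etc. to inference behavior
    with torch.no_grad():
        y = model(x)  # first forward pass
        z = model(x)  # second forward pass on the exact same input
    diff = (y - z).abs().max().item()
    print(f"max |y - z| = {diff}")
    return torch.allclose(y, z)

# Sanity check with a plain PyTorch module: with eval() active,
# dropout is disabled and the two passes match exactly here,
# whereas my fast-transformers model does not.
toy = nn.Sequential(nn.Linear(16, 16), nn.Dropout(0.5), nn.Linear(16, 4))
x = torch.randn(8, 16)
print(check_determinism(toy, x))  # prints True for this toy model
```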