Open BruceWang1996 opened 5 months ago
+1, same problem
Add the code below; it works well. Good luck.
1. Add the imports (os and torch are also needed by the snippet in step 2):
import os
import torch
import torch_npu
from torch_npu.contrib import transfer_to_npu
2. Set the JIT compile mode in the main function:
if __name__ == "__main__":
    use_jit_compile = os.getenv('JIT_COMPILE', 'False').lower() in ['true', '1']
    torch.npu.set_compile_mode(jit_compile=use_jit_compile)
    main(args)
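The env-var check in step 2 only enables JIT compilation when JIT_COMPILE is set to "true" or "1" (case-insensitive); any other value, or an unset variable, disables it. A minimal sketch of that parsing logic, with a hypothetical helper name for illustration:

```python
import os

def jit_compile_enabled(env=None):
    # Mirror the check from step 2: only 'true' or '1'
    # (case-insensitive) turns JIT compilation on.
    if env is None:
        env = os.environ
    return env.get('JIT_COMPILE', 'False').lower() in ['true', '1']

print(jit_compile_enabled({'JIT_COMPILE': 'True'}))  # True
print(jit_compile_enabled({'JIT_COMPILE': '1'}))     # True
print(jit_compile_enabled({'JIT_COMPILE': 'no'}))    # False
print(jit_compile_enabled({}))                       # False (defaults off)
```

Note that this means JIT compilation is off by default; you must export JIT_COMPILE=true (or 1) before launching to turn it on.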
Is this project only adapted to the Ascend NPU 910B chip? I am trying to run FastChat vicuna-7b-v1.5 on an Ascend NPU 910A chip, but the inference speed is extremely slow, almost 5 min per answer.