HuskyInSalt / CRAG

Corrective Retrieval Augmented Generation

CRAG_inference.py #19

Open zsyggg opened 2 months ago

zsyggg commented 2 months ago

Hello, author. When I execute run_crag_inference.sh, I get an error saying the memory of a single GPU is insufficient, but the original code does not seem to support multi-GPU execution, right?

[rank0]: torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 268.00 MiB. GPU

HuskyInSalt commented 2 months ago

Hi! `generator = LLM(model=args.generator_path, dtype="half")` uses a single GPU in our code; you can add the `tensor_parallel_size` parameter for multi-GPU inference.
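A minimal sketch of the suggested change, assuming two GPUs are available (the model path and GPU count here are placeholders, not from the repo; `tensor_parallel_size` is a real vLLM `LLM` constructor parameter that shards the model's weights across that many GPUs):

```python
from vllm import LLM

# Original single-GPU call from CRAG_inference.py:
# generator = LLM(model=args.generator_path, dtype="half")

# Multi-GPU variant: shard the generator across 2 GPUs.
# Set tensor_parallel_size to however many GPUs you want to use;
# it must evenly divide the model's number of attention heads.
generator = LLM(
    model="path/to/generator",  # placeholder for args.generator_path
    dtype="half",
    tensor_parallel_size=2,
)
```

You can also restrict which GPUs vLLM sees with `CUDA_VISIBLE_DEVICES=0,1` before launching run_crag_inference.sh. Note that tensor parallelism splits the weights across devices, so each GPU only needs roughly `1/tensor_parallel_size` of the model's memory, which is typically enough to resolve this kind of OOM.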