HuskyInSalt / CRAG

Corrective Retrieval Augmented Generation

CRAG_inference.py #19

Open zsyggg opened 3 months ago

zsyggg commented 3 months ago

Hello, author. When I execute run_crag_inference.sh, I get an error saying that a single GPU's memory is insufficient, but the original code doesn't seem to be set up for multi-GPU execution, right?

[rank0]: torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 268.00 MiB. GPU

HuskyInSalt commented 2 months ago

Hi! `generator = LLM(model=args.generator_path, dtype="half")` uses a single GPU in our code. You can add the `tensor_parallel_size` parameter for multi-GPU inference.
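A minimal sketch of the suggested change, assuming vLLM's `LLM` constructor and that two GPUs are available (the model path is a placeholder for `args.generator_path`):

```python
from vllm import LLM, SamplingParams

# tensor_parallel_size shards the model weights across that many GPUs,
# so a model that does not fit on one card can still be loaded.
generator = LLM(
    model="path/to/generator",  # placeholder for args.generator_path
    dtype="half",
    tensor_parallel_size=2,     # number of GPUs to shard across
)

outputs = generator.generate(
    ["Hello"],
    SamplingParams(max_tokens=16),
)
```

`tensor_parallel_size` must not exceed the number of visible GPUs; you can restrict which cards are used with `CUDA_VISIBLE_DEVICES` before launching the script.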