lfsszd / CS-Drafting

Cascade Speculative Drafting

Tensor parallelism #1

Open lethean1 opened 1 month ago

lethean1 commented 1 month ago

I want to use tensor parallelism with CS-Drafting, but I cannot find a config option that enables it. Can you give me an example?

lfsszd commented 1 month ago

CS-Drafting is built on the Hugging Face Transformers library, which handles inference parallelism. You can try specifying a device_map when loading the model, before passing it to the CountedCSDraftingDecoderModelKVCache class.
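
A minimal sketch of the device_map suggestion, under stated assumptions: the checkpoint name and the AutoModelForSeq2SeqLM class are illustrative choices, not taken from this thread, and load_sharded and modules_per_device are hypothetical helper names. device_map="auto" also requires the accelerate package. The constructor arguments of CountedCSDraftingDecoderModelKVCache are not shown in this thread, so that step is left as a comment.

```python
from collections import defaultdict


def load_sharded(model_name, device_map="auto"):
    """Load a checkpoint and let Transformers/Accelerate place it on the available devices."""
    # Imported lazily so this file can be imported without transformers installed.
    from transformers import AutoModelForSeq2SeqLM  # assumption: a seq2seq checkpoint

    return AutoModelForSeq2SeqLM.from_pretrained(model_name, device_map=device_map)


def modules_per_device(hf_device_map):
    """Group module names by the device they were assigned to."""
    grouped = defaultdict(list)
    for module_name, device in hf_device_map.items():
        grouped[device].append(module_name)
    return dict(grouped)


if __name__ == "__main__":
    # Assumption: an illustrative checkpoint name; substitute the model you actually use.
    model = load_sharded("google/flan-t5-small")
    # Models loaded with a device_map expose the resulting placement as hf_device_map.
    print(modules_per_device(model.hf_device_map))
    # Next step (not shown): pass `model` to CountedCSDraftingDecoderModelKVCache
    # the same way the repository's own examples construct it.
```

Printing the placement is a quick way to check which modules ended up on which GPU before running the decoder.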

lethean1 commented 1 month ago

It seems that specifying device_map only supports pipeline parallelism, not tensor parallelism?
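
To make the distinction concrete: as far as I understand, device_map assigns whole modules or layers to devices, whereas tensor parallelism splits an individual weight matrix across devices. The sketch below is a plain-Python toy of the column-wise split, with no GPUs involved. All function names are hypothetical, and each "shard" only stands in for the slice of a weight matrix that one device would hold.

```python
def matmul(x, w):
    """Multiply a vector x (length n) by a matrix w (n rows, m columns)."""
    return [sum(x[i] * w[i][j] for i in range(len(x))) for j in range(len(w[0]))]


def split_columns(w, parts):
    """Split the columns of w into `parts` equally sized shards."""
    cols = len(w[0])
    assert cols % parts == 0, "column count must be divisible by the number of shards"
    step = cols // parts
    return [[row[k * step:(k + 1) * step] for row in w] for k in range(parts)]


def column_parallel_matmul(x, w, parts):
    """Compute x @ w shard by shard, then concatenate the partial outputs."""
    out = []
    for shard in split_columns(w, parts):
        # In real tensor parallelism each shard lives on its own device and
        # these products run at the same time; here they run one after another.
        out.extend(matmul(x, shard))
    return out


x = [1, 2]
w = [[1, 2, 3, 4],
     [5, 6, 7, 8]]
print(column_parallel_matmul(x, w, parts=2) == matmul(x, w))  # True
```

With device_map, by contrast, each matrix stays whole on one device and the activations move from device to device as the layers run in sequence.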