johnml1135 closed this issue 9 months ago.
@isaac091 - can you put any findings here on why it may or may not work?
Sure. I haven't tried it on inference alone yet, but I will run those tests.
Results documented in https://github.com/sillsdev/silnlp/pull/308
"Depending on the model and the GPU, torch.compile() yields up to 30% speed-up during inference. To use torch.compile(), simply install any version of torch above 2.0."
Can we do this? What is the drawback?