amazon-science / chronos-forecasting

Chronos: Pretrained (Language) Models for Probabilistic Time Series Forecasting
https://arxiv.org/abs/2403.07815
Apache License 2.0

Inference speed worse on AMD CPU than on Intel CPU #83

Closed CrazyChildren closed 2 weeks ago

CrazyChildren commented 1 month ago

I tested Chronos with the same code on an Intel Core CPU (Mac Pro), a Linux server with an Intel CPU, and a Linux server with an AMD CPU. Inference time on the AMD CPU seems to be ~30x worse.

On the Intel CPU, a prediction costs approximately 0.7s with batch_num = 1, predict_len = 1, context_len = 70. On the AMD CPU, it takes about 30s.

I don't know whether this is specific to my setup, but I found a report saying that turning on AMP on AMD CPUs (autocasting to bfloat16) causes degraded performance: Bfloat16 CPU inference speed is too slow on AMD cpu
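
A quick way to test the bf16 hypothesis independently of Chronos is to time matmuls in both dtypes on the affected machine. This is a minimal sketch; it assumes the slowdown comes from bf16 kernels (which can be emulated slowly on CPUs without native bf16 support) rather than from the model itself:

import time
import torch

x = torch.randn(1024, 1024)
for dtype in (torch.float32, torch.bfloat16):
    xd = x.to(dtype)
    xd @ xd  # warm-up
    start = time.perf_counter()
    for _ in range(10):
        xd @ xd
    print(f"{dtype}: {time.perf_counter() - start:.3f}s for 10 matmuls")

If the bfloat16 loop is dramatically slower than the float32 one on the AMD box but not on the Intel one, that points at the dtype rather than at Chronos.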

I'm quite a newbie with torch, so if someone finds a solution, please post it here. Thanks!

abdulfatir commented 1 month ago

@CrazyChildren one quick check to verify whether this is indeed due to bf16 (which is the likely cause) is to load the model in fp32. Here's the relevant code:

import torch
from chronos import ChronosPipeline

pipeline = ChronosPipeline.from_pretrained(
    "amazon/chronos-t5-small",
    device_map="cpu",  # this issue concerns CPU inference; use "cuda" for GPU or "mps" for Apple Silicon
    torch_dtype=torch.float32,  # force fp32 instead of the default bf16
)
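
If fp32 restores Intel-like latency, bf16 is the culprit. For completeness, a minimal timing sketch matching the reported settings (the random context here is a stand-in for real data; pipeline.predict is the standard Chronos API):

import time

context = torch.randn(70)  # dummy 1-D series matching the reported context_len = 70
start = time.perf_counter()
forecast = pipeline.predict(context, prediction_length=1)  # predict_len = 1, single series
print(f"fp32 CPU inference took {time.perf_counter() - start:.2f}s")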