I am trying to use TurboTransformers for inference on a trained BERT model (Fastai with Hugging Face Transformers).
I followed the steps in the section 'How to customised your post-processing layers after BERT encoder' in the documentation and customised bert_for_sequence_classification_example.py.
It appears that the inference time with _Turbo is greater than with Fast AI!_
Here is a screenshot of the inference time for a simple sentiment-prediction task on the statement below:
'@AmericanAir @united contact me, or do something to alleviate this terrible, terrible service. But no, your 22 year old social media guru'
For comparison, here is the Fast AI timing:
Has anyone experienced something similar? I might be missing something that is causing this result.
Or would it only make sense to compare timings on a larger test set?
Let me confirm: are you using the CPU for inference, and is your turbo version 0.4.1?
Generally, the first inference after the runtime is launched is very slow; you need to warm up the engine with an initial dummy inference before timing.
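For what it's worth, here is a minimal timing sketch that discards warm-up runs before measuring. The `model` handle, the tokenizer usage, and the repeat counts are placeholders; it assumes a model callable built as in bert_for_sequence_classification_example.py (either the Turbo-converted model or the fastai/Hugging Face one):

```python
import time
import torch

# Placeholder: build `model` as in bert_for_sequence_classification_example.py
# (either the Turbo-converted encoder + classifier or the original model).

def timed_inference(model, input_ids, warmup=1, repeats=100):
    """Return average per-call latency, measured only after warm-up runs."""
    with torch.no_grad():
        # Warm-up: the first call(s) pay one-time setup costs and are discarded.
        for _ in range(warmup):
            model(input_ids)
        start = time.perf_counter()
        for _ in range(repeats):
            model(input_ids)
        elapsed = time.perf_counter() - start
    return elapsed / repeats

# Hypothetical usage:
# from transformers import BertTokenizer
# tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
# input_ids = tokenizer.encode("some tweet text", return_tensors="pt")
# print(f"avg latency: {timed_inference(model, input_ids) * 1000:.2f} ms")
```

Timing both backends with the same harness (same warm-up, same number of repeats) should give a fairer comparison than a single one-off call.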