On my local machine, the models take about 3 GB of RAM. On Inferentia I'm running an inf1.xlarge with 7.5 GB of RAM available, so I don't think it's a RAM issue.

I currently have 4 models, and as suggested in https://github.com/aws/aws-neuron-sdk/issues/441, I've put everything in a single handler.
The `initialize` method looks like this:

When launching the `.mar` locally (with JIT files), everything runs fine. But when launching it on Inferentia, I get:

I've built my models with
Am I doing something wrong here? Thanks
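For reference, here is roughly the single-handler pattern I'm following, as a minimal sketch. The model file names are made up, and `loader` stands in for `torch.jit.load` (which is what the real TorchServe handler calls on each compiled artifact):

```python
class MultiModelHandler:
    """Sketch of a TorchServe-style handler hosting several models in one
    handler, as suggested in aws/aws-neuron-sdk#441.

    `loader` is a stand-in for torch.jit.load; the file names below are
    hypothetical placeholders for the real compiled artifacts.
    """

    MODEL_FILES = ["model_a.pt", "model_b.pt", "model_c.pt", "model_d.pt"]

    def __init__(self, loader):
        self.loader = loader      # a real handler would pass torch.jit.load
        self.models = {}
        self.initialized = False

    def initialize(self, model_dir):
        # TorchServe calls initialize() once at startup: load every
        # compiled artifact and keep it resident for inference requests.
        for name in self.MODEL_FILES:
            self.models[name] = self.loader(f"{model_dir}/{name}")
        self.initialized = True


# Usage with a stub loader (real code would pass torch.jit.load):
handler = MultiModelHandler(loader=lambda path: f"loaded:{path}")
handler.initialize("/opt/ml/model")
print(len(handler.models))  # → 4
```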