Open samir-souza opened 8 months ago
@JingyaHuang any idea why?
@michaelbenayoun I added disable_validation to the CLI because some users had no access to inf2.8xlarge (they only had inf2.xlarge and hit OOM) and wanted to compile on a CPU-only instance. The modeling APIs, however, are designed to run inference (although they allow export via the class), so I assumed whoever uses them has access to inf2. That's why it was not added to the class.
I think it should be the only argument supported in the CLI but not in the modeling API (apart from atol, since we don't do validation in the modeling API). Is there any particular use case that needs these args in the modeling API? @samir-souza
@JingyaHuang customers using Inferentia1 split the deployment into two steps: 1/ compilation; 2/ execution. They compile their models on CPU (C5 instances), and there is no need to validate at this step (if they try, validation fails and breaks the solution). They do that in a SageMaker job before deploying the model to an inf1 instance. That's why it is important to have disable_validation, and eventually other features, in the API. For now, they launch an optimum-cli process from Python because of this limitation, but this is not ideal.
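For reference, the workaround described above (calling optimum-cli from Python) could look roughly like the sketch below. The helper names `build_export_command` and `export_on_cpu`, the model id, and the shape values are placeholders for illustration. The `--model`, `--batch_size` and `--sequence_length` flags are assumed from the export CLI as I remember it; only `--disable-validation` is confirmed in this thread.

```python
import shutil
import subprocess
from typing import List, Optional


def build_export_command(
    model_id: str,
    output_dir: str,
    batch_size: int = 1,
    sequence_length: int = 128,
    disable_validation: bool = True,
) -> List[str]:
    """Build the argv list for an `optimum-cli export neuron` call (hypothetical helper)."""
    cmd = [
        "optimum-cli", "export", "neuron",
        "--model", model_id,
        # Static input shapes; flag names are assumed, check `optimum-cli export neuron --help`.
        "--batch_size", str(batch_size),
        "--sequence_length", str(sequence_length),
    ]
    if disable_validation:
        # Skip the post-compilation validation, which needs a Neuron device.
        cmd.append("--disable-validation")
    cmd.append(output_dir)
    return cmd


def export_on_cpu(model_id: str, output_dir: str) -> Optional[int]:
    """Run the export if optimum-cli is on PATH; return None when it is not installed."""
    if shutil.which("optimum-cli") is None:
        return None
    result = subprocess.run(build_export_command(model_id, output_dir), check=True)
    return result.returncode


if __name__ == "__main__":
    # Placeholder model id and output directory.
    print(" ".join(build_export_command("distilbert-base-uncased", "compiled_model/")))
```

Spawning a separate process like this works, but the compiled artifacts have to be picked up from the output directory afterwards, which is part of why a native `disable_validation` option in the modeling API would be preferable.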
I see, thanks for the explanation @samir-souza. So far in the modeling API, we assume that compiled models need to be loaded once compilation is completed. Functions like save_pretrained won't work unless _from_pretrained and __init__ are called. I will check how I can support disable_validation in the modeling class; a refactoring might be needed (I am focusing on supporting other tasks and need to find bandwidth for that).
Also @philschmid, you are more familiar with the SageMaker workflow: do you use the modeling API for export on non-Inferentia instances?
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread. Thank you!
Is it still up-to-date? Can we close this?
When compiling models with optimum-cli, many input parameters are supported that the Python wrappers do not accept. For instance, optimum-cli takes parameters like --disable-validation.
However, when using the Python wrapper, this parameter (and others) is not supported: disable_validation is ignored and the model is loaded anyway.
Could you fix that and also double-check whether all params are supported by the wrapper, please?
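One way such a check could be automated is to compare the CLI option names against the parameters a wrapper function declares explicitly. This is only a sketch: `fake_from_pretrained` is a hypothetical stand-in, not the real wrapper, and the option list is illustrative. It assumes the wrapper swallows unknown keywords through `**kwargs`, which would explain why disable_validation is silently ignored.

```python
import inspect
from typing import Callable, Iterable, List


def unsupported_params(func: Callable, names: Iterable[str]) -> List[str]:
    """Return the names that `func` does not declare as explicit parameters."""
    params = inspect.signature(func).parameters
    # *args / **kwargs accept anything, so they do not count as real support.
    explicit = {
        name
        for name, p in params.items()
        if p.kind not in (p.VAR_POSITIONAL, p.VAR_KEYWORD)
    }
    return [n for n in names if n not in explicit]


# Hypothetical stand-in for a wrapper that forwards unknown keywords via **kwargs.
def fake_from_pretrained(model_id, export=False, batch_size=1, sequence_length=128, **kwargs):
    return model_id


# Illustrative subset of CLI options, written as Python keyword names.
cli_options = ["batch_size", "sequence_length", "disable_validation", "atol"]

print(unsupported_params(fake_from_pretrained, cli_options))
# → ['disable_validation', 'atol']
```

A signature check like this only flags parameters that are not declared; a declared parameter could still be ignored inside the function, so it complements rather than replaces a manual review.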