Closed by ari-vedant-jain 2 weeks ago
You need to specify at least a top_k or top_p when sampling.
If the error happens before that, try increasing the deployment timeout and volume_size (although I think tiny-llama would fit).
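To illustrate the sampling point, here is a minimal sketch of a request payload that sets both top_k and top_p. The helper name `build_payload` is hypothetical, and the payload layout (`inputs` plus a `parameters` dict with `do_sample` and `max_new_tokens`) is an assumption about the container's request schema, since the thread does not show it. Adjust the keys to whatever the deployed image actually expects.

```python
import json


def build_payload(prompt, do_sample=True, top_k=None, top_p=None, max_new_tokens=64):
    """Build a request body for a text-generation endpoint (hypothetical helper).

    Raises ValueError when sampling is enabled but neither top_k nor top_p is set,
    mirroring the requirement described above.
    """
    if do_sample and top_k is None and top_p is None:
        raise ValueError("sampling requires at least one of top_k or top_p")

    # Assumed schema: generation options live under "parameters".
    parameters = {"do_sample": do_sample, "max_new_tokens": max_new_tokens}
    if top_k is not None:
        parameters["top_k"] = top_k
    if top_p is not None:
        parameters["top_p"] = top_p
    return {"inputs": prompt, "parameters": parameters}


payload = build_payload("What is deep learning?", top_k=50, top_p=0.9)
print(json.dumps(payload, indent=2))
```

If the endpoint is deployed with the SageMaker Python SDK, a dict like this would typically be passed to the predictor's `predict` call with a JSON serializer, but that wiring depends on the notebook and is not shown here.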
@ari-vedant-jain did you try my suggestion?
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread. Thank you!
System Info
Who can help?
No response
Information
Tasks
Reproduction (minimal, reproducible, runnable)
Inference-TinyLlama-1.1B.ipynb.txt
Expected behavior
Running the following cell fails during deployment (error log attached): log-events-viewer-result (2).csv
from sagemaker.model import Model  # import required by this cell

model = Model(image_uri=image_uri, model_data=code_artifact, role=role, sagemaker_session=sess)
model._is_compiled_model = True  # private SDK attribute: treat the model as already compiled
model.deploy(initial_instance_count=1, instance_type=instance_type, container_startup_health_check_timeout=500, volume_size=200, endpoint_name=endpoint_name)