Open · samir-souza opened this issue 1 year ago
I created a workshop/tutorial for Amazon SageMaker + HF Optimum Neuron, and in the notebooks I extract some metadata from Optimum Neuron to build tables of models compatible with training and inference. However, reaching that information requires accessing a lot of internal constants and moving parts, which makes the process complex and fragile (if the API changes, this code can break). You can find the workshop here: https://github.com/aws-samples/ml-specialized-hardware
It has three notebooks: data preparation, fine-tuning, and deployment. In the fine-tuning notebook I run code that digs into these internals to collect the metadata (optimum-neuron version: 0.0.8).
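A minimal sketch of the kind of introspection involved (not the exact workshop code; it assumes optimum-neuron registers its export configs in `TasksManager` under a `"neuron"` exporter key and that importing `optimum.exporters.neuron` triggers that registration, and it reads a private attribute, which is exactly the fragility described above):

```python
# Sketch only: list model types/tasks registered for the "neuron" exporter.
# _SUPPORTED_MODEL_TYPE is a private attribute and may change between releases.
from optimum.exporters.tasks import TasksManager
import optimum.exporters.neuron  # noqa: F401  (assumed to register the neuron export configs)

supported = {}
for model_type, exporters in TasksManager._SUPPORTED_MODEL_TYPE.items():
    neuron_tasks = exporters.get("neuron", {})
    if neuron_tasks:
        supported[model_type] = sorted(neuron_tasks)

for model_type, tasks in sorted(supported.items()):
    print(f"{model_type}: {', '.join(tasks)}")
```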
I also run something similar in the deployment/inference notebook.
You can see the results here: https://github.com/aws-samples/ml-specialized-hardware/blob/main/purpose-built-accelerators/docs/optimum_neuron_models.md
It would be good if you provided a high-level API to do this job, something like the following (the function names and return shape are hypothetical, just to illustrate the proposal, not an existing optimum-neuron API):
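```python
# Hypothetical high-level API -- names and return shape are illustrative only
from optimum.neuron import get_supported_models_for_training, get_supported_models_for_inference

# Models/tasks that can be fine-tuned on Trainium
training_models = get_supported_models_for_training()

# Models/tasks that can be compiled for inference on Inferentia
inference_models = get_supported_models_for_inference()

# e.g. {"llama": {"tasks": ["text-generation"], "TP": True}, ...}
print(training_models)
```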
In both methods, the model metadata should include TP=True/False to indicate whether the model supports Tensor Parallelism. What do you think?