Sharpen the vision statement about hardware independence: prediction can happen on-device (e.g. on edge devices), although in many (but not all) cases training happens elsewhere. Approaches such as federated learning are not included for now, since they would require even more different infrastructure.
Make clear that 'training set' really means the training, validation, and test data sets, as that is not quite clear at the moment. All of these are needed to train a model.
Add a comment on sampling the data to speed up development.
Mention online/offline needs.
Mention feature stores to accelerate development.
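To illustrate the points on data set terminology and sampling, a minimal sketch follows. It assumes the records fit in memory as a Python list; the function name `sample_and_split` and the fractions are hypothetical choices for illustration, not taken from the document under review.

```python
import random


def sample_and_split(records, sample_fraction=0.1,
                     val_fraction=0.15, test_fraction=0.15, seed=42):
    """Draw a random sample of the records, then split it three ways.

    Hypothetical helper: the name, defaults, and in-memory list input
    are assumptions made for this sketch.
    """
    rng = random.Random(seed)  # fixed seed keeps the split reproducible

    # Work on a small sample during development to shorten iteration time.
    n_sample = max(1, int(len(records) * sample_fraction))
    sample = rng.sample(records, n_sample)  # without replacement, random order

    # Carve the sample into disjoint test, validation, and training parts.
    n_test = int(len(sample) * test_fraction)
    n_val = int(len(sample) * val_fraction)
    test = sample[:n_test]
    val = sample[n_test:n_test + n_val]
    train = sample[n_test + n_val:]
    return train, val, test


if __name__ == "__main__":
    train, val, test = sample_and_split(list(range(1000)))
    print(len(train), len(val), len(test))
```

Setting `sample_fraction=1.0` would use the full data set once the pipeline has stabilised.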
Suggestions/questions:
A section on definitions, or a separate document, may be helpful, especially for people who are not familiar with the terminology.
'Agile for ML' sort of exists in the shape of CRISP-DM. I'm not sure that is what is meant, as CRISP-DM is heavy on the modelling process and light on the CI/CD aspects. It may be helpful to mention it anyway, as data science practitioners may be more familiar with CRISP-DM, SEMMA, and the like than with CI/CD. In most of these methodologies 'deployment' is a single box that glosses over the intricacies of actually deploying models into production. I can add a section if desired. Let me know.