Open ghost opened 5 months ago
Thank you for creating this @Jo316!
Could you explain what specific functionality you would like the Training Operator to provide for training models with spatial datasets? Do you need distributed capabilities, with the Training Operator controller orchestrating the appropriate resources on Kubernetes?
As long as you can build a container from the training script that uses your geographical datasets, you can run it with the Training Operator.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
/remove-lifecycle stale
What would you like to be added?
I would like to request the addition of functions to the Training Operator for training models with spatial (geographical) datasets. These functions should enable seamless integration and processing of geographical data, leveraging state-of-the-art algorithms to enhance model accuracy and applicability in spatial contexts.
One potential reference is the R package CAST, which provides robust functions for training models with geographical data using random forest. The package offers a comprehensive approach to handling spatial data, including considerations for the Area of Applicability.
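To make the kind of spatial handling requested here more concrete, below is a minimal Python sketch of spatially blocked cross-validation folds, in the spirit of the leave-location-out validation that spatial modelling packages aim for. It is an illustration under stated assumptions, not CAST's implementation and not an existing Training Operator API: the names `block_id` and `spatial_block_folds`, the regular grid in degrees, and the round-robin assignment of blocks to folds are all hypothetical choices made for this example.

```python
import math
from collections import defaultdict


def block_id(lon, lat, block_size_deg):
    """Map a coordinate to the grid cell (block) that contains it.

    Hypothetical helper: a regular lon/lat grid is assumed for simplicity.
    """
    return (math.floor(lon / block_size_deg), math.floor(lat / block_size_deg))


def spatial_block_folds(coords, n_folds, block_size_deg=1.0):
    """Yield (train_indices, test_indices) pairs.

    All samples that fall into the same spatial block are held out together,
    so nearby points never sit on both sides of a train/test split.
    """
    # Group sample indices by the block they fall into.
    blocks = defaultdict(list)
    for i, (lon, lat) in enumerate(coords):
        blocks[block_id(lon, lat, block_size_deg)].append(i)

    # Assign whole blocks to folds round-robin, in sorted order so the
    # result is deterministic.
    fold_members = [[] for _ in range(n_folds)]
    for k, key in enumerate(sorted(blocks)):
        fold_members[k % n_folds].extend(blocks[key])

    all_indices = set(range(len(coords)))
    for test in fold_members:
        if not test:
            continue  # fewer blocks than folds: skip empty folds
        yield sorted(all_indices - set(test)), sorted(test)


if __name__ == "__main__":
    # Six sample locations as (lon, lat); values are made up for the demo.
    coords = [(0.1, 0.2), (0.4, 0.9), (1.5, 0.3), (1.7, 0.8), (2.2, 2.1), (2.9, 2.5)]
    for train_idx, test_idx in spatial_block_folds(coords, n_folds=3):
        print("train:", train_idx, "test:", test_idx)
```

A training script could use folds like these to evaluate any model on held-out regions before being packaged into a container. Whether such logic belongs in the Training Operator itself or in the user's training code is the open question in this issue.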
Functions Ranked By Importance/Need (https://hannameyer.github.io/CAST/reference/index.html):
Why is this needed?
The integration of spatial dataset training functions will significantly enhance the Training Operator's capabilities, particularly for users working with geographical data. It will allow for more accurate and relevant model training in fields such as environmental science, urban planning, and geospatial analysis.
By incorporating these functions, the Training Operator will support a wider range of use cases and applications, making it a more versatile and powerful tool for data scientists and researchers.
Love this feature?
Give it a 👍. We prioritize the features with the most 👍.