This PR introduces the MeshDatapipe class, which integrates NVIDIA DALI to efficiently process VTP, VTU, and CGNS files within PyTorch pipelines. Key features include lazy loading, normalization of attributes using pre-computed statistics, and configurable batch size, device selection, and parallel processing for multi-process environments. The goal is to improve data processing performance by leveraging DALI's GPU capabilities when handling large mesh datasets in machine learning and simulation applications. The PR also adds error handling, new helper functions for VTK data parsing, and documentation on setting up and using the pipeline.
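For reviewers unfamiliar with the normalization step mentioned above, here is a minimal, library-free sketch of normalizing mesh attributes with pre-computed statistics. It is not the MeshDatapipe implementation: the `Stats` class, the `normalize_attributes` function, the `eps` parameter, and the attribute names are all hypothetical and chosen only for illustration. It also assumes z-score normalization, which this description does not confirm.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Stats:
    """Pre-computed mean and standard deviation for one attribute (hypothetical helper)."""
    mean: float
    std: float


def normalize_attributes(
    attributes: Dict[str, List[float]],
    stats: Dict[str, Stats],
    eps: float = 1e-8,
) -> Dict[str, List[float]]:
    """Return z-score-normalized copies of the attributes that have statistics.

    Attributes without an entry in `stats` are passed through unchanged.
    `eps` guards against division by zero for constant attributes.
    """
    normalized: Dict[str, List[float]] = {}
    for name, values in attributes.items():
        if name in stats:
            s = stats[name]
            normalized[name] = [(v - s.mean) / (s.std + eps) for v in values]
        else:
            normalized[name] = list(values)
    return normalized


# Example: per-vertex pressure values with statistics computed offline.
sample = {"pressure": [1.0, 2.0, 3.0], "node_id": [0.0, 1.0, 2.0]}
stats = {"pressure": Stats(mean=2.0, std=1.0)}
result = normalize_attributes(sample, stats)
```

Computing the statistics once, offline, lets each worker normalize samples independently without a pass over the full dataset, which is presumably why the datapipe accepts them as an input rather than deriving them on the fly.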
Modulus Pull Request
Description
Closes https://github.com/NVIDIA/modulus/issues/500
Checklist
Dependencies