UETAI hosts several datasets, some of which are very large, so it is not feasible to download a dataset every time the runner is triggered. I created this submodule as an initiative to manage all of those datasets in one place.
Usage
The module provides functions and classes to handle dataset management across the multiple machines at UETAI. We set up a shared NFS on our cluster; every GPU machine accesses and reads data from it. The runner, therefore, has to mount the shared NFS in order to access the data. Developers use the `data_path` function from the logger to access a given dataset by its `dataset_name` and `alias`.
`data_path` follows two environment scenarios: it resolves a dataset either from an explicit `path`, or from a `dataset_name` and `alias` on the shared NFS.
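The two scenarios could be sketched as below. The function body, the default `root`, and the `<root>/<dataset_name>/<alias>` directory layout are assumptions for illustration, not the actual implementation:

```python
import os

def data_path(path=None, dataset_name=None, alias="latest",
              root="/mnt/nfs/datasets"):
    """Hypothetical sketch of the two resolution scenarios.

    Scenario 1: an explicit `path` is given and used directly.
    Scenario 2: `dataset_name` and `alias` are resolved under the shared
    NFS root, assuming a `<root>/<dataset_name>/<alias>` layout.
    """
    if path is not None:
        # Scenario 1: trust the caller-provided path.
        return os.path.abspath(path)
    if dataset_name is None:
        raise ValueError("Provide either `path` or `dataset_name`.")
    # Scenario 2: look the dataset up on the shared NFS.
    resolved = os.path.join(root, dataset_name, alias)
    if not os.path.isdir(resolved):
        raise FileNotFoundError(
            f"Dataset '{dataset_name}:{alias}' not found under {root}; "
            "is the shared NFS mounted on this machine?")
    return resolved
```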
Registry
We create a registry to map datasets and to validate that a given dataset path is correct.
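As an illustrative sketch (the class and method names are hypothetical, not the actual registry API), such a registry might map dataset names and aliases to NFS paths and validate them on lookup:

```python
import os

class DatasetRegistry:
    """Hypothetical registry mapping (dataset_name, alias) to NFS paths."""

    def __init__(self):
        self._entries = {}  # (dataset_name, alias) -> path

    def register(self, dataset_name, alias, path):
        self._entries[(dataset_name, alias)] = path

    def resolve(self, dataset_name, alias="latest", validate=True):
        try:
            path = self._entries[(dataset_name, alias)]
        except KeyError:
            raise KeyError(f"Unknown dataset '{dataset_name}:{alias}'")
        # Validate that the registered path is actually reachable here,
        # i.e. the shared NFS is mounted and the dataset is present.
        if validate and not os.path.isdir(path):
            raise FileNotFoundError(
                f"Registered path {path} is not accessible")
        return path
```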
Checklists