Describe the enhancement
The current data load {filename} command takes approximately 4 hours to process a dataset of size 5k (around 12k total embeddings). To improve efficiency and manageability, we propose splitting it into multiple commands.
Proposed Solution
DataManager Class: Create a new singleton class, DataManager, to manage static memory for loaded data, similar to our other manager classes.
Data Load Command: Modify the data load {filename} command to load the data file into the DataManager's static memory.
Database Build Command: Introduce a new sub-command, database build {tablename}, which:
Checks for data loaded into the DataManager.
Builds each table independently, allowing users to save the database even if the tokens table fails to build.
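A rough sketch of how the DataManager half of this proposal could look. It is written in Python purely for illustration (the project targets Unity, so the real class would presumably be C#), and everything beyond the DataManager name is an assumption: the load method, the is_loaded and records properties, and the JSON input format are hypothetical, since the issue does not specify the file format.

```python
import json
from pathlib import Path
from typing import Optional


class DataManager:
    """Hypothetical singleton that keeps the most recently loaded data in memory."""

    _instance: Optional["DataManager"] = None

    def __new__(cls) -> "DataManager":
        # Create the shared instance on first use; later calls return the same object.
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            cls._instance._records = []
            cls._instance._source = None
        return cls._instance

    def load(self, filename: str) -> int:
        """Parse the file and hold the records in memory. No tables are built here."""
        # Assumption: the data file is a JSON array of records.
        text = Path(filename).read_text(encoding="utf-8")
        self._records = json.loads(text)
        self._source = filename
        return len(self._records)

    @property
    def is_loaded(self) -> bool:
        # What a "database build" command could check before doing any work.
        return self._source is not None

    @property
    def records(self) -> list:
        return self._records
```

With this split, data load only parses the file, and the expensive table building is deferred to database build, which can refuse to run when is_loaded is false.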
To Reproduce
Steps to reproduce the current behavior:
Run the data load {filename} command with a dataset of size 5k.
Observe that the command takes approximately 4 hours to complete.
Expected behavior
The data load {filename} command should quickly load data into static memory managed by DataManager.
The database build {tablename} command should independently build each table, improving efficiency and allowing partial saves.
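The independent-build behavior could be sketched as follows, again in Python for illustration only. The build_tables helper, the builder functions, and the "documents" table name are hypothetical; only the "tokens" table is named in this issue. The idea is that each table is built in its own try block, so a failure in one table does not discard the tables that succeeded.

```python
from typing import Callable, Dict, List, Tuple


def build_tables(
    records: List[dict],
    builders: Dict[str, Callable[[List[dict]], list]],
) -> Tuple[Dict[str, list], Dict[str, str]]:
    """Run each table builder separately and collect successes and failures."""
    built: Dict[str, list] = {}
    failed: Dict[str, str] = {}
    for name, builder in builders.items():
        try:
            built[name] = builder(records)
        except Exception as exc:
            # Isolate the failure to this table; the others remain savable.
            failed[name] = str(exc)
    return built, failed


def build_documents(records: List[dict]) -> list:
    # Hypothetical builder for a table that succeeds.
    return [r["text"] for r in records]


def build_tokens(records: List[dict]) -> list:
    # Simulates the tokens table failing partway through.
    raise RuntimeError("embedding step failed")


built, failed = build_tables(
    [{"text": "a"}, {"text": "b"}],
    {"documents": build_documents, "tokens": build_tokens},
)
print(sorted(built))  # ['documents']
print(failed)         # {'tokens': 'embedding step failed'}
```

The caller can then save whatever is in built and report failed to the user, who can rerun database build for just the failed table without reloading the data.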
Screenshots
If applicable, add screenshots to help explain the proposed enhancement.
Desktop (please complete the following information):
OS: Windows 11
GPU: RTX 3080
Unity Version: 6.0.21
Additional context
This enhancement is aimed at optimizing the data load process for the "Tau" project. By splitting the command into multiple steps and introducing a DataManager class, we can significantly reduce the time required for data loading and improve overall system reliability.