Hey, I wanted to know if I can use a data generator for data that is too large to fit into memory. I know XGBoost and other similar gradient boosted models have an external-memory mode that reads and re-reads a CSV from disk for large datasets, but that is much too slow for me. I was hoping I could use a generator with XBNet to read a large HDF5 file of tabular data in batches and train on the entire dataset the way Keras models do with `fit` on a generator.
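To give a concrete idea of what I mean, here is a rough sketch of the kind of generator I have in mind, using h5py plus a PyTorch `Dataset`/`DataLoader` since XBNet sits on PyTorch. The file name and the dataset keys (`"features"`, `"labels"`) are just placeholders for my data, and I'm not assuming XBNet currently accepts a loader like this; that's exactly what I'm asking about.

```python
import h5py
import torch
from torch.utils.data import Dataset, DataLoader


class H5TabularDataset(Dataset):
    """Reads rows lazily from an HDF5 file so the full table never sits in RAM.

    The dataset keys ("features", "labels") are placeholders -- adjust them
    to match the layout of your own file.
    """

    def __init__(self, path, x_key="features", y_key="labels"):
        self.path = path
        self.x_key = x_key
        self.y_key = y_key
        # Open once just to record the number of rows, then close again.
        with h5py.File(path, "r") as f:
            self.length = f[x_key].shape[0]
        self._file = None  # opened lazily so each DataLoader worker gets its own handle

    def __len__(self):
        return self.length

    def __getitem__(self, idx):
        if self._file is None:
            self._file = h5py.File(self.path, "r")
        # Only the requested row is pulled off disk here.
        x = torch.from_numpy(self._file[self.x_key][idx]).float()
        y = torch.tensor(self._file[self.y_key][idx]).long()
        return x, y


# Batches stream from disk instead of requiring the whole table in memory.
loader = DataLoader(H5TabularDataset("big_table.h5"), batch_size=1024, shuffle=True)
```

Is there a supported way to plug something like this into XBNet's training loop, or does it currently require the full X/y arrays up front?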