vincentcartillier / Semantic-MapNet


Training data size #12

Closed BeaTrier closed 2 years ago

BeaTrier commented 2 years ago
  1. I was computing the training data on the server, but the administrator told me that I have exceeded my block quota limits, and the computation has not finished. How big is the training dataset?

  2. Would you mind letting me know the minimum GPU memory required to run the code?

Thanks in advance!

vincentcartillier commented 2 years ago

The training data is about 1.4T. The preprocessing script saves the intermediate egocentric features for all 20 steps of each training episode. Each egocentric feature map has dimensions 240x320x64; with 20 frames per episode and roughly 3400 episodes across train and val, this comes to ~1.4T. One way around this is to save only the images during preprocessing and compute the egocentric features on the fly during training.
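As a sanity check, the numbers above can be multiplied out directly (assuming the features are stored as float32, which is not stated explicitly in the thread):

```python
# Back-of-envelope estimate of the stored egocentric feature size.
H, W, C = 240, 320, 64        # feature map dimensions per frame
frames_per_episode = 20
episodes = 3400               # approx. train + val combined
bytes_per_value = 4           # float32 assumption

total_bytes = H * W * C * bytes_per_value * frames_per_episode * episodes
print(f"{total_bytes / 1e12:.2f} TB")  # ≈ 1.34 TB
```

This lands close to the quoted ~1.4T, so the per-frame feature maps dominate the storage cost.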

We trained SMNet with a batch size of 8 across 8 TitanXps (12 GB of memory each). Depending on your application, it should be possible to fine-tune SMNet on fewer GPUs with a smaller batch size.
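If fewer GPUs are available, one standard way to approximate the original batch size of 8 is gradient accumulation. The sketch below uses a placeholder linear model and random tensors, not SMNet itself; it only illustrates the accumulation pattern:

```python
import torch
import torch.nn as nn

# Placeholder model and loss standing in for SMNet's network; swap in
# the real model and data loader from this repo.
model = nn.Linear(64, 10)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
criterion = nn.CrossEntropyLoss()

accum_steps = 4   # 4 micro-batches of 2 -> effective batch size 8
micro_batch = 2

optimizer.zero_grad()
for step in range(accum_steps):
    x = torch.randn(micro_batch, 64)            # placeholder inputs
    y = torch.randint(0, 10, (micro_batch,))    # placeholder labels
    # Divide by accum_steps so accumulated gradients average correctly.
    loss = criterion(model(x), y) / accum_steps
    loss.backward()
optimizer.step()
```

This trades wall-clock time for memory: each optimizer step sees gradients equivalent to the larger batch, while only a micro-batch needs to fit on the GPU at once.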

BeaTrier commented 2 years ago

Hi,

Thanks for your excellent work. Would you be able to provide the ground truth maps and path info for Replica? I noticed that you also test on Replica, but that data isn't available in this repo. Thanks a lot!

vincentcartillier commented 2 years ago

Hi, I have added the manually recorded exploration paths for Replica under data_replica. However, you would need to recompute the ground truth maps yourself. Running these two scripts should be fairly straightforward:

data/build_point_cloud_from_mesh_h5.py
compute_GT_topdown_semantic_maps/build_semmap_from_obj_point_cloud.py
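The two-step regeneration might look like the following; the exact arguments and input paths are not specified in this thread, so check each script's header for what it expects:

```shell
# Step 1: build a semantically labeled point cloud from the Replica meshes.
python data/build_point_cloud_from_mesh_h5.py

# Step 2: project the point cloud into top-down ground truth semantic maps.
python compute_GT_topdown_semantic_maps/build_semmap_from_obj_point_cloud.py
```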