verlab / DescriptorReasoning_ACCV_2024


CUDA running memory #2

Closed galahies closed 3 weeks ago

galahies commented 1 month ago

Respectfully, the four V100s amount to about 128 GB of GPU memory, but my server cluster has about 12 × 8 GB ≈ 96 GB. I wonder whether I can still use it to reproduce your results. Also, are there any plans to publicly release the code logic in the near future? Thank you for reading.

felipecadar commented 1 month ago

Hi @galahies. Thanks for your interest in our work! For training, I suggest using about 1/3 or 1/4 of the batch size I used. You can also add batch accumulation to preserve the benefits of a larger batch. Something like this should work:

# batch_size reduced; batch_accumulation compensates to keep the effective batch
python reasoning/train_multigpu_reasoning.py \
    --batch_size 4 \
    --batch_accumulation 4 \
    --data ./datasets/h5_scannet \
    --plot_every 200 \
    --extractor_cache 'xfeat-scannet-n2048' \
    --dino_cache 'dino-scannet-dinov2_vits14' \
    -C xfeat-dinov2
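For reference, batch accumulation simply sums gradients over several small batches before each optimizer step, so a batch of 4 accumulated 4 times behaves roughly like a batch of 16. A minimal PyTorch sketch of the idea (the model, data, and hyperparameters here are illustrative, not taken from this repository):

```python
import torch
import torch.nn as nn

# Toy model and optimizer; stand-ins for the real training setup.
model = nn.Linear(8, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.MSELoss()

accumulation_steps = 4          # e.g. batch_size 4 * 4 steps ~ effective batch 16
initial_weight = model.weight.detach().clone()

optimizer.zero_grad()
for step in range(16):
    x = torch.randn(4, 8)       # one small batch of random data
    y = torch.randn(4, 1)
    # Divide the loss so accumulated gradients average over the big batch.
    loss = loss_fn(model(x), y) / accumulation_steps
    loss.backward()             # gradients accumulate across backward() calls
    if (step + 1) % accumulation_steps == 0:
        optimizer.step()        # update once per accumulated "big batch"
        optimizer.zero_grad()
```

The loss scaling keeps the gradient magnitude comparable to a single large batch, so the learning rate does not need retuning.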

As for releasing the code logic, I did not understand the question: the code is already released. I will soon write a tutorial for the evaluation, but the code is already in this repository.