nagejacob / TBSN

GNU General Public License v3.0

heatmap code #2

Open kaiq663 opened 6 months ago

kaiq663 commented 6 months ago

The receptive field heatmaps in Fig. 6 and Fig. 1 look similar to the PUCA version. I tried my best to reproduce them with the previous code, but failed.

What's more, there is no related code in your project for this part. Could you share this code, or tell me where I can learn about it?

My email address is kaiq663@foxmail.com, thanks!


nagejacob commented 6 months ago

Thank you for your interest. I've uploaded the code to validate/visualize_receptive_field.py. Please feel free to contact me.

kaiq663 commented 6 months ago

> Thank you for your interest. I've uploaded the code to validate/visualize_receptive_field.py. Please feel free to contact me.

Thanks! This receptive field code really tortured me before.

Since you do not provide training code, I used your SASL training code, which uses data_parallel (DP), with several modifications. It works nicely!

However, I noticed that there is a distribution_parallel function in mode/base/BaseModel, so I reimplemented the SASL training code with DDP. Unsurprisingly, my job was killed for lack of RAM (my server has 64 GB).

My questions are as follows.

Question 1: Does DDP achieve higher performance than DP?

Question 2: If so, is there a way to run DDP with limited RAM?

nagejacob commented 6 months ago

1. With proper learning rate scaling, DDP gives the same results as DP.
2. The out-of-memory kill happens because DDP duplicates the data_loader once per process (world_size copies in total), so the training data pinned in memory is multiplied by world_size. You could set pin_memory to False, or keep the training data in shared memory so that it is not duplicated.

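
The two points above can be sketched as simple arithmetic. This is only a rough illustration, not code from this repository: the function names and all numbers are hypothetical, and the learning rate rule shown is the common linear scaling heuristic (keep the ratio of learning rate to global batch size constant), which is an assumption about what "proper learning rate scaling" means here.

```python
def scaled_lr(base_lr: float, base_batch: int, per_gpu_batch: int, world_size: int) -> float:
    """Linear scaling heuristic (assumed): keep lr / global_batch constant.

    base_lr and base_batch describe the reference DP run, where one process
    splits a single batch across GPUs. Under DDP every process loads its own
    batch, so the global batch is per_gpu_batch * world_size.
    """
    global_batch = per_gpu_batch * world_size
    return base_lr * global_batch / base_batch


def host_ram_needed_gb(dataset_gb: float, world_size: int, shared: bool = False) -> float:
    """Rough host RAM estimate for the in-memory training data only.

    Without sharing, each DDP process holds its own copy of the data;
    with shared memory, a single copy serves all processes.
    """
    copies = 1 if shared else world_size
    return dataset_gb * copies


# Hypothetical numbers, for illustration only.
print(scaled_lr(base_lr=1e-4, base_batch=8, per_gpu_batch=8, world_size=2))
print(host_ram_needed_gb(dataset_gb=40.0, world_size=2))
print(host_ram_needed_gb(dataset_gb=40.0, world_size=2, shared=True))
```

With these made-up numbers, two unshared copies of a 40 GB dataset would already exceed a 64 GB server, while a single shared copy would fit. If per_gpu_batch is instead set to base_batch / world_size, the global batch is unchanged and the learning rate stays the same.
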
nagejacob commented 6 months ago

Actually, I did not use DDP in this work; DP is enough to train on two GPUs.