FENGGENYU / CAPRI-Net

Code for CAPRI-Net
MIT License

Running time #3

Closed bowensu123 closed 1 year ago

bowensu123 commented 1 year ago

Dear Fenggen, I really appreciate the code you released! Could you check which version of the code is posted? Training takes around 19 hours for me on my university's HPCC with a V100 GPU, which is nearly double the training time mentioned in your paper (where you used a GPU inferior to the V100). Could you help me figure out what I am doing wrong? Thanks in advance!

FENGGENYU commented 1 year ago

I just fixed some small bugs and updated the train.py. Please download the new one and make sure the 'leaky' flag is False.

The training time is about 6 hours.

image

FENGGENYU commented 1 year ago

Btw, we train for 1000 epochs on ABC with grid_sample 64. For ShapeNet, we use the same progressive training strategy as IM-Net: 250 epochs with grid_sample 16, 250 epochs with grid_sample 32, and 500 epochs with grid_sample 64. ShapeNet training takes about 3 days.
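For readers following along, the schedule described above can be written down as data. This is only a minimal sketch: the names `SHAPENET_SCHEDULE`, `ABC_SCHEDULE`, `grid_sample_for_epoch` and `total_epochs` are hypothetical and are not taken from the repository's train.py; only the epoch counts and grid_sample values come from the comment.

```python
# Hypothetical representation of the schedules from the comment above.
# Each entry is (number_of_epochs, grid_sample).
SHAPENET_SCHEDULE = [(250, 16), (250, 32), (500, 64)]
ABC_SCHEDULE = [(1000, 64)]


def total_epochs(schedule):
    """Sum the epochs over all phases of a schedule."""
    return sum(num_epochs for num_epochs, _ in schedule)


def grid_sample_for_epoch(epoch, schedule):
    """Return the grid_sample used at a 0-based epoch index."""
    if epoch < 0:
        raise ValueError("epoch must be non-negative")
    phase_start = 0
    for num_epochs, grid_sample in schedule:
        if epoch < phase_start + num_epochs:
            return grid_sample
        phase_start += num_epochs
    raise ValueError(f"epoch {epoch} is past the end of the schedule")


if __name__ == "__main__":
    print(total_epochs(SHAPENET_SCHEDULE))                # 1000
    print(grid_sample_for_epoch(250, SHAPENET_SCHEDULE))  # 32
```

Both schedules add up to 1000 epochs; the ShapeNet one simply spends the first half at coarser resolutions.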

bowensu123 commented 1 year ago

Hi, Fenggen,

Thanks for your reply! I am not sure which part I got wrong. I tested it multiple times and it still takes around 90 s/epoch. My advisor requires me to get it as fast as the time you mentioned. Thanks a lot for your help!

Sincerely, Bowen


FENGGENYU commented 1 year ago

How long does fine-tuning take on your side with the given weights? First, you need to make sure that your CUDA setup is correct.
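One quick way to check the CUDA setup from inside the Python environment is sketched below. It assumes the code runs on PyTorch; `cuda_report` is a hypothetical helper name and is not part of the repository. It only uses standard PyTorch attributes (`torch.version.cuda`, `torch.cuda.is_available()`, `torch.cuda.get_device_name()`).

```python
def cuda_report():
    """Collect basic facts about the PyTorch/CUDA setup into a dict."""
    report = {"torch_installed": False}
    try:
        import torch  # assumption: the training code uses PyTorch
    except ImportError:
        return report

    report["torch_installed"] = True
    report["torch_version"] = torch.__version__
    # CUDA version PyTorch was built against; None for CPU-only builds.
    report["built_with_cuda"] = torch.version.cuda
    # False if the driver, toolkit, or GPU is not usable from this process.
    report["cuda_available"] = torch.cuda.is_available()
    if report["cuda_available"]:
        report["device_name"] = torch.cuda.get_device_name(0)
    return report


if __name__ == "__main__":
    for key, value in cuda_report().items():
        print(f"{key}: {value}")
```

If `cuda_available` comes back False, or `built_with_cuda` is None, training is probably falling back to the CPU, which could explain a large slowdown.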

bowensu123 commented 1 year ago

Hi, Fenggen,

I really appreciate your reply! The fine-tuning part is super fast, around 0.56 s/epoch. I used your yml file to install the dependencies, and the CUDA version I use is CUDA/11.0.2 (loaded with module load CUDA/11.0.2).

Thanks a lot! Best, Bowen


FENGGENYU commented 1 year ago

That is also slower than mine, which takes 0.28 s/epoch.

image

If you can provide screenshots of your commands and the prints, it would be helpful.

bowensu123 commented 1 year ago

Dear FengGen,

Thanks for your help! I fixed it by setting up the correct CUDA version again, as you suggested. The training part now takes around 15 s/epoch. Thanks a lot for your help!

Best, Bowen
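To put the per-epoch numbers in this thread next to the quoted total training time, a small conversion helps. The sketch below assumes the 1000-epoch ABC setting and a constant time per epoch, which is a simplification; the function names are made up for illustration.

```python
def total_hours(seconds_per_epoch, epochs=1000):
    """Total wall-clock hours, assuming a constant time per epoch."""
    return seconds_per_epoch * epochs / 3600.0


def seconds_per_epoch(hours, epochs=1000):
    """Average seconds per epoch implied by a total training time."""
    return hours * 3600.0 / epochs


if __name__ == "__main__":
    print(total_hours(90))       # 25.0 hours at 90 s/epoch
    print(total_hours(15))       # about 4.17 hours at 15 s/epoch
    print(seconds_per_epoch(6))  # 21.6 s/epoch for a 6-hour run
    print(90 / 15)               # 6.0, the speedup after fixing CUDA
```

Under these assumptions, 15 s/epoch works out to a bit over 4 hours for 1000 epochs, which is in line with the roughly 6 hours mentioned earlier.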


wanghanxiao123 commented 10 months ago

Dear FengGen,

I noticed that fine-tuning on your side runs at 0.28 seconds/epoch. I also used the yml file you provided to install all dependencies, but my training time is approximately 0.4 seconds/epoch. Since we are using the same configuration file, I am curious about the cause of this performance difference and am looking for possible solutions.

Currently, I am using CUDA/11.0.2. Is it necessary to change the CUDA version to improve training efficiency? If so, which version of CUDA would you recommend?

Best, Felix
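When comparing figures as close as 0.28 and 0.4 s/epoch, it may help to measure several epochs and skip the first one, since start-up costs can skew a single reading. The sketch below is a generic timing helper, not code from the repository: `time_epochs`, `run_epoch` and `sync` are hypothetical names. With PyTorch on a GPU, one would likely pass `torch.cuda.synchronize` as `sync`, because CUDA kernels run asynchronously and the timer could otherwise stop too early.

```python
import statistics
import time


def time_epochs(run_epoch, num_epochs=5, warmup=1, sync=None):
    """Return (median_seconds, all_durations) over timed epochs.

    run_epoch: callable that runs one epoch (hypothetical placeholder).
    sync: optional callable that waits for pending GPU work to finish.
    """
    # Warm-up epochs are executed but not timed.
    for _ in range(warmup):
        run_epoch()
    if sync is not None:
        sync()

    durations = []
    for _ in range(num_epochs):
        start = time.perf_counter()
        run_epoch()
        if sync is not None:
            sync()  # make sure queued GPU work is included in the timing
        durations.append(time.perf_counter() - start)
    return statistics.median(durations), durations


if __name__ == "__main__":
    # Dummy workload standing in for a real training epoch.
    median_s, all_s = time_epochs(lambda: sum(range(100_000)))
    print(f"median: {median_s:.4f} s/epoch over {len(all_s)} epochs")
```

The median is used here rather than the mean so that one unusually slow epoch does not dominate the result.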