Closed v4if closed 3 days ago
Hi @v4if, you should make sure to call torch.cuda.synchronize() (https://pytorch.org/docs/stable/generated/torch.cuda.synchronize.html) when running GPU benchmarks. I don't think %timeit does that for you.
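As a rough illustration of the synchronization point, here is a minimal timing sketch. The helper name time_call and its parameters are hypothetical (not part of PyTorch or torchvision); the only assumption is that you pass torch.cuda.synchronize as the sync callback when timing CUDA work, so that asynchronously queued kernels are counted in the measurement.

```python
import time


def time_call(fn, *args, sync=None, warmup=3, repeats=10):
    """Return the mean wall-clock seconds per call of fn(*args).

    sync is an optional zero-argument callable (hypothetical parameter);
    for CUDA work it would be torch.cuda.synchronize.
    """
    # Warm-up calls absorb one-time costs such as lazy initialization.
    for _ in range(warmup):
        fn(*args)
    if sync is not None:
        sync()  # drain queued work so it is not billed to the timed region
    start = time.perf_counter()
    for _ in range(repeats):
        fn(*args)
    if sync is not None:
        sync()  # wait for the timed calls to finish before stopping the clock
    return (time.perf_counter() - start) / repeats


# Usage sketch (assumes torch is installed and x_gpu lives on a CUDA device):
#   import torch
#   gpu_s = time_call(resize_trans, x_gpu, sync=torch.cuda.synchronize)
#   cpu_s = time_call(resize_trans, x_cpu)
```

Without the sync callback the helper times only how long it takes to launch the work, which is why GPU numbers can look misleadingly small.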
Timing on the GPU does differ with and without the sync.
But my question is why CPU time differs so much between the two environments.
This is indeed weird. Forgive me for the silly question: are venv1 and venv2 on the same system? (Asking because the terminal font style looks different.)
They are on two machines, but the installed torchvision version (0.20.0) is the same. I don't know why the difference in CPU time is so big.
If you are running the code on different systems, then this is expected. Depending on the number of cores, the RAM, and the type of CPU, you will get different speeds.
I have an 11th-gen Intel i7 with 16 GB of RAM, and here is my benchmark.
Of the number of cores, the RAM, and the type of CPU, which are the main factors that determine execution speed? Because of the GIL, Python should use a single core; or does the torchvision implementation use multiple cores? The CPU frequency of both machines above is around 3000 MHz, so why is there a gap of several times?
cat /proc/cpuinfo |grep MHz|uniq
Moreover, on my local Mac the execution time is at the microsecond (µs) level, which is several orders of magnitude faster than on the two server machines above.
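On the single-core question: PyTorch's native CPU kernels run outside the Python interpreter and can use several threads for one op, so the core count available to the process and thread-limit settings may matter as much as clock frequency. The sketch below gathers those numbers for comparison across machines. The function name cpu_parallelism_report is hypothetical; the torch.get_num_threads() call is only attempted if torch is importable.

```python
import os


def cpu_parallelism_report():
    """Collect settings that commonly affect multi-threaded CPU ops."""
    report = {"logical_cpus": os.cpu_count()}
    # On Linux, the process may be pinned to fewer CPUs than the machine has
    # (for example inside a container or under taskset).
    if hasattr(os, "sched_getaffinity"):
        report["usable_cpus"] = len(os.sched_getaffinity(0))
    # These environment variables cap OpenMP / MKL thread pools when set.
    for var in ("OMP_NUM_THREADS", "MKL_NUM_THREADS"):
        report[var] = os.environ.get(var)
    try:
        import torch  # optional: only queried if installed

        report["torch_intraop_threads"] = torch.get_num_threads()
    except ImportError:
        pass
    return report


if __name__ == "__main__":
    for key, value in cpu_parallelism_report().items():
        print(f"{key}: {value}")
```

Running this in both environments and comparing the output is one way to rule out a thread-count difference before looking at the CPU model itself.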
@v4if It's expected to see different performance on different machines. Some ops (like resize) leverage SIMD instructions. E.g. if one of your machines has AVX2 while the other one doesn't, you'll see massive differences.
I don't think this issue is really in scope for torchvision (especially not with that level of detail), so I'll close the issue.
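To check the SIMD point on a Linux machine, one option is to read the feature flags from /proc/cpuinfo. This is a minimal sketch: the function names cpu_flags and simd_summary are made up for illustration, the list of flags checked is an assumption about what is worth comparing, and it only tells you what the CPU supports, not which code path a given library build actually takes.

```python
def cpu_flags(cpuinfo_text):
    """Return the set of feature flags from the first 'flags' line."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def simd_summary(cpuinfo_text, wanted=("sse4_2", "avx", "avx2", "avx512f")):
    """Map each SIMD flag of interest to whether the CPU reports it."""
    flags = cpu_flags(cpuinfo_text)
    return {name: name in flags for name in wanted}


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as fh:
            print(simd_summary(fh.read()))
    except FileNotFoundError:
        print("/proc/cpuinfo not available (not a Linux system?)")
```

Comparing the output from the two servers would show whether one of them lacks AVX2 or another extension.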
Thanks, Nicolas, for the clarification.
@v4if This website has PyTorch benchmarks across different hardware.
https://openbenchmarking.org/test/pts/pytorch
🐛 Describe the bug
%timeit resize_trans(x_cpu)
%timeit resize_trans(x_gpu)
CPU time differs by a factor of 2.65 between the two environments: 11.3 ms vs 30 ms.
venv1:
venv2:
Versions
venv1
venv2