The CNN is unusually slow, what am I doing wrong?

git-Charlie commented 2 years ago

dotnet --version: 6.0.200
CPU: AMD Ryzen 7 1700x 8 core
GPU: NVIDIA GeForce GTX 1080, Driver version 30.0.14.7212
CUDA v11.1

FaceRecognitionDotNet-master\examples\Benchmark>

This is the result without using CNN:

dotnet run -c Release -- "-m=models"

Benchmarks

Timings at 240p:
 - Face locations: 0.0556s (17.99 fps)
 - Face landmarks: 0.0034s (294.12 fps)
 - Encode face (inc. landmarks): 0.3204s (3.12 fps)
 - End-to-end: 0.3766s (2.66 fps)

Timings at 480p:
 - Face locations: 0.2188s (4.57 fps)
 - Face landmarks: 0.0034s (294.12 fps)
 - Encode face (inc. landmarks): 0.3270s (3.06 fps)
 - End-to-end: 0.5496s (1.82 fps)

Timings at 720p:
 - Face locations: 0.4962s (2.02 fps)
 - Face landmarks: 0.0036s (277.78 fps)
 - Encode face (inc. landmarks): 0.3458s (2.89 fps)
 - End-to-end: 0.8340s (1.20 fps)

Timings at 1080p:
 - Face locations: 1.1034s (0.91 fps)
 - Face landmarks: 0.0036s (277.78 fps)
 - Encode face (inc. landmarks): 0.3250s (3.08 fps)
 - End-to-end: 1.4276s (0.70 fps)
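The fps column in these results appears to be the reciprocal of the time column (for example, 1 / 0.0556 s is about 17.99 fps). As a rough sketch of how such a line could be produced, the helper below times an action several times, keeps the best run, and formats it the same way. The names `TimingSketch`, `BestSeconds`, `ToFps` and `Format` are hypothetical and are not taken from the Benchmark example's source.

```csharp
using System;
using System.Diagnostics;
using System.Globalization;

// Hypothetical helper; not part of FaceRecognitionDotNet or its Benchmark example.
public static class TimingSketch
{
    // Runs the action `runs` times and returns the shortest elapsed time in seconds.
    public static double BestSeconds(Action action, int runs)
    {
        if (runs < 1) throw new ArgumentOutOfRangeException(nameof(runs));
        double best = double.MaxValue;
        for (int i = 0; i < runs; i++)
        {
            Stopwatch sw = Stopwatch.StartNew();
            action();
            sw.Stop();
            best = Math.Min(best, sw.Elapsed.TotalSeconds);
        }
        return best;
    }

    // Frames per second is the reciprocal of the time per frame.
    public static double ToFps(double seconds) => 1.0 / seconds;

    // Formats a result line in the same layout as the benchmark output above.
    public static string Format(string label, double seconds) =>
        string.Format(CultureInfo.InvariantCulture,
            " - {0}: {1:F4}s ({2:F2} fps)", label, seconds, ToFps(seconds));

    public static void Main()
    {
        // Placeholder workload; replace it with the call you want to measure.
        double seconds = BestSeconds(() => System.Threading.Thread.Sleep(10), 3);
        Console.WriteLine(Format("Sleep(10)", seconds));
    }
}
```

Taking the best of several runs reduces noise from one-time costs such as JIT compilation, which matters when comparing two modes whose timings differ by an order of magnitude.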

This is the project file:

<Project Sdk="Microsoft.NET.Sdk">

  <PropertyGroup>
    <OutputType>Exe</OutputType>
    <TargetFramework>netcoreapp2.0</TargetFramework>
    <Authors>Takuya Takeuchi</Authors>
    <Description>Example of FaceRecognitionDotNet</Description>
  </PropertyGroup>

  <ItemGroup>
    <PackageReference Include="FaceRecognitionDotNet" Version="1.3.0.6" />
    <PackageReference Include="FaceRecognitionDotNet.CUDA111" Version="1.3.0.6" />
    <PackageReference Include="Microsoft.Extensions.CommandLineUtils" Version="1.1.1" />
  </ItemGroup>

</Project>

Files in the models folder:

.keepfolder
dlib_face_recognition_resnet_model_v1.dat
mmod_human_face_detector.dat
resnet34_1000_imagenet_classifier.dnn
shape_predictor_5_face_landmarks.dat
shape_predictor_68_face_landmarks.dat

These are the files in \Benchmark\bin\x64\Release\netcoreapp2.0:


Benchmark.deps.json
Benchmark.dll
Benchmark.pdb
Benchmark.runtimeconfig.dev.json
Benchmark.runtimeconfig.json
cudnn.lib
cudnn64_8.dll
cudnn64_8.lib
cudnn_adv_infer.lib
cudnn_adv_infer64_8.dll
cudnn_adv_infer64_8.lib
cudnn_adv_train.lib
cudnn_adv_train64_8.dll
cudnn_adv_train64_8.lib
cudnn_cnn_infer.lib
cudnn_cnn_infer64_8.dll
cudnn_cnn_infer64_8.lib
cudnn_cnn_train.lib
cudnn_cnn_train64_8.dll
cudnn_cnn_train64_8.lib
cudnn_ops_infer.lib
cudnn_ops_infer64_8.dll
cudnn_ops_infer64_8.lib
cudnn_ops_train.lib
cudnn_ops_train64_8.dll
cudnn_ops_train64_8.lib
DlibDotNet.dll
DlibDotNetNative.dll
DlibDotNetNativeDnn.dll
DlibDotNetNativeDnnAgeClassification.dll
DlibDotNetNativeDnnEmotionClassification.dll
DlibDotNetNativeDnnGenderClassification.dll
FaceRecognitionDotNet.dll

I copied the cudnn files from C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.1**

v11.1 is the only version on my computer.

When using CNN, it took about half an hour to show these results:

Benchmarks

Timings at 240p:
 - Face locations: 9.2098s (0.11 fps)
 - Face landmarks: 0.0034s (294.12 fps)
 - Encode face (inc. landmarks): 0.3346s (2.99 fps)
 - End-to-end: 9.5358s (0.10 fps)

In the FaceDetection project, changing Hog to CNN also makes it unusually slow. It takes about twenty times as long as Hog.

In Task Manager, CPU usage at runtime is about 8% and GPU usage is zero.

I don't know if I need to provide any more information. Please help me, thanks.

takuya-takeuchi commented 2 years ago

<PackageReference Include="FaceRecognitionDotNet" Version="1.3.0.6" />

This property is not necessary. Could you remove it and try again?

git-Charlie commented 2 years ago

<PackageReference Include="FaceRecognitionDotNet" Version="1.3.0.6" />

This property is not necessary. Could you remove it and try again?

Thank you for your reply.

When I remove this property, I get this error:

System.TypeInitializationException: The type initializer for 'DlibDotNet.NativeMethods' threw an exception. ---> System.DllNotFoundException: Unable to load DLL 'DlibDotNetNativeDnn' or one of its dependencies

So I manually copied the files "DlibDotNetNative.dll" and "DlibDotNetNativeDnn.dll" to the program directory.

The execution result is the same as before, no change.

takuya-takeuchi commented 2 years ago

@git-Charlie

I tried the benchmark with CNN. Before this, I added FRDN.CUDA111 by using `dotnet add package`. Next, I copied the following libraries, together with DlibDotNetNativeDnn.dll from DlibDotNet.CUDA111.

cublas64_11.dll
cublasLt64_11.dll
cudnn_adv_infer64_8.dll
cudnn_adv_train64_8.dll
cudnn_cnn_infer64_8.dll
cudnn_cnn_train64_8.dll
cudnn_ops_infer64_8.dll
cudnn_ops_train64_8.dll
cudnn64_8.dll
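One way to catch a missing native dependency before running the benchmark is to check the output folder for the files listed above. The following is a minimal sketch under that assumption. The class `NativeDependencyCheck` and the method `FindMissing` are hypothetical and are not part of FaceRecognitionDotNet or DlibDotNet; the file names are copied from this thread and apply to the CUDA 11.1 build only.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical helper; not part of FaceRecognitionDotNet or DlibDotNet.
public static class NativeDependencyCheck
{
    // File names taken from the list in this thread (CUDA 11.1 / cuDNN 8).
    public static readonly string[] RequiredFiles =
    {
        "cublas64_11.dll",
        "cublasLt64_11.dll",
        "cudnn_adv_infer64_8.dll",
        "cudnn_adv_train64_8.dll",
        "cudnn_cnn_infer64_8.dll",
        "cudnn_cnn_train64_8.dll",
        "cudnn_ops_infer64_8.dll",
        "cudnn_ops_train64_8.dll",
        "cudnn64_8.dll",
        "DlibDotNetNativeDnn.dll",
    };

    // Returns the required files that do not exist in the given directory.
    public static List<string> FindMissing(string directory, IEnumerable<string> requiredFiles)
    {
        return requiredFiles
            .Where(name => !File.Exists(Path.Combine(directory, name)))
            .ToList();
    }

    public static void Main()
    {
        // AppContext.BaseDirectory is the folder that contains the running application.
        string outputDir = AppContext.BaseDirectory;
        List<string> missing = FindMissing(outputDir, RequiredFiles);

        if (missing.Count == 0)
        {
            Console.WriteLine("All listed native libraries were found in " + outputDir);
            return;
        }
        foreach (string name in missing)
        {
            Console.WriteLine("Missing: " + name);
        }
    }
}
```

This only checks that the files exist next to the application. It does not prove that they load correctly or that their versions match the installed driver.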

Then

>dotnet run -c Release -- -c -m=D:\Works\OpenSource\FaceRecognitionDotNet.Models
[D:\Works\OpenSource\FaceRecognitionDotNet\examples\Benchmark\Benchmark.csproj]
Benchmarks

Timings at 240p:
 - Face locations: 0.0180s (55.56 fps)
 - Face landmarks: 0.0014s (714.29 fps)
 - Encode face (inc. landmarks): 0.0040s (250.00 fps)
 - End-to-end: 0.0226s (44.25 fps)

Timings at 480p:
Unhandled exception. System.EntryPointNotFoundException: Unable to find an entry point named 'cuda_cudaDriverGetVersion' in DLL 'DlibDotNetNativeDnn'.
   at DlibDotNet.NativeMethods.dnn_cuda_cudaDriverGetVersion(Int32& version)
   at DlibDotNet.Dnn.Cuda.ThrowCudaException(ErrorType error)
   at DlibDotNet.Dnn.LossMmod.Operator[T](IEnumerable`1 images, UInt64 batchSize)
   at FaceRecognitionDotNet.Dlib.Python.CnnFaceDetectionModelV1.Detect(LossMmod net, Image image, Int32 upsampleNumTimes)
   at FaceRecognitionDotNet.FaceRecognition.FaceLocations(Image image, Int32 numberOfTimesToUpsample, Model model)
   at Benchmark.Program.TestLocateFaces(Image image) in D:\Works\OpenSource\FaceRecognitionDotNet\examples\Benchmark\Program.cs:line 167
   at Benchmark.Program.<>c__DisplayClass3_0`1.<RunTest>b__0() in D:\Works\OpenSource\FaceRecognitionDotNet\examples\Benchmark\Program.cs:line 100
   at Benchmark.Program.<>c__DisplayClass3_0`1.<RunTest>b__1(Int32 i) in D:\Works\OpenSource\FaceRecognitionDotNet\examples\Benchmark\Program.cs:line 107
   at System.Linq.Enumerable.SelectIPartitionIterator`2.MoveNext()
   at System.Linq.Enumerable.Min(IEnumerable`1 source)
   at Benchmark.Program.RunTest[T](String path, Func`2 setup, Action`1 test, Int32 iterationsPerTest, Int32 testsToRun, Boolean useCnn) in D:\Works\OpenSource\FaceRecognitionDotNet\examples\Benchmark\Program.cs:line 107
   at Benchmark.Program.<>c__DisplayClass2_0.<Main>b__0() in D:\Works\OpenSource\FaceRecognitionDotNet\examples\Benchmark\Program.cs:line 73
   at Microsoft.Extensions.CommandLineUtils.CommandLineApplication.Execute(String[] args)
   at Benchmark.Program.Main(String[] args) in D:\Works\OpenSource\FaceRecognitionDotNet\examples\Benchmark\Program.cs:line 88

The program crashed, but CNN works. I have to fix the crash.

And in Hog mode:

>dotnet run -c Release -- -m=D:\Works\OpenSource\FaceRecognitionDotNet.Models
[D:\Works\OpenSource\FaceRecognitionDotNet\examples\Benchmark\Benchmark.csproj]
Benchmarks

Timings at 240p:
 - Face locations: 0.0482s (20.75 fps)
 - Face landmarks: 0.0014s (714.29 fps)
 - Encode face (inc. landmarks): 0.0044s (227.27 fps)
 - End-to-end: 0.0528s (18.94 fps)

Timings at 480p:

Just in case, I confirmed CUDA_PATH.

>set CUDA
CUDA_PATH=C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.2
CUDA_PATH_V10_0=C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.0
CUDA_PATH_V10_1=C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.1
CUDA_PATH_V10_2=C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2
CUDA_PATH_V11_0=C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.0
CUDA_PATH_V11_1=C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.1
CUDA_PATH_V11_2=C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.2
CUDA_PATH_V11_3=C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.3
CUDA_PATH_V11_6=C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.6
CUDA_PATH_V9_0=C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.0
CUDA_PATH_V9_1=C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.1
CUDA_PATH_V9_2=C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.2

So CUDA_PATH does not affect the behavior of the program if the proper libraries are present.
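To see the same information from inside a .NET process (the equivalent of `set CUDA` in the console), the variables can be read with `Environment.GetEnvironmentVariables()`. This is only a sketch for inspection; the class `CudaEnvironment` and the method `FilterCudaVariables` are hypothetical names.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical helper; not part of FaceRecognitionDotNet or DlibDotNet.
public static class CudaEnvironment
{
    // Keeps only the entries whose name starts with "CUDA" (case-insensitive),
    // sorted by name.
    public static SortedDictionary<string, string> FilterCudaVariables(IDictionary variables)
    {
        var result = new SortedDictionary<string, string>(StringComparer.OrdinalIgnoreCase);
        foreach (DictionaryEntry entry in variables)
        {
            string name = Convert.ToString(entry.Key);
            if (name.StartsWith("CUDA", StringComparison.OrdinalIgnoreCase))
            {
                result[name] = Convert.ToString(entry.Value);
            }
        }
        return result;
    }

    public static void Main()
    {
        // Prints every CUDA* variable visible to the current process.
        foreach (KeyValuePair<string, string> pair in
                 FilterCudaVariables(Environment.GetEnvironmentVariables()))
        {
            Console.WriteLine(pair.Key + "=" + pair.Value);
        }
    }
}
```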

Does your machine recognize the GPU?

git-Charlie commented 2 years ago

Thank you for your reply. I reinstalled the following:

cuda_11.1.0_456.43_win10.exe
511.79-desktop-win10-win11-64bit-international-dch-whql.exe
cudnn_8.3.2.44_windows.exe

Now it works normally. Compared with your previous reply, I found that I was missing 2 files: cublas64_11.dll and cublasLt64_11.dll.

After that, I get the same result as you: "Program crashed but CNN works."

However, in FaceDetection, if Hog is changed to CNN, it is still unusually slow.

Benchmarks

Timings at 240p:
 - Face locations: 0.0224s (44.64 fps)
 - Face landmarks: 0.0032s (312.50 fps)
 - Encode face (inc. landmarks): 0.0050s (200.00 fps)
 - End-to-end: 0.0274s (36.50 fps)

Timings at 480p:

Unhandled Exception: DlibDotNet.CudaException: Exception of type 'DlibDotNet.CudaException' was thrown.
   at DlibDotNet.Dnn.Cuda.ThrowCudaException(ErrorType error)
   at DlibDotNet.Dnn.LossMmod.Operator[T](IEnumerable`1 images, UInt64 batchSize)
   at FaceRecognitionDotNet.Dlib.Python.CnnFaceDetectionModelV1.Detect(LossMmod net, Image image, Int32 upsampleNumTimes)
   at FaceRecognitionDotNet.FaceRecognition.FaceLocations(Image image, Int32 numberOfTimesToUpsample, Model model)
   at Benchmark.Program.TestLocateFaces(Image image) in A:\learn\FaceRecognitionDotNet-master\examples\Benchmark\Program.cs:line 166
   at Benchmark.Program.<>c__DisplayClass3_0`1.<RunTest>b__0() in A:\learn\FaceRecognitionDotNet-master\examples\Benchmark\Program.cs:line 100
   at System.Linq.Enumerable.SelectIPartitionIterator`2.MoveNext()
   at System.Linq.Enumerable.Min(IEnumerable`1 source)
   at Benchmark.Program.RunTest[T](String path, Func`2 setup, Action`1 test, Int32 iterationsPerTest, Int32 testsToRun, Boolean useCnn) in A:\learn\FaceRecognitionDotNet-master\examples\Benchmark\Program.cs:line 107
   at Benchmark.Program.<>c__DisplayClass2_0.<Main>b__0() in A:\learn\FaceRecognitionDotNet-master\examples\Benchmark\Program.cs:line 73
   at Microsoft.Extensions.CommandLineUtils.CommandLineApplication.Execute(String[] args)
   at Benchmark.Program.Main(String[] args) in A:\learn\FaceRecognitionDotNet-master\examples\Benchmark\Program.cs:line 88

bigorange1900 commented 1 year ago

Using cuda92 in CNN mode, it takes about 1s. Can you compile a cuda115 build? Thanks!

takuya-takeuchi commented 1 year ago

@bigorange1900

Unhandled Exception: DlibDotNet.CudaException: Exception of type 'DlibDotNet.CudaException' was thrown.

Could you check the content of the CudaException? CudaException has a detailed message and an error code that help to inspect the problem.
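A minimal sketch of how the exception content could be printed is shown below. Because the exact property names of `DlibDotNet.CudaException` are not given in this thread, the sketch does not assume any of them: it lists every public property through reflection. The names `ExceptionDump` and `Describe` are hypothetical, and the `throw` in `Main` is only a placeholder for the failing CNN call.

```csharp
using System;
using System.Reflection;
using System.Text;

// Hypothetical helper; not part of FaceRecognitionDotNet or DlibDotNet.
public static class ExceptionDump
{
    // Builds a report with the exception type, its message, and all other
    // public readable instance properties discovered via reflection.
    public static string Describe(Exception ex)
    {
        var sb = new StringBuilder();
        sb.AppendLine("Type: " + ex.GetType().FullName);
        sb.AppendLine("Message: " + ex.Message);

        PropertyInfo[] properties =
            ex.GetType().GetProperties(BindingFlags.Public | BindingFlags.Instance);
        foreach (PropertyInfo p in properties)
        {
            // Skip indexers and the properties that are printed elsewhere or are noisy.
            if (!p.CanRead || p.GetIndexParameters().Length > 0) continue;
            if (p.Name == "Message" || p.Name == "StackTrace" || p.Name == "TargetSite") continue;

            object value;
            try { value = p.GetValue(ex); }
            catch (Exception e) { value = "<unreadable: " + e.GetType().Name + ">"; }
            sb.AppendLine(p.Name + ": " + (value ?? "null"));
        }
        return sb.ToString();
    }

    public static void Main()
    {
        try
        {
            // Placeholder: put the failing call here, for example the
            // FaceLocations call in CNN mode from the stack trace.
            throw new InvalidOperationException("demo failure");
        }
        catch (Exception ex)
        {
            Console.WriteLine(Describe(ex));
        }
    }
}
```

Wrapping the failing call in the benchmark with a `try`/`catch` like this should print whatever message and error code the exception carries instead of only the generic "Exception of type ... was thrown" text.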

takuya-takeuchi / FaceRecognitionDotNet, issue #198