Open filonenkoa opened 7 years ago
I don't think we used CUDA 8 or Visual Studio 2015 (we used CUDA 7.5 and VS 2013). It should work with the older versions of CUDA and VS. Looking at your output, it seems like it's loading way too many elements into VRAM at once. Maybe something is wrong with the memcpy? I'm not sure how to troubleshoot between versions of CUDA or VS, so you're on your own there. I'm also pretty rusty at CUDA, and I didn't write the guts of the code. We haven't been maintaining this project, as it was a one-off for school.
I wrote that part, but I have no idea how it works a year or so later. I don't even own an Nvidia GPU anymore, so I can't debug it. I'd suggest using some other, actually-supported image recognition software.
Thank you for your response! I will try to modify this code to make it work with the newer CUDA and VS versions. Have you published any paper related to this project, so that I can cite you if I ever finish these fixes and submit a manuscript to a conference using chunks of your code?
Let us know when you get it working! We haven't published anything, but feel free to use the code however you'd like.
I was trying to run the source code with CUDA 8.0 and Visual Studio 2015 Community, and I got errors due to enormous memory consumption. I'm using an Nvidia GTX 980 Ti (6 GB) with 32 GB of RAM.
This is the console output:
Loading images from disk. This will take a few seconds.
Loaded 5923 positive images and 54077 negative images from file (179 MB).
Finding features in positive set.
CUDA memory: Total: 6144 MB, free: 4999 MB
Computing up to 66429 images at once using 3999 MB of VRAM and 1 kernels
Using 189536 threads
Kernel 1 output sized 10867584 elements (124 MB, 3% full)
Copying buffer to host
Found 10867584 positive features
Finding features in negative set.
CUDA memory: Total: 6144 MB, free: 4999 MB
Computing up to 66429 images at once using 3999 MB of VRAM and 1 kernels
Using 1730464 threads
ERROR: Failed to run stmt cudaMemcpy(&hostFeatureIndex, deviceFeatureIndex, sizeof(uint32_t), cudaMemcpyDeviceToHost)
ERROR: Got CUDA error ... unspecified launch failure
Kernel 1 output sized 3563186288 elements (40777 MB, 1072% full)
Buffer overflow by 3231041288 features, increase FEATURES_PER_IMAGE or THRESHOLD
Resizing host buffer to 664290000 elements (7602 MB)
Copying buffer to host
ERROR: Failed to run stmt cudaMemcpy(&(finishedFeatures[numFinishedFeatures]), deviceFeatureBuffer, hostFeatureIndex * sizeof(feature), cudaMemcpyDeviceToHost)
ERROR: Got CUDA error ... unspecified launch failure
ERROR: Failed to run stmt cudaFree(deviceFeatureBuffer)
ERROR: Got CUDA error ... unspecified launch failure
ERROR: Failed to run stmt cudaFree(deviceImageBuffer)
ERROR: Got CUDA error ... unspecified launch failure
ERROR: Failed to run stmt cudaFree(deviceFeatureIndex)
ERROR: Got CUDA error ... unspecified launch failure
Found 3563186288 negative features
Computation took 6035 ms