vosen / ZLUDA

CUDA on non-NVIDIA GPUs
https://vosen.github.io/ZLUDA/
Apache License 2.0

Meshroom 2023.3.0 fails at DepthMap stage #79

Closed. harakiru closed 6 months ago

harakiru commented 9 months ago
Hardware : 
        Detected core count : 16
        OpenMP will use 16 cores
        Detected available memory : 22375 Mo

[20:48:00.928483][warning] Cannot get available memory information for CUDA gpu device 0:
         (error code: 801) cudaErrorNotSupported
[20:48:00.918372][warning] CUDA-Enabled GPU.
Device information:
        - id:                      0
        - name:                    AMD Radeon RX 6800 XT [ZLUDA]
        - compute capability:      8.8
        - clock frequency (kHz):   2575000
        - total device memory:     16368 MB 
        - device memory available: 0 MB 
        - per-block shared memory: 65536
        - warp size:               32
        - max threads per block:   1024
        - max threads per SM(X):   2048
        - max block sizes:         {1024,1024,1024}
        - max grid sizes:          {2147483647,65536,65536}
        - max 2D array texture:    {16384,16384}
        - max 3D array texture:    {16384,16384,8192}
        - max 2D linear texture:   {0,0,0}
        - max 2D layered texture:  {16384,16384,65535}
        - number of SM(x)s:        36
        - registers per SM(x):     65536
        - registers per block:     65536
        - concurrent kernels:      yes
        - mapping host memory:     yes
        - unified addressing:      yes
        - texture alignment:       256 byte
        - pitch alignment:         256 byte
CUDA build version: 11.3
[20:48:00.928550][info] Supported CUDA-Enabled GPU detected.
[20:48:00.983995][info] Found 1 image dimension(s): 
[20:48:00.984029][info]         - [4032x3024]
[20:48:00.996774][info] Overall maximum dimension: [4032x3024]
[20:48:00.996837][info] Tiling information: 
        - parameters: 
              - buffer width:  1024 px
              - buffer height: 1024 px
              - padding: 64 px
        - maximum downscale:  4
        - maximum image width:  2016 px
        - maximum image height: 1512 px
        - maximum effective tile width:  896 px
        - maximum effective tile height: 896 px
        - # tiles on X-side: 3
        - # tiles on Y-side: 2
        - effective tile width:  672 px
        - effective tile height: 756 px
        - tile list: 
           - tile (1/6) size: 736x820 px, roi: [x: 0-736, y: 0-820]
           - tile (2/6) size: 736x756 px, roi: [x: 0-736, y: 756-1512]
           - tile (3/6) size: 736x820 px, roi: [x: 672-1408, y: 0-820]
           - tile (4/6) size: 736x756 px, roi: [x: 672-1408, y: 756-1512]
           - tile (5/6) size: 672x820 px, roi: [x: 1344-2016, y: 0-820]
           - tile (6/6) size: 672x756 px, roi: [x: 1344-2016, y: 756-1512]
[20:48:00.996860][info] SGM parameters:
        - scale: 2
        - stepXY: 2
[20:48:00.996880][info] Refine parameters:
        - scale: 1
        - stepXY: 1
[20:48:00.996903][info] Number of GPU devices: 1, number of CPU threads: 16
================================================================================
====================== Command line failed with an error =======================
Could not allocate pinned host memory in /opt/AliceVision_git/src/aliceVision/depthMap/cuda/host/memory.hpp:395, operation not supported: operation not supported
================================================================================

Relevant line of code: https://github.com/alicevision/AliceVision/blob/develop/src/aliceVision/depthMap/cuda/host/memory.hpp
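
To narrow this down outside Meshroom, a small standalone probe can show which runtime calls are rejected. The sketch below is only that: it assumes the "Cannot get available memory information" warning comes from `cudaMemGetInfo` and that the failing allocation goes through `cudaMallocHost` or `cudaHostAlloc`, neither of which is confirmed in this thread. The file name and buffer size are arbitrary.

```cuda
// probe_host_mem.cu (hypothetical file name)
// Build: nvcc probe_host_mem.cu -o probe_host_mem
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Print the result of one runtime call; return true on success.
static bool report(const char* call, cudaError_t err)
{
    std::printf("%-16s -> %3d %s (%s)\n", call, static_cast<int>(err),
                cudaGetErrorName(err), cudaGetErrorString(err));
    if (err != cudaSuccess)
        cudaGetLastError();  // reset the error state before the next call
    return err == cudaSuccess;
}

int main()
{
    if (!report("cudaSetDevice", cudaSetDevice(0)))
        return EXIT_FAILURE;

    // Assumed source of "Cannot get available memory information".
    size_t freeBytes = 0, totalBytes = 0;
    if (report("cudaMemGetInfo", cudaMemGetInfo(&freeBytes, &totalBytes)))
        std::printf("  free %zu MB of %zu MB\n", freeBytes >> 20, totalBytes >> 20);

    const size_t bytes = 64u << 20;  // 64 MiB, arbitrary test size

    // Assumed candidates for "Could not allocate pinned host memory".
    void* pinned = nullptr;
    if (report("cudaMallocHost", cudaMallocHost(&pinned, bytes)))
        cudaFreeHost(pinned);

    pinned = nullptr;
    if (report("cudaHostAlloc", cudaHostAlloc(&pinned, bytes, cudaHostAllocDefault)))
        cudaFreeHost(pinned);

    return EXIT_SUCCESS;
}
```

Running the binary with the ZLUDA directory on `LD_LIBRARY_PATH` should print one line per call. Any line with a non-zero code (for example 801, `cudaErrorNotSupported`) points at the call worth reporting.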

natowi commented 9 months ago

Does ZLUDA work when using the GPU in FeatureExtraction? (Enable sift and disable the advanced setting "force cpu".) Is the same error thrown?

harakiru commented 9 months ago

I get a different error this time:

Choosing device 0: AMD Radeon RX 6800 XT [ZLUDA]
terminate called after throwing an instance of 'std::runtime_error'
  what():  /tmp/AliceVisionDeps_build/popsift/src/popsift/gauss_filter.cu:245
    cudaMemcpyToSymbol failed for Gauss kernel initialization: operation not supported

However, FeatureExtraction can be done on the CPU, and on a powerful system speed is not much of an issue. The real blocker is the DepthMap node, which has to use CUDA.

the-mush commented 9 months ago

just tried it out and I have the exact same problem, also with a 6800XT :(

20943204920434 commented 9 months ago

It looks like the function cudaMemcpyToSymbol() is used here, and apparently ZLUDA doesn't support it. As a workaround I can suggest a patch that replaces it with cudaMemcpy(), which is similar according to the linked reference. The underlying problem is that we don't have a clear API compatibility document, so we can't tell whether a given function is supported or not.

P.S. I don't know even basic CUDA, so I may be wrong sometimes; feel free to correct me.
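
One caveat on the suggested substitution: cudaMemcpy() needs a device pointer, so it cannot take a `__constant__` symbol directly. The usual route is to resolve the symbol's address with `cudaGetSymbolAddress` first. The sketch below illustrates that pattern under stated assumptions: `d_gauss`, `kTaps`, `uploadGauss` and `readBack` are hypothetical names, not popsift code, and nothing in this thread says whether ZLUDA supports `cudaGetSymbolAddress` either.

```cuda
// symbol_copy.cu (hypothetical file name)
// Build: nvcc symbol_copy.cu -o symbol_copy
#include <cstdio>
#include <cuda_runtime.h>

constexpr int kTaps = 16;  // illustrative size only

// Hypothetical stand-in for a constant-memory Gauss table.
__constant__ float d_gauss[kTaps];

// Upload without cudaMemcpyToSymbol: resolve the symbol's device address,
// then use an ordinary host-to-device cudaMemcpy.
static cudaError_t uploadGauss(const float* host)
{
    void* devPtr = nullptr;
    cudaError_t err = cudaGetSymbolAddress(&devPtr, d_gauss);
    if (err != cudaSuccess)
        return err;
    return cudaMemcpy(devPtr, host, kTaps * sizeof(float), cudaMemcpyHostToDevice);
}

// Copy the constant table to global memory so the host can check it.
__global__ void readBack(float* out)
{
    out[threadIdx.x] = d_gauss[threadIdx.x];
}

int main()
{
    float host[kTaps], check[kTaps];
    for (int i = 0; i < kTaps; ++i)
        host[i] = 0.5f * i;

    cudaError_t err = uploadGauss(host);
    if (err != cudaSuccess) {
        std::printf("upload failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    float* d_out = nullptr;
    err = cudaMalloc(&d_out, sizeof(check));
    if (err != cudaSuccess) {
        std::printf("cudaMalloc failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    readBack<<<1, kTaps>>>(d_out);
    // This blocking copy also surfaces any error from the kernel launch.
    err = cudaMemcpy(check, d_out, sizeof(check), cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    if (err != cudaSuccess) {
        std::printf("read-back failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    for (int i = 0; i < kTaps; ++i) {
        if (check[i] != host[i]) {
            std::printf("mismatch at index %d\n", i);
            return 1;
        }
    }
    std::printf("constant table uploaded and verified\n");
    return 0;
}
```

If the upload step fails under ZLUDA with the same "operation not supported" error, the gap is probably in symbol handling in general, not in cudaMemcpyToSymbol() alone, and a patch along these lines would not help.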

LeMoonStar commented 9 months ago

I can reproduce the issue using a Radeon RX 7700 XT (with an AMD Ryzen 7600, on Fedora Linux 39).

I also noticed that the first warning is slightly different for me: "(error code: 999) cudaErrorUnknown" instead of the "(error code: 801) cudaErrorNotSupported" from the original issue.

$ LD_LIBRARY_PATH='/home/user/Downloads/zluda-3-linux/zluda' ROCR_VISIBLE_DEVICES="GPU-****" '/home/user/Downloads/Meshroom-2023.3.0-linux/Meshroom-2023.3.0/Meshroom'

[...]

Hardware : 
    Detected core count : 12
    OpenMP will use 12 cores
    Detected available memory : 22386 Mo

[18:49:20.033367][warning] Cannot get available memory information for CUDA gpu device 0:
     (error code: 999) cudaErrorUnknown
[18:49:19.830787][warning] CUDA-Enabled GPU.
Device information:
    - id:                      0
    - name:                    AMD Radeon RX 7700 XT [ZLUDA]
    - compute capability:      8.8
    - clock frequency (kHz):   2276000
    - total device memory:     12272 MB 
    - device memory available: 0 MB 
    - per-block shared memory: 65536
    - warp size:               32
    - max threads per block:   1024
    - max threads per SM(X):   2048
    - max block sizes:         {1024,1024,1024}
    - max grid sizes:          {2147483647,65536,65536}
    - max 2D array texture:    {16384,16384}
    - max 3D array texture:    {16384,16384,8192}
    - max 2D linear texture:   {0,0,0}
    - max 2D layered texture:  {16384,16384,65535}
    - number of SM(x)s:        27
    - registers per SM(x):     65536
    - registers per block:     65536
    - concurrent kernels:      yes
    - mapping host memory:     yes
    - unified addressing:      yes
    - texture alignment:       256 byte
    - pitch alignment:         256 byte
CUDA build version: 11.3
[18:49:20.033493][info] Supported CUDA-Enabled GPU detected.
[18:49:20.062344][info] Found 1 image dimension(s): 
[18:49:20.062372][info]     - [4032x3024]
[18:49:20.072950][info] Overall maximum dimension: [4032x3024]
[18:49:20.072990][info] Tiling information: 
    - parameters: 
          - buffer width:  1024 px
          - buffer height: 1024 px
          - padding: 64 px
    - maximum downscale:  4
    - maximum image width:  2016 px
    - maximum image height: 1512 px
    - maximum effective tile width:  896 px
    - maximum effective tile height: 896 px
    - # tiles on X-side: 3
    - # tiles on Y-side: 2
    - effective tile width:  672 px
    - effective tile height: 756 px
    - tile list: 
       - tile (1/6) size: 736x820 px, roi: [x: 0-736, y: 0-820]
       - tile (2/6) size: 736x756 px, roi: [x: 0-736, y: 756-1512]
       - tile (3/6) size: 736x820 px, roi: [x: 672-1408, y: 0-820]
       - tile (4/6) size: 736x756 px, roi: [x: 672-1408, y: 756-1512]
       - tile (5/6) size: 672x820 px, roi: [x: 1344-2016, y: 0-820]
       - tile (6/6) size: 672x756 px, roi: [x: 1344-2016, y: 756-1512]
[18:49:20.073007][info] SGM parameters:
    - scale: 2
    - stepXY: 2
[18:49:20.073025][info] Refine parameters:
    - scale: 1
    - stepXY: 1
[18:49:20.073049][info] Number of GPU devices: 1, number of CPU threads: 12
================================================================================
====================== Command line failed with an error =======================
Could not allocate pinned host memory in /opt/AliceVision_git/src/aliceVision/depthMap/cuda/host/memory.hpp:395, unknown error: unknown error
================================================================================

EDIT: The final error message is different as well. I get "unknown error: unknown error", not the "operation not supported: operation not supported" shown in the original issue.

20943204920434 commented 8 months ago

I've started collecting troubleshooting info:

AMD_LOG_LEVEL=3 ``` kirill@kirill-pc ~/Documents/TEMP/Meshroom-2023.3.0 $ AMD_LOG_LEVEL=3 LD_LIBRARY_PATH="/home/kirill/Documents/TEMP/zluda:$LD_LIBRARY_PATH" ./Meshroom [2024-02-23 18:39:19.626237] [0x00007f4895352740] [trace] Embedded OCIO configuration file: '/home/kirill/Documents/TEMP/Meshroom-2023.3.0/aliceVision/share/aliceVision/config.ocio' found. [QtAliceVisionImageIO] Plugin Initialized [QtAliceVision] Plugin Initialized [2024-02-23 18:39:24.664688] [0x00007f6ddeebb000] [trace] Embedded OCIO configuration file: '/home/kirill/Documents/TEMP/Meshroom-2023.3.0/aliceVision/share/aliceVision/config.ocio' found. Program called with the following parameters: * allowSingleView = 1 * allowedCameraModels = "pinhole,radial1,radial3,brown,fisheye4,fisheye1,3deanamorphic4,3deradial4,3declassicld" * colorProfileDatabase = "" (default) * defaultCameraModel = "" (default) * defaultFieldOfView = 45 * defaultFocalLength = -1 (default) * defaultFocalRatio = 1 (default) * defaultOffsetX = 0 (default) * defaultOffsetY = 0 (default) * errorOnMissingColorProfile = 1 (default) * groupCameraFallback = Unknown Type "20EGroupCameraFallback" * imageFolder = "" (default) * input = "/tmp/tmpqc720hjp/CameraInit/961e54591174ec5a2457c66da8eadc0cb03d89ba/viewpoints.sfm" * lensCorrectionProfileInfo = "" * lensCorrectionProfileSearchIgnoreCameraModel = 1 * maxCoresAvailable = Unknown Type "j" (default) * maxMemoryAvailable = 18446744073709551615 (default) * output = "/tmp/tmpqc720hjp/CameraInit/961e54591174ec5a2457c66da8eadc0cb03d89ba/cameraInit.sfm" * rawColorInterpretation = Unknown Type "N11aliceVision5image23ERawColorInterpretationE" * sensorDatabase = "/home/kirill/Documents/TEMP/Meshroom-2023.3.0/aliceVision/share/aliceVision/cameraSensors.db" * verboseLevel = "info" * viewIdMethod = Unknown Type "N11aliceVision9sfmDataIO13EViewIdMethodE" * viewIdRegex = ".*?(\d+)" (default) Hardware : Detected core count : 16 OpenMP will use 16 cores Detected available memory : 22907 Mo 
[18:39:24.679150][warning] Some image(s) have no serial number to identify the camera/lens device. This makes it impossible to correctly group the images by device if you have used multiple identical (same model) camera devices. The reconstruction will assume that only one device has been used, so if 2 images share the same focal length approximation they will share the same internal camera parameters. 6 image(s) are concerned. [18:39:24.680864][info] CameraInit report: - # Views: 6 - # with focal length initialization (from metadata): 6 - # without metadata: 0 - # with DCP color calibration (raw images only): 0 - # with LCP lens distortion initialization: 0 - # with LCP vignetting calibration: 0 - # with LCP chromatic aberration correction models: 0 - # Cameras Intrinsics: 1 - commandLine: aliceVision_cameraInit --sensorDatabase "/home/kirill/Documents/TEMP/Meshroom-2023.3.0/aliceVision/share/aliceVision/cameraSensors.db" --lensCorrectionProfileInfo "${ALICEVISION_LENS_PROFILE_INFO}" --lensCorrectionProfileSearchIgnoreCameraModel True --defaultFieldOfView 45.0 --groupCameraFallback folder --allowedCameraModels pinhole,radial1,radial3,brown,fisheye4,fisheye1,3deanamorphic4,3deradial4,3declassicld --rawColorInterpretation LibRawWhiteBalancing --viewIdMethod metadata --verboseLevel info --output "/tmp/MeshroomCache/CameraInit/bb428f7e3a6bd00539b708742cebc70ef88e0959/cameraInit.sfm" --allowSingleView 1 --input "/tmp/MeshroomCache/CameraInit/bb428f7e3a6bd00539b708742cebc70ef88e0959/viewpoints.sfm" - logFile: /tmp/MeshroomCache/CameraInit/bb428f7e3a6bd00539b708742cebc70ef88e0959/log - commandLine: aliceVision_featureExtraction --input "/tmp/MeshroomCache/CameraInit/bb428f7e3a6bd00539b708742cebc70ef88e0959/cameraInit.sfm" --masksFolder "" --maskExtension png --maskInvert False --describerTypes dspsift --describerPreset normal --describerQuality normal --contrastFiltering GridSort --gridFiltering True --workingColorSpace sRGB --forceCpuExtraction True --maxThreads 0 
--verboseLevel info --output "/tmp/MeshroomCache/FeatureExtraction/e88d80e29c967eb92d9bba3a2455bdd3a394abcf" --rangeStart 0 --rangeSize 40 - logFile: /tmp/MeshroomCache/FeatureExtraction/e88d80e29c967eb92d9bba3a2455bdd3a394abcf/0.log - commandLine: aliceVision_imageMatching --input "/tmp/MeshroomCache/CameraInit/bb428f7e3a6bd00539b708742cebc70ef88e0959/cameraInit.sfm" --featuresFolders "/tmp/MeshroomCache/FeatureExtraction/e88d80e29c967eb92d9bba3a2455bdd3a394abcf" --method SequentialAndVocabularyTree --tree "/home/kirill/Documents/TEMP/Meshroom-2023.3.0/aliceVision/share/aliceVision/vlfeat_K80L3.SIFT.tree" --weights "" --minNbImages 200 --maxDescriptors 500 --nbMatches 40 --nbNeighbors 5 --verboseLevel info --output "/tmp/MeshroomCache/ImageMatching/dd292711c8a1da3f82609a0570c4dba7fc4375e3/imageMatches.txt" - logFile: /tmp/MeshroomCache/ImageMatching/dd292711c8a1da3f82609a0570c4dba7fc4375e3/log - commandLine: aliceVision_featureMatching --input "/tmp/MeshroomCache/CameraInit/bb428f7e3a6bd00539b708742cebc70ef88e0959/cameraInit.sfm" --featuresFolders "/tmp/MeshroomCache/FeatureExtraction/e88d80e29c967eb92d9bba3a2455bdd3a394abcf" --imagePairsList "/tmp/MeshroomCache/ImageMatching/dd292711c8a1da3f82609a0570c4dba7fc4375e3/imageMatches.txt" --describerTypes dspsift --photometricMatchingMethod ANN_L2 --geometricEstimator acransac --geometricFilterType fundamental_matrix --distanceRatio 0.8 --maxIteration 2048 --geometricError 0.0 --knownPosesGeometricErrorMax 5.0 --minRequired2DMotion -1.0 --maxMatches 0 --savePutativeMatches False --crossMatching False --guidedMatching False --matchFromKnownCameraPoses False --exportDebugFiles False --verboseLevel info --output "/tmp/MeshroomCache/FeatureMatching/6ed03524dc412787370124dfaf4cad34335ca9b5" --rangeStart 0 --rangeSize 20 - logFile: /tmp/MeshroomCache/FeatureMatching/6ed03524dc412787370124dfaf4cad34335ca9b5/0.log - commandLine: aliceVision_incrementalSfM --input 
"/tmp/MeshroomCache/CameraInit/bb428f7e3a6bd00539b708742cebc70ef88e0959/cameraInit.sfm" --featuresFolders "/tmp/MeshroomCache/FeatureExtraction/e88d80e29c967eb92d9bba3a2455bdd3a394abcf" --matchesFolders "/tmp/MeshroomCache/FeatureMatching/6ed03524dc412787370124dfaf4cad34335ca9b5" --describerTypes dspsift --localizerEstimator acransac --observationConstraint Scale --localizerEstimatorMaxIterations 4096 --localizerEstimatorError 0.0 --lockScenePreviouslyReconstructed False --useLocalBA True --localBAGraphDistance 1 --nbFirstUnstableCameras 30 --maxImagesPerGroup 30 --bundleAdjustmentMaxOutliers 50 --maxNumberOfMatches 0 --minNumberOfMatches 0 --minInputTrackLength 2 --minNumberOfObservationsForTriangulation 2 --minAngleForTriangulation 3.0 --minAngleForLandmark 2.0 --maxReprojectionError 4.0 --minAngleInitialPair 5.0 --maxAngleInitialPair 40.0 --useOnlyMatchesFromInputFolder False --useRigConstraint True --rigMinNbCamerasForCalibration 20 --lockAllIntrinsics False --minNbCamerasToRefinePrincipalPoint 3 --filterTrackForks False --computeStructureColor True --useAutoTransform True --initialPairA "" --initialPairB "" --interFileExtension .abc --logIntermediateSteps False --verboseLevel info --output "/tmp/MeshroomCache/StructureFromMotion/ce258bfc0848af1364d5fff36cee60cb9f205072/sfm.abc" --outputViewsAndPoses "/tmp/MeshroomCache/StructureFromMotion/ce258bfc0848af1364d5fff36cee60cb9f205072/cameras.sfm" --extraInfoFolder "/tmp/MeshroomCache/StructureFromMotion/ce258bfc0848af1364d5fff36cee60cb9f205072" - logFile: /tmp/MeshroomCache/StructureFromMotion/ce258bfc0848af1364d5fff36cee60cb9f205072/log - commandLine: aliceVision_prepareDenseScene --input "/tmp/MeshroomCache/StructureFromMotion/ce258bfc0848af1364d5fff36cee60cb9f205072/sfm.abc" --maskExtension png --outputFileType exr --saveMetadata True --saveMatricesTxtFiles False --evCorrection False --verboseLevel info --output "/tmp/MeshroomCache/PrepareDenseScene/af9685e62368e56d724e93bd5361f942245bafd6" --rangeStart 0 
--rangeSize 40 - logFile: /tmp/MeshroomCache/PrepareDenseScene/af9685e62368e56d724e93bd5361f942245bafd6/0.log - commandLine: aliceVision_depthMapEstimation --input "/tmp/MeshroomCache/StructureFromMotion/ce258bfc0848af1364d5fff36cee60cb9f205072/sfm.abc" --imagesFolder "/tmp/MeshroomCache/PrepareDenseScene/af9685e62368e56d724e93bd5361f942245bafd6" --downscale 2 --minViewAngle 2.0 --maxViewAngle 70.0 --tileBufferWidth 1024 --tileBufferHeight 1024 --tilePadding 64 --autoAdjustSmallImage True --chooseTCamsPerTile True --maxTCams 10 --sgmScale 2 --sgmStepXY 2 --sgmStepZ -1 --sgmMaxTCamsPerTile 4 --sgmWSH 4 --sgmUseSfmSeeds True --sgmSeedsRangeInflate 0.2 --sgmDepthThicknessInflate 0.0 --sgmMaxSimilarity 1.0 --sgmGammaC 5.5 --sgmGammaP 8.0 --sgmP1 10.0 --sgmP2Weighting 100.0 --sgmMaxDepths 1500 --sgmFilteringAxes "YX" --sgmDepthListPerTile True --sgmUseConsistentScale False --refineEnabled True --refineScale 1 --refineStepXY 1 --refineMaxTCamsPerTile 4 --refineSubsampling 10 --refineHalfNbDepths 15 --refineWSH 3 --refineSigma 15.0 --refineGammaC 15.5 --refineGammaP 8.0 --refineInterpolateMiddleDepth False --refineUseConsistentScale False --colorOptimizationEnabled True --colorOptimizationNbIterations 100 --sgmUseCustomPatchPattern False --refineUseCustomPatchPattern False --exportIntermediateDepthSimMaps False --exportIntermediateNormalMaps False --exportIntermediateVolumes False --exportIntermediateCrossVolumes False --exportIntermediateTopographicCutVolumes False --exportIntermediateVolume9pCsv False --exportTilePattern False --nbGPUs 0 --verboseLevel info --output "/tmp/MeshroomCache/DepthMap/67e0b9f51fd12a460eba61348519fee9e7d54a9f" --rangeStart 0 --rangeSize 12 - logFile: /tmp/MeshroomCache/DepthMap/67e0b9f51fd12a460eba61348519fee9e7d54a9f/0.log ERROR:root:Error on node computation: Error on node "DepthMap_1(0)": Log: [2024-02-23 18:39:52.392484] [0x00007fd902063000] [trace] Embedded OCIO configuration file: 
'/home/kirill/Documents/TEMP/Meshroom-2023.3.0/aliceVision/share/aliceVision/config.ocio' found. Program called with the following parameters: * autoAdjustSmallImage = 1 * chooseTCamsPerTile = 1 * colorOptimizationEnabled = 1 * colorOptimizationNbIterations = 100 * customPatchPatternGroupSubpartsPerLevel = 176 (default) * customPatchPatternSubparts = Unknown Type "St6vectorIN11aliceVision8depthMap24CustomPatchPatternParams13SubpartParamsESaIS3_EE" (default) * downscale = 2 * exportIntermediateCrossVolumes = 0 * exportIntermediateDepthSimMaps = 0 * exportIntermediateNormalMaps = 0 * exportIntermediateTopographicCutVolumes = 0 * exportIntermediateVolume9pCsv = 0 * exportIntermediateVolumes = 0 * exportTilePattern = 0 * imagesFolder = "/tmp/MeshroomCache/PrepareDenseScene/af9685e62368e56d724e93bd5361f942245bafd6" * input = "/tmp/MeshroomCache/StructureFromMotion/ce258bfc0848af1364d5fff36cee60cb9f205072/sfm.abc" * maxCoresAvailable = Unknown Type "j" (default) * maxMemoryAvailable = 18446744073709551615 (default) * maxTCams = 10 * maxViewAngle = 70 * minViewAngle = 2 * nbGPUs = 0 * output = "/tmp/MeshroomCache/DepthMap/67e0b9f51fd12a460eba61348519fee9e7d54a9f" * rangeSize = 12 * rangeStart = 0 * refineEnabled = 1 * refineGammaC = 15.5 * refineGammaP = 8 * refineHalfNbDepths = 15 * refineInterpolateMiddleDepth = 0 * refineMaxTCamsPerTile = 4 * refineScale = 1 * refineSigma = 15 * refineStepXY = 1 * refineSubsampling = 10 * refineUseConsistentScale = 0 * refineUseCustomPatchPattern = 0 * refineWSH = 3 * sgmDepthListPerTile = 1 * sgmDepthThicknessInflate = 0 * sgmFilteringAxes = "YX" * sgmGammaC = 5.5 * sgmGammaP = 8 * sgmMaxDepths = 1500 * sgmMaxSimilarity = 1 * sgmMaxTCamsPerTile = 4 * sgmP1 = 10 * sgmP2Weighting = 100 * sgmScale = 2 * sgmSeedsRangeInflate = 0.2 * sgmStepXY = 2 * sgmStepZ = -1 * sgmUseConsistentScale = 0 * sgmUseCustomPatchPattern = 0 * sgmUseSfmSeeds = 1 * sgmWSH = 4 * tileBufferHeight = 1024 * tileBufferWidth = 1024 * tilePadding = 64 * verboseLevel = 
"info" Hardware : Detected core count : 16 OpenMP will use 16 cores Detected available memory : 22705 Mo :3:rocdevice.cpp :442 : 12646352842 us: [pid:22097 tid:0x7fd902063000] Initializing HSA stack. :3:comgrctx.cpp :33 : 12646362663 us: [pid:22097 tid:0x7fd902063000] Loading COMGR library. :3:rocdevice.cpp :208 : 12646362767 us: [pid:22097 tid:0x7fd902063000] Numa selects cpu agent[0]=0x3353ed0(fine=0x33d5850,coarse=0x180b860) for gpu agent=0x334a3f0 CPU<->GPU XGMI=0 :3:rocdevice.cpp :1680: 12646362909 us: [pid:22097 tid:0x7fd902063000] Gfx Major/Minor/Stepping: 11/0/0 :3:rocdevice.cpp :1682: 12646362918 us: [pid:22097 tid:0x7fd902063000] HMM support: 1, XNACK: 0, Direct host access: 0 :3:rocdevice.cpp :1684: 12646362927 us: [pid:22097 tid:0x7fd902063000] Max SDMA Read Mask: 0x0, Max SDMA Write Mask: 0x0 :3:hip_context.cpp :48 : 12646364336 us: [pid:22097 tid:0x7fd902063000] Direct Dispatch: 1 :3:hip_context.cpp :153 : 12646364361 us: [pid:22097 tid:0x7fd902063000] hipInit ( 0 ) :3:hip_context.cpp :159 : 12646364367 us: [pid:22097 tid:0x7fd902063000] hipInit: Returned hipSuccess : :3:hip_device_runtime.cpp :546 : 12646364374 us: [pid:22097 tid:0x7fd902063000] hipGetDeviceCount ( 0x7ffd3850bbac ) :3:hip_device_runtime.cpp :548 : 12646364380 us: [pid:22097 tid:0x7fd902063000] hipGetDeviceCount: Returned hipSuccess : :3:hip_device.cpp :381 : 12646364388 us: [pid:22097 tid:0x7fd902063000] hipGetDeviceProperties ( 0x7ffd3850b458, 0 ) :3:hip_device.cpp :383 : 12646364396 us: [pid:22097 tid:0x7fd902063000] hipGetDeviceProperties: Returned hipSuccess : :3:hip_device_runtime.cpp :141 : 12646364405 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute ( 0x7ffd3850b7ac, 87, 0 ) :3:hip_device_runtime.cpp :351 : 12646364412 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute: Returned hipSuccess : :3:hip_device_runtime.cpp :546 : 12646364929 us: [pid:22097 tid:0x7fd902063000] hipGetDeviceCount ( 0x336b6c0 ) :3:hip_device_runtime.cpp :548 : 12646364936 us: [pid:22097 
tid:0x7fd902063000] hipGetDeviceCount: Returned hipSuccess : :3:hip_device.cpp :169 : 12646364942 us: [pid:22097 tid:0x7fd902063000] hipDeviceGet ( 0x7ffd3850be48, 0 ) :3:hip_device.cpp :171 : 12646364946 us: [pid:22097 tid:0x7fd902063000] hipDeviceGet: Returned hipSuccess : :3:hip_device.cpp :237 : 12646364951 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetName ( 0x3352928, 256, 0 ) :3:hip_device.cpp :257 : 12646364958 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetName: Returned hipSuccess : :3:hip_device.cpp :176 : 12646364963 us: [pid:22097 tid:0x7fd902063000] hipDeviceTotalMem ( 0x3352a48, 0 ) :3:hip_device.cpp :191 : 12646364966 us: [pid:22097 tid:0x7fd902063000] hipDeviceTotalMem: Returned hipSuccess : :3:hip_device_runtime.cpp :141 : 12646364971 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute ( 0x3352aac, 63, 0 ) :3:hip_device_runtime.cpp :351 : 12646364978 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute: Returned hipSuccess : :3:hip_device_runtime.cpp :141 : 12646364982 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute ( 0x3352ab0, 18, 0 ) :3:hip_device_runtime.cpp :351 : 12646364989 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute: Returned hipSuccess : :3:hip_device_runtime.cpp :141 : 12646364993 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute ( 0x3352ab4, 16, 0 ) :3:hip_device_runtime.cpp :351 : 12646364999 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute: Returned hipSuccess : :3:hip_device_runtime.cpp :141 : 12646365004 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute ( 0x3352ab8, 3, 0 ) :3:hip_device_runtime.cpp :351 : 12646365010 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute: Returned hipSuccess : :3:hip_device_runtime.cpp :141 : 12646365014 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute ( 0x3352ac0, 39, 0 ) :3:hip_device_runtime.cpp :351 : 12646365021 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute: Returned hipSuccess : :3:hip_device_runtime.cpp :141 : 
12646365025 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute ( 0x3352ac4, 39, 0 ) :3:hip_device_runtime.cpp :351 : 12646365031 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute: Returned hipSuccess : :3:hip_device.cpp :381 : 12646365036 us: [pid:22097 tid:0x7fd902063000] hipGetDeviceProperties ( 0x7ffd3850bad0, 0 ) :3:hip_device.cpp :383 : 12646365043 us: [pid:22097 tid:0x7fd902063000] hipGetDeviceProperties: Returned hipSuccess : :3:hip_device_runtime.cpp :141 : 12646365050 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute ( 0x3352acc, 43, 0 ) :3:hip_device_runtime.cpp :351 : 12646365056 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute: Returned hipSuccess : :3:hip_device_runtime.cpp :141 : 12646365060 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute ( 0x3352ad0, 44, 0 ) :3:hip_device_runtime.cpp :351 : 12646365067 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute: Returned hipSuccess : :3:hip_device_runtime.cpp :141 : 12646365071 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute ( 0x3352ad4, 43, 0 ) :3:hip_device_runtime.cpp :351 : 12646365077 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute: Returned hipSuccess : :3:hip_device_runtime.cpp :141 : 12646365081 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute ( 0x3352ad8, 44, 0 ) :3:hip_device_runtime.cpp :351 : 12646365088 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute: Returned hipSuccess : :3:hip_device_runtime.cpp :141 : 12646365092 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute ( 0x3352af0, 49, 0 ) :3:hip_device_runtime.cpp :351 : 12646365104 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute: Returned hipSuccess : :3:hip_device_runtime.cpp :141 : 12646365108 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute ( 0x3352af4, 50, 0 ) :3:hip_device_runtime.cpp :351 : 12646365115 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute: Returned hipSuccess : :3:hip_device_runtime.cpp :141 : 12646365119 us: [pid:22097 
tid:0x7fd902063000] hipDeviceGetAttribute ( 0x3352af8, 51, 0 ) :3:hip_device_runtime.cpp :351 : 12646365125 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute: Returned hipSuccess : :3:hip_device_runtime.cpp :141 : 12646365130 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute ( 0x3352b0c, 39, 0 ) :3:hip_device_runtime.cpp :351 : 12646365136 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute: Returned hipSuccess : :3:hip_device_runtime.cpp :141 : 12646365140 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute ( 0x3352b14, 43, 0 ) :3:hip_device_runtime.cpp :351 : 12646365146 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute: Returned hipSuccess : :3:hip_device_runtime.cpp :141 : 12646365150 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute ( 0x3352b18, 44, 0 ) :3:hip_device_runtime.cpp :351 : 12646365157 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute: Returned hipSuccess : :3:hip_device_runtime.cpp :141 : 12646365161 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute ( 0x3352b28, 39, 0 ) :3:hip_device_runtime.cpp :351 : 12646365167 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute: Returned hipSuccess : :3:hip_device_runtime.cpp :141 : 12646365171 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute ( 0x3352b2c, 43, 0 ) :3:hip_device_runtime.cpp :351 : 12646365178 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute: Returned hipSuccess : :3:hip_device_runtime.cpp :141 : 12646365182 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute ( 0x3352b30, 44, 0 ) :3:hip_device_runtime.cpp :351 : 12646365188 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute: Returned hipSuccess : :3:hip_device_runtime.cpp :141 : 12646365192 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute ( 0x3352b34, 49, 0 ) :3:hip_device_runtime.cpp :351 : 12646365199 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute: Returned hipSuccess : :3:hip_device_runtime.cpp :141 : 12646365203 us: [pid:22097 
tid:0x7fd902063000] hipDeviceGetAttribute ( 0x3352b38, 50, 0 ) :3:hip_device_runtime.cpp :351 : 12646365209 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute: Returned hipSuccess : :3:hip_device_runtime.cpp :141 : 12646365213 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute ( 0x3352b3c, 51, 0 ) :3:hip_device_runtime.cpp :351 : 12646365220 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute: Returned hipSuccess : :3:hip_device_runtime.cpp :141 : 12646365224 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute ( 0x3352b40, 39, 0 ) :3:hip_device_runtime.cpp :351 : 12646365230 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute: Returned hipSuccess : :3:hip_device_runtime.cpp :141 : 12646365234 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute ( 0x3352b48, 43, 0 ) :3:hip_device_runtime.cpp :351 : 12646365241 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute: Returned hipSuccess : :3:hip_device_runtime.cpp :141 : 12646365245 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute ( 0x3352b4c, 44, 0 ) :3:hip_device_runtime.cpp :351 : 12646365251 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute: Returned hipSuccess : :3:hip_device_runtime.cpp :141 : 12646365255 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute ( 0x3352b68, 8, 0 ) :3:hip_device_runtime.cpp :351 : 12646365262 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute: Returned hipSuccess : :3:hip_device_runtime.cpp :141 : 12646365270 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute ( 0x3352b6c, 0, 0 ) :3:hip_device_runtime.cpp :351 : 12646365276 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute: Returned hipSuccess : :3:hip_device_runtime.cpp :141 : 12646365281 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute ( 0x3352b70, 67, 0 ) :3:hip_device_runtime.cpp :351 : 12646365287 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute: Returned hipSuccess : :3:hip_device_runtime.cpp :141 : 12646365291 us: [pid:22097 
tid:0x7fd902063000] hipDeviceGetAttribute ( 0x3352b74, 68, 0 ) :3:hip_device_runtime.cpp :351 : 12646365298 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute: Returned hipSuccess : :3:hip_device.cpp :381 : 12646365302 us: [pid:22097 tid:0x7fd902063000] hipGetDeviceProperties ( 0x7ffd3850bad0, 0 ) :3:hip_device.cpp :383 : 12646365308 us: [pid:22097 tid:0x7fd902063000] hipGetDeviceProperties: Returned hipSuccess : :3:hip_device_runtime.cpp :141 : 12646365315 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute ( 0x3352b88, 60, 0 ) :3:hip_device_runtime.cpp :351 : 12646365321 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute: Returned hipSuccess : :3:hip_device_runtime.cpp :141 : 12646365325 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute ( 0x3352b8c, 59, 0 ) :3:hip_device_runtime.cpp :351 : 12646365332 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute: Returned hipSuccess : :3:hip_device_runtime.cpp :141 : 12646365336 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute ( 0x3352b90, 19, 0 ) :3:hip_device_runtime.cpp :351 : 12646365342 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute: Returned hipSuccess : :3:hip_device_runtime.cpp :141 : 12646365347 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute ( 0x3352b98, 57, 0 ) :3:hip_device_runtime.cpp :351 : 12646365353 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute: Returned hipSuccess : :3:hip_device_runtime.cpp :141 : 12646365357 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute ( 0x7ffd3850be4c, 81, 0 ) :3:hip_device_runtime.cpp :351 : 12646365364 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute: Returned hipSuccess : :3:hip_device_runtime.cpp :141 : 12646365368 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute ( 0x7ffd3850be50, 82, 0 ) :3:hip_device_runtime.cpp :351 : 12646365374 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute: Returned hipSuccess : :3:hip_device_runtime.cpp :141 : 12646365378 us: [pid:22097 
tid:0x7fd902063000] hipDeviceGetAttribute ( 0x7ffd3850be54, 74, 0 ) :3:hip_device_runtime.cpp :351 : 12646365385 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute: Returned hipSuccess : :3:hip_device_runtime.cpp :141 : 12646365389 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute ( 0x7ffd3850be58, 74, 0 ) :3:hip_device_runtime.cpp :351 : 12646365395 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute: Returned hipSuccess : :3:hip_device_runtime.cpp :141 : 12646365399 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute ( 0x7ffd3850be5c, 10002, 0 ) :3:hip_device_runtime.cpp :351 : 12646365421 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute: Returned hipSuccess : :3:hip_device_runtime.cpp :141 : 12646365425 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute ( 0x3352a58, 71, 0 ) :3:hip_device_runtime.cpp :351 : 12646365432 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute: Returned hipSuccess : :3:hip_device_runtime.cpp :141 : 12646365436 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute ( 0x3352bb0, 71, 0 ) :3:hip_device_runtime.cpp :351 : 12646365443 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute: Returned hipSuccess : :3:hip_device_runtime.cpp :141 : 12646365447 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute ( 0x3352a5c, 87, 0 ) :3:hip_device_runtime.cpp :351 : 12646365458 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute: Returned hipSuccess : :3:hip_device_runtime.cpp :141 : 12646365462 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute ( 0x7ffd3850be60, 58, 0 ) :3:hip_device_runtime.cpp :351 : 12646365468 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute: Returned hipSuccess : :3:hip_device_runtime.cpp :141 : 12646365473 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute ( 0x3352a68, 56, 0 ) :3:hip_device_runtime.cpp :351 : 12646365479 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute: Returned hipSuccess : :3:hip_device_runtime.cpp :141 : 12646365483 us: 
[pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute ( 0x3352a6c, 26, 0 ) :3:hip_device_runtime.cpp :351 : 12646365489 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute: Returned hipSuccess : :3:hip_device_runtime.cpp :141 : 12646365494 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute ( 0x3352a70, 27, 0 ) :3:hip_device_runtime.cpp :351 : 12646365500 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute: Returned hipSuccess : :3:hip_device_runtime.cpp :141 : 12646365504 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute ( 0x3352a74, 28, 0 ) :3:hip_device_runtime.cpp :351 : 12646365511 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute: Returned hipSuccess : :3:hip_device_runtime.cpp :141 : 12646365515 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute ( 0x3352a78, 29, 0 ) :3:hip_device_runtime.cpp :351 : 12646365521 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute: Returned hipSuccess : :3:hip_device_runtime.cpp :141 : 12646365525 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute ( 0x3352a7c, 30, 0 ) :3:hip_device_runtime.cpp :351 : 12646365532 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute: Returned hipSuccess : :3:hip_device_runtime.cpp :141 : 12646365536 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute ( 0x3352a80, 31, 0 ) :3:hip_device_runtime.cpp :351 : 12646365542 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute: Returned hipSuccess : :3:hip_device_runtime.cpp :141 : 12646365546 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute ( 0x7ffd3850be64, 83, 0 ) :3:hip_device_runtime.cpp :351 : 12646365552 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute: Returned hipSuccess : :3:hip_device_runtime.cpp :141 : 12646365557 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute ( 0x3352a84, 5, 0 ) :3:hip_device_runtime.cpp :351 : 12646365563 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute: Returned hipSuccess : :3:hip_device_runtime.cpp :141 : 12646365567 us: [pid:22097 
tid:0x7fd902063000] hipDeviceGetAttribute ( 0x7ffd3850be68, 81, 0 ) :3:hip_device_runtime.cpp :351 : 12646365574 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute: Returned hipSuccess : :3:hip_device_runtime.cpp :141 : 12646365578 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute ( 0x3352bb4, 24, 0 ) :3:hip_device_runtime.cpp :351 : 12646365584 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute: Returned hipSuccess : :3:hip_device_runtime.cpp :141 : 12646365588 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute ( 0x3352bb8, 17, 0 ) :3:hip_device_runtime.cpp :351 : 12646365595 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute: Returned hipSuccess : :3:hip_device_runtime.cpp :141 : 12646365599 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute ( 0x3352bc8, 65, 0 ) :3:hip_device_runtime.cpp :351 : 12646365605 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute: Returned hipSuccess : :3:hip_device_runtime.cpp :141 : 12646365609 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute ( 0x3352bcc, 9, 0 ) :3:hip_device_runtime.cpp :351 : 12646365616 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute: Returned hipSuccess : :3:hip_device_runtime.cpp :141 : 12646365624 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute ( 0x3352bd8, 10, 0 ) :3:hip_device_runtime.cpp :351 : 12646365630 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute: Returned hipSuccess : :3:hip_device_runtime.cpp :141 : 12646365634 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute ( 0x3352bdc, 11, 0 ) :3:hip_device_runtime.cpp :351 : 12646365641 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute: Returned hipSuccess : :3:hip_device_runtime.cpp :141 : 12646365645 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute ( 0x3352be8, 66, 0 ) :3:hip_device_runtime.cpp :351 : 12646365651 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute: Returned hipSuccess : :3:hip_device_runtime.cpp :141 : 12646365655 us: [pid:22097 
tid:0x7fd902063000] hipDeviceGetAttribute ( 0x3352bec, 13, 0 ) :3:hip_device_runtime.cpp :351 : 12646365662 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute: Returned hipSuccess : :3:hip_device.cpp :381 : 12646365666 us: [pid:22097 tid:0x7fd902063000] hipGetDeviceProperties ( 0x7ffd3850bad0, 0 ) :3:hip_device.cpp :383 : 12646365672 us: [pid:22097 tid:0x7fd902063000] hipGetDeviceProperties: Returned hipSuccess : :3:hip_device_runtime.cpp :546 : 12646365761 us: [pid:22097 tid:0x7fd902063000] hipGetDeviceCount ( 0x7ffd3850be54 ) :3:hip_device_runtime.cpp :548 : 12646365767 us: [pid:22097 tid:0x7fd902063000] hipGetDeviceCount: Returned hipSuccess : :3:hip_device_runtime.cpp :141 : 12646365772 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute ( 0x7ffd3850bdc4, 67, 0 ) :3:hip_device_runtime.cpp :351 : 12646365779 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute: Returned hipSuccess : :3:hip_device.cpp :381 : 12646365782 us: [pid:22097 tid:0x7fd902063000] hipGetDeviceProperties ( 0x7ffd3850ba70, 0 ) :3:hip_device.cpp :383 : 12646365789 us: [pid:22097 tid:0x7fd902063000] hipGetDeviceProperties: Returned hipSuccess : :3:hip_device_runtime.cpp :141 : 12646365796 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute ( 0x7ffd3850bdc8, 68, 0 ) :3:hip_device_runtime.cpp :351 : 12646365802 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute: Returned hipSuccess : :3:hip_device_runtime.cpp :141 : 12646365814 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute ( 0x3352ab0, 18, 0 ) :3:hip_device_runtime.cpp :351 : 12646365820 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute: Returned hipSuccess : :3:hip_device_runtime.cpp :141 : 12646365825 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute ( 0x3352abc, 6, 0 ) :3:hip_device_runtime.cpp :351 : 12646365831 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute: Returned hipSuccess : :3:hip_device_runtime.cpp :141 : 12646365835 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute 
( 0x3352a84, 5, 0 ) :3:hip_device_runtime.cpp :351 : 12646365842 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute: Returned hipSuccess : :3:hip_device_runtime.cpp :141 : 12646365846 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute ( 0x3352b88, 60, 0 ) :3:hip_device_runtime.cpp :351 : 12646365852 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute: Returned hipSuccess : :3:hip_device_runtime.cpp :561 : 12646365860 us: [pid:22097 tid:0x7fd902063000] hipSetDevice ( 0 ) :3:hip_device_runtime.cpp :565 : 12646365864 us: [pid:22097 tid:0x7fd902063000] hipSetDevice: Returned hipSuccess : [18:39:56.853172][warning] Cannot get available memory information for CUDA gpu device 0: (error code: 999) cudaErrorUnknown [18:39:52.395440][warning] CUDA-Enabled GPU. Device information: - id: 0 - name: AMD Radeon RX 7900 XT [ZLUDA] - compute capability: 8.8 - clock frequency (kHz): 2025000 - total device memory: 20464 MB - device memory available: 0 MB - per-block shared memory: 65536 - warp size: 32 - max threads per block: 1024 - max threads per SM(X): 2048 - max block sizes: {1024,1024,1024} - max grid sizes: {2147483647,65536,65536} - max 2D array texture: {16384,16384} - max 3D array texture: {16384,16384,8192} - max 2D linear texture: {0,0,0} - max 2D layered texture: {16384,16384,65535} - number of SM(x)s: 42 - registers per SM(x): 65536 - registers per block: 65536 - concurrent kernels: yes - mapping host memory: yes - unified addressing: yes - texture alignment: 256 byte - pitch alignment: 256 byte CUDA build version: 11.3 :3:hip_device_runtime.cpp :141 : 12650807282 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute ( 0x3352ab0, 18, 0 ) :3:hip_device_runtime.cpp :351 : 12650807294 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute: Returned hipSuccess : :3:hip_device_runtime.cpp :141 : 12650807298 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute ( 0x3352abc, 6, 0 ) :3:hip_device_runtime.cpp :351 : 12650807305 us: [pid:22097 
tid:0x7fd902063000] hipDeviceGetAttribute: Returned hipSuccess : :3:hip_device_runtime.cpp :141 : 12650807310 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute ( 0x3352a84, 5, 0 ) :3:hip_device_runtime.cpp :351 : 12650807316 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute: Returned hipSuccess : :3:hip_device_runtime.cpp :141 : 12650807321 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute ( 0x3352b88, 60, 0 ) :3:hip_device_runtime.cpp :351 : 12650807327 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute: Returned hipSuccess : [18:39:56.853393][info] Supported CUDA-Enabled GPU detected. [18:39:56.856167][info] Found 1 image dimension(s): [18:39:56.856194][info] - [4032x3024] [18:39:56.860944][info] Overall maximum dimension: [4032x3024] [18:39:56.860995][info] Tiling information: - parameters: - buffer width: 1024 px - buffer height: 1024 px - padding: 64 px - maximum downscale: 4 - maximum image width: 2016 px - maximum image height: 1512 px - maximum effective tile width: 896 px - maximum effective tile height: 896 px - # tiles on X-side: 3 - # tiles on Y-side: 2 - effective tile width: 672 px - effective tile height: 756 px - tile list: - tile (1/6) size: 736x820 px, roi: [x: 0-736, y: 0-820] - tile (2/6) size: 736x756 px, roi: [x: 0-736, y: 756-1512] - tile (3/6) size: 736x820 px, roi: [x: 672-1408, y: 0-820] - tile (4/6) size: 736x756 px, roi: [x: 672-1408, y: 756-1512] - tile (5/6) size: 672x820 px, roi: [x: 1344-2016, y: 0-820] - tile (6/6) size: 672x756 px, roi: [x: 1344-2016, y: 756-1512] [18:39:56.861012][info] SGM parameters: - scale: 2 - stepXY: 2 [18:39:56.861036][info] Refine parameters: - scale: 1 - stepXY: 1 :3:hip_device_runtime.cpp :141 : 12650814999 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute ( 0x3352ab0, 18, 0 ) :3:hip_device_runtime.cpp :351 : 12650815010 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute: Returned hipSuccess : :3:hip_device_runtime.cpp :141 : 12650815015 us: [pid:22097 
tid:0x7fd902063000] hipDeviceGetAttribute ( 0x3352abc, 6, 0 ) :3:hip_device_runtime.cpp :351 : 12650815022 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute: Returned hipSuccess : :3:hip_device_runtime.cpp :141 : 12650815026 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute ( 0x3352a84, 5, 0 ) :3:hip_device_runtime.cpp :351 : 12650815033 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute: Returned hipSuccess : :3:hip_device_runtime.cpp :141 : 12650815037 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute ( 0x3352b88, 60, 0 ) :3:hip_device_runtime.cpp :351 : 12650815044 us: [pid:22097 tid:0x7fd902063000] hipDeviceGetAttribute: Returned hipSuccess : [18:39:56.861115][info] Number of GPU devices: 1, number of CPU threads: 16 :3:hip_device_runtime.cpp :561 : 12650815074 us: [pid:22097 tid:0x7fd902063000] hipSetDevice ( 0 ) :3:hip_device_runtime.cpp :565 : 12650815078 us: [pid:22097 tid:0x7fd902063000] hipSetDevice: Returned hipSuccess : [18:39:56.864274][info] Found only 4/10 nearest cameras for view id: 15928501 [18:39:56.873696][info] Found only 3/10 nearest cameras for view id: 433314061 [18:39:56.882159][info] Found only 5/10 nearest cameras for view id: 1164982389 [18:39:56.893128][info] Found only 4/10 nearest cameras for view id: 1254707300 [18:39:56.904865][info] Found only 5/10 nearest cameras for view id: 1719451455 [18:39:56.917447][info] Found only 5/10 nearest cameras for view id: 2108716370 ================================================================================ ====================== Command line failed with an error ======================= Could not allocate pinned host memory in /opt/AliceVision_git/src/aliceVision/depthMap/cuda/host/memory.hpp:395, unknown error: unknown error ================================================================================ WARNING:root:Downgrade status on node "Texturing_1" from Status.SUBMITTED to Status.NONE WARNING:root:Downgrade status on node "DepthMapFilter_1(0)" from 
Status.SUBMITTED to Status.NONE WARNING:root:Downgrade status on node "Meshing_1" from Status.SUBMITTED to Status.NONE WARNING:root:Downgrade status on node "MeshFiltering_1" from Status.SUBMITTED to Status.NONE ```

Then I tried running Meshroom under ltrace, but the program froze and didn't respond at all. There was some output in my terminal, so I'll attach that too. Then, suddenly, it started working. I'm attaching the logs.

The ltrace log was too long for GitHub; see it [here](https://pastebin.com/cW56EGCY).

So we get a bunch of 500 (CUDA_ERROR_NOT_FOUND) errors from cuGetProcAddress(). I would like to know what that means in this context. Maybe @vosen, as the maintainer, can help?

P.S. I used the monstree-mini6 dataset to test it.

vosen commented 8 months ago

cuGetProcAddress returning 500 is pretty much always benign. There's a handful of CUDA driver functions that are either too esoteric (nvsci) or just difficult to support cleanly (OS-specific interop: DirectX, EGL), and ZLUDA does not even advertise the possibility of supporting them.

It's not in the logs, but I infer that ZLUDA fails to compile a PTX module at startup, which is usually not hard to fix. I'll have a look.
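As a rough illustration of the point about 500 results, here is a minimal Python sketch of how one might triage driver-call results from a trace. The numeric codes are the ones that appear in this thread (500, 801, 999); the `triage` helper and the sample call list are hypothetical and not taken from the real ltrace output.

```python
# Numeric CUDA result codes that appear in this thread's logs.
CUDA_RESULT_NAMES = {
    0: "CUDA_SUCCESS",
    500: "CUDA_ERROR_NOT_FOUND",
    801: "CUDA_ERROR_NOT_SUPPORTED",
    999: "CUDA_ERROR_UNKNOWN",
}


def triage(calls):
    """Split (function_name, result_code) pairs into benign and suspicious.

    Assumption for this sketch: a 500 from cuGetProcAddress is treated as
    benign, and every other non-zero result is worth a closer look.
    """
    benign, suspicious = [], []
    for func, code in calls:
        if code == 0:
            continue  # successful calls are not interesting here
        name = CUDA_RESULT_NAMES.get(code, f"unknown code {code}")
        entry = (func, name)
        if func == "cuGetProcAddress" and code == 500:
            benign.append(entry)
        else:
            suspicious.append(entry)
    return benign, suspicious


# Hypothetical sample, not copied from the real trace.
sample = [
    ("cuInit", 0),
    ("cuGetProcAddress", 500),
    ("cuMemGetInfo_v2", 801),
    ("cuMemHostAlloc", 999),
]
benign, suspicious = triage(sample)
print("benign:", benign)
print("suspicious:", suspicious)
```

Filtering this way keeps the many expected `cuGetProcAddress` misses from hiding the few results that might explain a real failure.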

vosen commented 7 months ago

I was wrong, this was suffering.

Screenshot 2024-03-26 030047

Meshroom seems to work now. Since it is a rather big change and I don't really know how to use Meshroom I'd like someone who is interested in this issue to give it a try:

I can provide the binaries, but I'd prefer if someone built from scratch to double check everything. If you have questions please ask.

natowi commented 7 months ago

It would be great if you could share the binaries for testing :)

vosen commented 7 months ago

ZLUDA: https://files.catbox.moe/pba9kq.zip AliceVision: https://gofile.io/d/ITFhF0

gerberger commented 6 months ago

I've tried running the full and mini6 image sets from https://github.com/alicevision/dataset_monstree, using the binaries shared by natowi at https://github.com/alicevision/Meshroom/issues/595, but when it gets to the DepthMap stage, it fails with nothing in the log. I've tried with the log verbosity set to various levels. The cmd prompt shows this:

ERROR:root:Error on node computation: Error on node "DepthMap_1(0)": Log:

WARNING:root:Downgrade status on node "DepthMapFilter_1(0)" from Status.SUBMITTED to Status.NONE
WARNING:root:Downgrade status on node "Meshing_1" from Status.SUBMITTED to Status.NONE
WARNING:root:Downgrade status on node "MeshFiltering_1" from Status.SUBMITTED to Status.NONE
WARNING:root:Downgrade status on node "Texturing_1" from Status.SUBMITTED to Status.NONE

I'm running an AMD Ryzen 5 3600, with an RX 5700XT, on Windows 11 23H2, with Adrenalin 24.4.1

Edit: Feature extraction using SIFT, with "force CPU" unticked, works fine.

aesxsc commented 5 months ago

I did everything @gerberger did, and it still fails with the same error. Also, this error pops up: [image]

Adrenalin 24.5.1, RX 5500XT

bonnyr commented 3 months ago

I can confirm this is still happening although I do not get that popup. Same output, no logs... All stages until that point run fine.

This is with a scene using my own images, which is unlikely to produce a useful model, rather than the known image set mentioned above.

The full, mini3 and mini6 monstree scenes fail in the same way at the same point.

Adrenalin 24.6.1, RX 6750 XT

Tried to run the same command directly under ZLUDA, but no dice.

Would like to know if there are more logs or any other way I can get more info out

dngulin commented 2 months ago

> It works only on Windows, because of an underlying ROCm issue

Can you please clarify what the ROCm issue is? I wonder if it is possible to run Meshroom + ZLUDA on Linux.

vosen commented 2 months ago

@dngulin for some reason AMD enabled support for mipmapped arrays (textures) on Windows ROCm, but not on Linux ROCm. Meshroom relies on this functionality

bonnyr-f5 commented 2 months ago

But why is this happening on windows for me? Or is this a different issue?

> @dngulin for some reason AMD enabled support for mipmapped arrays (textures) on Windows ROCm, but not on Linux ROCm. Meshroom relies on this functionality

Hiuzuki commented 1 month ago

Hello, does this method still work? I tried to use Meshroom with my RX 5500 XT using ZLUDA, but I get an error when it reaches DepthMap. It seems that Meshroom doesn't even detect the GPU, only the CPU. I installed the CUDA toolkit, but it didn't help. I even tried to install the Nvidia graphics driver, but obviously that didn't work either.

vosen commented 1 month ago

@aesxsc @bonnyr-f5 I guess your error probably comes from DepthMap using the CUDA Runtime but not bundling it with the app. Extracting it from the CUDA installer and dropping it in a path where DepthMap looks for .dll files would make the error go away ~but still won't make DepthMap work because of aforementioned ROCm Windows issue~

Edit: Sorry, I got this the wrong way around. It's Linux ROCm that is broken, not the Windows one.
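To check whether a CUDA Runtime DLL is reachable at all, something like the following Python sketch may help. It only lists matching files in a set of directories. The `cudart64_*.dll` pattern is an assumption about the file name (verify which DLL your DepthMap build actually imports), and `find_runtime_dll` is a hypothetical helper, not part of Meshroom or ZLUDA.

```python
import fnmatch
import os


def find_runtime_dll(search_dirs, pattern="cudart64_*.dll"):
    """Return paths of files matching `pattern` in `search_dirs`, in order.

    `pattern` is an assumption: CUDA Runtime DLLs on Windows are usually
    named cudart64_<version>.dll, but the exact name depends on the build.
    """
    hits = []
    for directory in search_dirs:
        # Skip empty PATH entries and directories that do not exist.
        if not directory or not os.path.isdir(directory):
            continue
        for name in sorted(os.listdir(directory)):
            # Compare in lower case so the match is case-insensitive.
            if fnmatch.fnmatch(name.lower(), pattern.lower()):
                hits.append(os.path.join(directory, name))
    return hits


if __name__ == "__main__":
    # PATH is one of the places Windows searches when loading a DLL.
    dirs = os.environ.get("PATH", "").split(os.pathsep)
    for hit in find_runtime_dll(dirs):
        print(hit)
```

If nothing is printed, the runtime is not on `PATH`; it could still sit next to the executable, so pass that directory in `search_dirs` as well when checking.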

mpaterakis commented 5 hours ago

To anyone still having crashes: Use CUDA 11.0, add the installation path (C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.0\bin) to your system's Path, and use this pre-compiled package by natowi.

Works on my 6750 XT.
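A quick way to confirm the Path step took effect is to compare the directory against the Path entries after normalising case and trailing separators. The sketch below uses Python's `ntpath` module so the comparison follows Windows rules; `is_on_path` is a hypothetical helper, and the directory is the one quoted above.

```python
import ntpath
import os

# Directory quoted in the comment above.
CUDA_BIN = r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.0\bin"


def _norm(entry):
    # Drop surrounding whitespace and quotes, then normalise case,
    # slashes and trailing separators using Windows path rules.
    return ntpath.normcase(ntpath.normpath(entry.strip().strip('"')))


def is_on_path(directory, path_value):
    """True if `directory` is one of the ';'-separated entries in `path_value`."""
    wanted = _norm(directory)
    return any(_norm(e) == wanted for e in path_value.split(";") if e.strip())


# Check the current process environment (open a new terminal after editing Path).
print(is_on_path(CUDA_BIN, os.environ.get("PATH", "")))
```

Already-running programs keep the old environment, so Meshroom needs to be restarted after the Path change.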