meriken / merikens-tripcode-engine-v3

"Meriken's Tripcode Engine" is a cross-platform custom tripcode generator.
GNU General Public License v3.0

Can't get this working on AWS P2 instance types #2

Open ghost opened 7 years ago

ghost commented 7 years ago

I just compiled this on a p2.xlarge instance, using both the official NVIDIA image and the Ubuntu 16.04 image.

Both give me the following result:

ERROR
=====
  A corrupt tripcode was generated.
  The hardware or device driver may be malfunctioning.
  Please check the temperatures of CPU(s) and GPU(s).

I've tried building both the regular and the NVIDIA-optimized versions.

Any thoughts?

ghost commented 7 years ago

With -c the error does not occur, so it is definitely related to the GPU functionality. With -g the search works for a while and even finds tripcodes, but moments later the application exits with the error message above. The --disable-gcn-assembler flag does not affect the result.

ghost commented 7 years ago

For your information, mty_cl does work and reaches about 120 Mtrip/s on this instance type. merikens-tripcode-engine reaches about 40 Mtrip/s until it crashes.

Interestingly, merikens-tripcode-engine seems to work fine on the g2 instance types, so perhaps this is related to the NVIDIA K80s?

Willian-Zhang commented 5 years ago

Same here with a p2.xlarge instance and the Deep Learning Base AMI (Ubuntu) image.

Build command:

./BuildAll.sh --enable-cuda --install

Run log:

Enabled Features: OpenCL/GCN CUDA SSE2/AVX/AVX2(x86_64)

CUDA DEVICE
===========
  Device No.:               0
  Device Name:              Tesla K80
  Multiprocessor Count:     13
  Clock Rate:               824MHz
  Compute Capability:       3.7
  Compute Mode:             cudaComputeModeDefault

PATTERN(S)
==========
  0: "(hidden)" (regex)
  1: "(hidden)" (regex)
  2: "(hidden)" (regex)

TRIPCODES
=========

STATUS
======
  Performing a forward-matching search on GPU(s)
  for 2 patterns (2 chunks) with 6 to 8 characters:
      CUDA0-0:     23.5M TPS, 128 blocks/SM
      CUDA0-1:     23.7M TPS, 128 blocks/SM

  0.001T tripcodes were generated in 0d 0h 0m 20s at:
      38.17M tripcode/s (current)
      38.17M tripcode/s (average)
  On average, it takes 20.8 minutes to find one match at this speed.

  No matches were found yet.

ERROR
=====
  A corrupt tripcode was generated.
  The hardware or device driver may be malfunctioning.
  Please check the temperatures of CPU(s) and GPU(s).
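
As a rough sanity check on the STATUS figures in this log, the numbers can be related with a few lines of arithmetic. This is only a sketch: it assumes the "minutes to find one match" estimate is simply the number of candidates per match divided by the average rate, which is a guess about how the engine computes it. The variable and function names are made up for illustration.

```python
# Figures copied from the STATUS block of the run log.
rate_tps = 38.17e6         # average tripcodes per second
minutes_per_match = 20.8   # the engine's own estimate
elapsed_s = 20             # run time reported before the error


def candidates_per_match(rate, minutes):
    """Candidates implied per match (assumption: estimate = candidates / rate)."""
    return rate * minutes * 60


def expected_matches(rate, seconds, per_match):
    """Expected number of matches after `seconds` at a constant rate."""
    return rate * seconds / per_match


per_match = candidates_per_match(rate_tps, minutes_per_match)
matches = expected_matches(rate_tps, elapsed_s, per_match)
total_t = rate_tps * elapsed_s / 1e12  # total generated, in units of 10^12

print(f"candidates per match: {per_match:.3e}")
print(f"expected matches in {elapsed_s}s: {matches:.4f}")
print(f"total generated: {total_t:.3f}T")
```

Under that assumption, the total generated rounds to the 0.001T shown in the log, and the run stopped after only a small fraction of the time a first match would typically need, which fits the "No matches were found yet." line. It says nothing by itself about why the corrupt tripcode was reported.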