opentensor / cubit

A Cython library to solve the Bittensor registration POW on CUDA
MIT License

Issue with cubit install #1

Closed camfairchild closed 2 years ago

camfairchild commented 2 years ago

@juansolana

I'm trying to run a miner following the documentation. I have already created my coldkey and my hotkey, installed CUDA and checked from the console that it is available, and installed cubit as described and verified that I can import it. But when I run btcli run --cuda to start registration, an error is thrown.

The error:

/home/jj/bittensor/bittensor/cubit.pyx:164 in cubit.solve_cuda
[Errno 2] No such file or directory: '/home/jj/bittensor/bittensor/cubit.pyx'
TypeError: an integer is required
⠸ Solving%

Environment:

Linux Ubuntu 18.04
Python 3.9
Bittensor version pulled from master

camfairchild commented 2 years ago

@juansolana Where is cubit installed? Did you use the source install or a wheel?

juansolana commented 2 years ago

I used the wheel for Python 3.9. I'm using a conda environment and have it installed at '/home/js/anaconda3/envs/bittensor/lib/python3.9/site-packages/cubit.cpython-39-x86_64-linux-gnu.so'

camfairchild commented 2 years ago

Can you try installing from source?

juansolana commented 2 years ago

Sure, I uninstalled cubit and then tried installing from source, but the install failed with this error:

(bittensor) ➜  cubit git:(master) pip install -e .
Obtaining file:///home/js/cubit
  Installing build dependencies ... done
  Checking if build backend supports build_editable ... done
  Getting requirements to build editable ... error
  error: subprocess-exited-with-error

  × Getting requirements to build editable did not run successfully.
  │ exit code: 1
  ╰─> [18 lines of output]
      Traceback (most recent call last):
        File "/home/js/anaconda3/envs/bittensor/lib/python3.9/site-packages/pip/_vendor/pep517/in_process/_in_process.py", line 363, in <module>
          main()
        File "/home/js/anaconda3/envs/bittensor/lib/python3.9/site-packages/pip/_vendor/pep517/in_process/_in_process.py", line 345, in main
          json_out['return_val'] = hook(**hook_input['kwargs'])
        File "/home/js/anaconda3/envs/bittensor/lib/python3.9/site-packages/pip/_vendor/pep517/in_process/_in_process.py", line 144, in get_requires_for_build_editable
          return hook(config_settings)
        File "/tmp/pip-build-env-7rlxed_2/overlay/lib/python3.9/site-packages/setuptools/build_meta.py", line 445, in get_requires_for_build_editable
          return self.get_requires_for_build_wheel(config_settings)
        File "/tmp/pip-build-env-7rlxed_2/overlay/lib/python3.9/site-packages/setuptools/build_meta.py", line 338, in get_requires_for_build_wheel
          return self._get_build_requires(config_settings, requirements=['wheel'])
        File "/tmp/pip-build-env-7rlxed_2/overlay/lib/python3.9/site-packages/setuptools/build_meta.py", line 320, in _get_build_requires
          self.run_setup()
        File "/tmp/pip-build-env-7rlxed_2/overlay/lib/python3.9/site-packages/setuptools/build_meta.py", line 335, in run_setup
          exec(code, locals())
        File "<string>", line 124, in <module>
        File "<string>", line 60, in locate_cuda
      OSError: The nvcc binary could not be located in your $PATH. Either add it to your path, or set $CUDA_HOME
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
error: subprocess-exited-with-error

× Getting requirements to build editable did not run successfully.
│ exit code: 1
╰─> See above for output.

note: This error originates from a subprocess, and is likely not a problem with pip.

camfairchild commented 2 years ago

      OSError: The nvcc binary could not be located in your $PATH. Either add it to your path, or set $CUDA_HOME

You need nvcc installed to compile cubit from source.
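
To check this before retrying the build, here is a minimal sketch that looks for nvcc the two ways the error message suggests: via $CUDA_HOME and via $PATH. The helper name find_nvcc, the $CUDA_HOME/bin/nvcc layout, and the lookup order are assumptions for illustration, not taken from cubit's setup script.

```python
import os
import shutil
from typing import Optional


def find_nvcc() -> Optional[str]:
    """Return a path to an nvcc executable, or None if none is found."""
    cuda_home = os.environ.get("CUDA_HOME")
    if cuda_home:
        # Assumed layout: the compiler lives at $CUDA_HOME/bin/nvcc.
        candidate = os.path.join(cuda_home, "bin", "nvcc")
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    # Otherwise search the directories listed in $PATH.
    return shutil.which("nvcc")


if __name__ == "__main__":
    print(find_nvcc() or "nvcc not found: add it to PATH or set CUDA_HOME")
```

If this prints a path, the source build should at least get past the nvcc lookup; if not, fix $PATH or $CUDA_HOME first.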

juansolana commented 2 years ago

Right! I was able to install it from source now, but I still get an error with cubit when I run btcli run --cuda. I get a different error depending on whether I run the command from ~/cubit or ~/bittensor/bittensor; should that matter?

When I run from ~/cubit I get this:

│ /home/js/cubit/cubit.pyx:168 in cubit.solve_cuda                                                 │
│                                                                                                  │
│   165 │   finally:                                                                               │
│   166 │   │   PyMem_Free(preseal_bytes)                                                          │
│   167                                                                                            │
│ ❱ 168 cpdef int128 solve_cuda(int blockSize, uint64 nonce_start, uint64 update_interval, const   │
│   169 │   cdef uint64 solution                                                                   │
│   170 │   cdef int128 solution_128                                                               │
│   171                                                                                            │
│                                                                                                  │
│ ╭─────────────────────────────────────────── locals ───────────────────────────────────────────╮ │
│ │                   __builtins__ = <module 'builtins' (built-in)>                              │ │
│ │                        __doc__ = None                                                        │ │
│ │                       __file__ = '/home/js/cubit/cubit.cpython-39-x86_64-linux-gnu.so'       │ │
│ │                     __loader__ = <_frozen_importlib_external.ExtensionFileLoader object at   │ │
│ │                                  0x7f0b659f1760>                                             │ │
│ │                       __name__ = 'cubit'                                                     │ │
│ │                    __package__ = ''                                                          │ │
│ │            __pyx_unpickle_Enum = <built-in function __pyx_unpickle_Enum>                     │ │
│ │                       __spec__ = ModuleSpec(name='cubit',                                    │ │
│ │                                  loader=<_frozen_importlib_external.ExtensionFileLoader      │ │
│ │                                  object at 0x7f0b659f1760>,                                  │ │
│ │                                  origin='/home/js/cubit/cubit.cpython-39-x86_64-linux-gnu.s… │ │
│ │                       __test__ = {}                                                          │ │
│ │                          array = <module 'array' from                                        │ │
│ │                                  '/home/js/anaconda3/envs/bittensor/lib/python3.9/lib-dynlo… │ │
│ │                log_cuda_errors = <built-in function log_cuda_errors>                         │ │
│ │                     reset_cuda = <built-in function reset_cuda>                              │ │
│ │                       run_test = <built-in function run_test>                                │ │
│ │    run_test_create_nonce_bytes = <built-in function run_test_create_nonce_bytes>             │ │
│ │       run_test_create_pre_seal = <built-in function run_test_create_pre_seal>                │ │
│ │                run_test_keccak = <built-in function run_test_keccak>                         │ │
│ │             run_test_less_than = <built-in function run_test_less_than>                      │ │
│ │          run_test_preseal_hash = <built-in function run_test_preseal_hash>                   │ │
│ │             run_test_seal_hash = <built-in function run_test_seal_hash>                      │ │
│ │ run_test_seal_meets_difficulty = <built-in function run_test_seal_meets_difficulty>          │ │
│ │                     solve_cuda = <built-in function solve_cuda>                              │ │
│ ╰──────────────────────────────────────────────────────────────────────────────────────────────╯ │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
TypeError: an integer is required

If I run it from ~/bittensor/bittensor I get this:

│ /home/js/bittensor/bittensor/utils/register_cuda.py:66 in solve_cuda                             │
│                                                                                                  │
│   63 │   # Call cython function                                                                  │
│   64 │   # int blockSize, uint64 nonce_start, uint64 update_interval, const unsigned char[:]     │
│   65 │   # const unsigned char[:] block_bytes, int dev_id                                        │
│ ❱ 66 │   solution = cubit.solve_cuda(TPB, nonce_start, update_interval, upper_bytes, block_by    │
│   67 │   seal = None                                                                             │
│   68 │   if solution != -1:                                                                      │
│   69 │   │   print(f"Checking solution: {solution} for bn: {bn}")                                │
│                                                                                                  │
│ ╭─────────────────────────────────────────── locals ───────────────────────────────────────────╮ │
│ │           block_bytes = b'caaff68b14f02be1b0f19f68aa963c818003cabb218859276d5723d06435731c'  │ │
│ │                    bn = 2210754                                                              │ │
│ │      create_seal_hash = <function solve_cuda.<locals>.create_seal_hash at 0x7fc136af0e50>    │ │
│ │                 cubit = <module 'cubit' from                                                 │ │
│ │                         '/home/js/cubit/cubit.cpython-39-x86_64-linux-gnu.so'>               │ │
│ │                dev_id = None                                                                 │ │
│ │            difficulty = 327680000000                                                         │ │
│ │  hex_bytes_to_u8_list = <function solve_cuda.<locals>.hex_bytes_to_u8_list at                │ │
│ │                         0x7fc137692700>                                                      │ │
│ │                 limit = 1157920892373161954235709850086879078532699846656405640394575840079… │ │
│ │           nonce_start = 0                                                                    │ │
│ │ seal_meets_difficulty = <function solve_cuda.<locals>.seal_meets_difficulty at               │ │
│ │                         0x7fc1376d3dc0>                                                      │ │
│ │                   TPB = 256                                                                  │ │
│ │       update_interval = 50000                                                                │ │
│ │                 upper = 353369412955676865916659500148583703165496779375123791624321240258   │ │
│ │           upper_bytes = b'\xc2\x0c2n\x0f\xe7\x86K3\x8222m\xa4\x11\xd8A\xca\xf4\xf0\n\xe9\x9… │ │
│ ╰──────────────────────────────────────────────────────────────────────────────────────────────╯ │
│                                                                                                  │
│ /home/js/bittensor/bittensor/cubit.pyx:168 in cubit.solve_cuda                                   │
│                                                                                                  │
│ [Errno 2] No such file or directory: '/home/js/bittensor/bittensor/cubit.pyx'                    │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
TypeError: an integer is required

camfairchild commented 2 years ago

I believe the Python path for cubit is, for some reason, relative. If you can, you should modify it to be absolute.
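
One way to see which copy of cubit Python actually picks up is to ask the import system where it would load the module from, without importing it. This is a small sketch using the standard importlib.util.find_spec; the helper name module_origin is made up for illustration, and it is only meant for top-level module names.

```python
import importlib.util
import os
from typing import Optional


def module_origin(name: str) -> Optional[str]:
    """Return the absolute file path a top-level module would load from."""
    spec = importlib.util.find_spec(name)
    if spec is None or spec.origin is None:
        # Module is not installed, or has no file-based origin.
        return None
    return os.path.abspath(spec.origin)


if __name__ == "__main__":
    # Run this from each directory you launch btcli from and compare.
    print(module_origin("cubit"))
```

If the printed path changes with the working directory, a local checkout is shadowing the installed package.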

camfairchild commented 2 years ago

If you are using conda, I would nuke the env or try to uninstall cubit.

juansolana commented 2 years ago

I nuked the env and created a new one using python 3.9 again. I installed bittensor using /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/opentensor/bittensor/master/scripts/install.sh)" according to this part of the documentation.

Then I installed cubit using pip install https://github.com/opentensor/cubit/releases/download/v1.0.5/cubit-1.0.5-cp39-cp39-linux_x86_64.whl because running pip install -e .[cubit] inside the bittensor/bittensor directory, as the documentation says to, didn't work.

When I ran btcli run --cuda I got the same error as when running from ~/cubit:

│ /home/js/cubit/cubit.pyx:168 in cubit.solve_cuda                                                 │
│                                                                                                  │
│   165 │   finally:                                                                               │
│   166 │   │   PyMem_Free(preseal_bytes)                                                          │
│   167                                                                                            │
│ ❱ 168 cpdef int128 solve_cuda(int blockSize, uint64 nonce_start, uint64 update_interval, const   │
│   169 │   cdef uint64 solution                                                                   │
│   170 │   cdef int128 solution_128                                                               │
│   171                                                                                            │
│                                                                                                  │
│ ╭─────────────────────────────────────────── locals ───────────────────────────────────────────╮ │
│ │                   __builtins__ = <module 'builtins' (built-in)>                              │ │
│ │                        __doc__ = None                                                        │ │
│ │                       __file__ = '/home/js/cubit/cubit.cpython-39-x86_64-linux-gnu.so'       │ │
│ │                     __loader__ = <_frozen_importlib_external.ExtensionFileLoader object at   │ │
│ │                                  0x7f59c41f9280>                                             │ │
│ │                       __name__ = 'cubit'                                                     │ │
│ │                    __package__ = ''                                                          │ │
│ │            __pyx_unpickle_Enum = <built-in function __pyx_unpickle_Enum>                     │ │
│ │                       __spec__ = ModuleSpec(name='cubit',                                    │ │
│ │                                  loader=<_frozen_importlib_external.ExtensionFileLoader      │ │
│ │                                  object at 0x7f59c41f9280>,                                  │ │
│ │                                  origin='/home/js/cubit/cubit.cpython-39-x86_64-linux-gnu.s… │ │
│ │                       __test__ = {}                                                          │ │
│ │                          array = <module 'array' from                                        │ │
│ │                                  '/home/js/anaconda3/envs/bittensor/lib/python3.9/lib-dynlo… │ │
│ │                log_cuda_errors = <built-in function log_cuda_errors>                         │ │
│ │                     reset_cuda = <built-in function reset_cuda>                              │ │
│ │                       run_test = <built-in function run_test>                                │ │
│ │    run_test_create_nonce_bytes = <built-in function run_test_create_nonce_bytes>             │ │
│ │       run_test_create_pre_seal = <built-in function run_test_create_pre_seal>                │ │
│ │                run_test_keccak = <built-in function run_test_keccak>                         │ │
│ │             run_test_less_than = <built-in function run_test_less_than>                      │ │
│ │          run_test_preseal_hash = <built-in function run_test_preseal_hash>                   │ │
│ │             run_test_seal_hash = <built-in function run_test_seal_hash>                      │ │
│ │ run_test_seal_meets_difficulty = <built-in function run_test_seal_meets_difficulty>          │ │
│ │                     solve_cuda = <built-in function solve_cuda>                              │ │
│ ╰──────────────────────────────────────────────────────────────────────────────────────────────╯ │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
TypeError: an integer is required

So I uninstalled cubit and then installed it again from source by running pip install -e . in the ~/cubit directory. I ran btcli run --cuda but unfortunately got the same result.

Running that last command from the ~/bittensor/bittensor directory throws the same error mentioned before, where it can't find the cubit.pyx file. I didn't try fixing the path, since I'm guessing that even if that were solved it would throw the same error I see when running from ~/cubit.

camfairchild commented 2 years ago

Try to install cubit from source first in a fresh env and then see if you can call the solve_cuda function from it before installing bittensor.

juansolana commented 2 years ago

Ok, I nuked the environment and made it again with python3.9, installed bittensor as per the documentation and then installed cubit from source. Then I tried calling solve_cuda the following way, which I think is how it is called when it fails:

solve_cuda(2216467, 0, 50000, 1157920892373161954235709850086879078532699846656405640394575840079, b'8dae521574ab9f138793763c8b88e1db28f02ae315c11a6384b2af4f7f0db0ce', None)

I'm not sure about the 4th argument, limit, though. It's a very large number that doesn't fit in my terminal, so I couldn't copy it completely. In any case, calling solve_cuda with this limit value fails with this error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "cubit.pyx", line 168, in cubit.solve_cuda
    cpdef int128 solve_cuda(int blockSize, uint64 nonce_start, uint64 update_interval, const unsigned char[:] limit, const unsigned char[:] block_bytes, const int dev_id):
  File "stringsource", line 660, in View.MemoryView.memoryview_cwrapper
  File "stringsource", line 350, in View.MemoryView.memoryview.__cinit__
TypeError: a bytes-like object is required, not 'int'

So I changed limit to the same value as block_bytes. That is probably not correct, but it got past that error, and the call then failed on the dev_id value when I called it like this:

solve_cuda(2216467, 0, 50000, b'8dae521574ab9f138793763c8b88e1db28f02ae315c11a6384b2af4f7f0db0ce', b'8dae521574ab9f138793763c8b88e1db28f02ae315c11a6384b2af4f7f0db0ce', None)

The error from that call is the one I think I have been stuck on since the beginning:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "cubit.pyx", line 168, in cubit.solve_cuda
    cpdef int128 solve_cuda(int blockSize, uint64 nonce_start, uint64 update_interval, const unsigned char[:] limit, const unsigned char[:] block_bytes, const int dev_id):
TypeError: an integer is required

So I set dev_id to 0 for my CUDA device and it apparently runs: there is no error, it just returns -1. It seems the error occurs because it is not finding my CUDA device and is setting dev_id to None? I'm not sure why that is happening, though.

I can give it another go with a correct value for limit, but I don't know what to set it to.
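
On the limit argument: the first traceback above shows that solve_cuda rejects a plain int there and wants a bytes-like object. A minimal sketch of converting a large Python int to fixed-width bytes with the built-in int.to_bytes follows. The helper name int_to_fixed_bytes, the 32-byte width, and the little-endian byte order are all assumptions for illustration; nothing in this thread confirms the width or byte order cubit expects.

```python
def int_to_fixed_bytes(value: int, width: int = 32, byteorder: str = "little") -> bytes:
    """Encode a non-negative int as exactly `width` bytes.

    The default width and byte order are assumptions, not cubit's documented format.
    """
    if value < 0:
        raise ValueError("value must be non-negative")
    # int.to_bytes raises OverflowError if value does not fit in `width` bytes.
    return value.to_bytes(width, byteorder=byteorder)


if __name__ == "__main__":
    encoded = int_to_fixed_bytes(50000)
    print(len(encoded), encoded[:4])
```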

Also, I forgot to install cubit first, before bittensor. Should I still try that?

juansolana commented 2 years ago

I hard-coded the dev_id value to 0 in bittensor/utils/register_cuda.py, where the call to solve_cuda is made. That seems to fix the issue and allows me to start the registration process.

camfairchild commented 2 years ago

It seems to be because you don't specify the flag in btcli run, so it defaults to using None for dev_id. This can be remedied using btcli run --cuda --cuda.dev_id 0 and will be patched in the next release of btcli.
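
For anyone calling solve_cuda from their own script in the meantime, a caller can guard against the None default before it reaches the Cython function, which only accepts an int for dev_id. This is a sketch only: resolve_dev_id is a hypothetical helper (not part of bittensor or cubit), and falling back to device 0 is an assumption that suits a single-GPU machine.

```python
from typing import Optional


def resolve_dev_id(dev_id: Optional[int], default: int = 0) -> int:
    """Return an int CUDA device index, substituting `default` for None."""
    if dev_id is None:
        # Assumed fallback: use the first CUDA device.
        return default
    if isinstance(dev_id, bool) or not isinstance(dev_id, int):
        raise TypeError(f"dev_id must be an int or None, got {type(dev_id).__name__}")
    if dev_id < 0:
        raise ValueError("dev_id must be non-negative")
    return dev_id


if __name__ == "__main__":
    print(resolve_dev_id(None))
```

The returned value can then be passed as the last argument of solve_cuda, avoiding the "an integer is required" TypeError.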

juansolana commented 2 years ago

That makes it work without my hardcoded changes. Thanks!