rosalindfranklininstitute / DosNa

Distributed Object Store Numpy Array (DosNa)
Apache License 2.0

Python3 ceph doesn't work #6

Closed GMW99 closed 3 years ago

GMW99 commented 3 years ago

When using python3-rados, all tests in test_dataset.py fail when run with:

    python3 test_dataset.py --engine cpu --backend ceph --connection dosna --connection-options conffile=ceph.conf

When run under Python 2, all tests pass. The errors are as follows:

test_dataset.py:54: DeprecationWarning: This function is deprecated. Please call randint(-10000, 10000 + 1) instead
  self.data = np.random.random_integers(DATASET_NUMBER_RANGE[0],
ERROR
test_dataset_clear (__main__.DatasetTest) ... 2021-09-01 16:59:38,600 - __main__ - INFO - DatasetTest: ceph, cpu, {'name': 'dosna', 'conffile': 'ceph.conf'}
ERROR
test_del_non_existing_dataset (__main__.DatasetTest) ... 2021-09-01 16:59:38,619 - __main__ - INFO - DatasetTest: ceph, cpu, {'name': 'dosna', 'conffile': 'ceph.conf'}
ERROR
test_existing (__main__.DatasetTest) ... 2021-09-01 16:59:38,639 - __main__ - INFO - DatasetTest: ceph, cpu, {'name': 'dosna', 'conffile': 'ceph.conf'}
ERROR
test_get_non_existing_dataset (__main__.DatasetTest) ... 2021-09-01 16:59:38,660 - __main__ - INFO - DatasetTest: ceph, cpu, {'name': 'dosna', 'conffile': 'ceph.conf'}
ERROR
test_map (__main__.DatasetTest) ... 2021-09-01 16:59:38,680 - __main__ - INFO - DatasetTest: ceph, cpu, {'name': 'dosna', 'conffile': 'ceph.conf'}
ERROR
test_number_chunks (__main__.DatasetTest) ... 2021-09-01 16:59:38,700 - __main__ - INFO - DatasetTest: ceph, cpu, {'name': 'dosna', 'conffile': 'ceph.conf'}
ERROR
test_number_chunks_slicing (__main__.DatasetTest) ... 2021-09-01 16:59:38,721 - __main__ - INFO - DatasetTest: ceph, cpu, {'name': 'dosna', 'conffile': 'ceph.conf'}
ERROR
test_sequential_set (__main__.DatasetTest) ... 2021-09-01 16:59:38,741 - __main__ - INFO - DatasetTest: ceph, cpu, {'name': 'dosna', 'conffile': 'ceph.conf'}
ERROR
test_slice_content (__main__.DatasetTest) ... 2021-09-01 16:59:38,761 - __main__ - INFO - DatasetTest: ceph, cpu, {'name': 'dosna', 'conffile': 'ceph.conf'}
ERROR

======================================================================
ERROR: test_apply (__main__.DatasetTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "test_dataset.py", line 57, in setUp
    self.dataset = self.connection_handle.create_dataset(
  File "/home/crc99971/github/rfi/DosNa/dosna/engines/cpu.py", line 28, in create_dataset
    dataset = self.instance.create_dataset(name, shape, dtype, fillvalue,
  File "/home/crc99971/github/rfi/DosNa/dosna/backends/ceph.py", line 79, in create_dataset
    self.ioctx.write(name, _SIGNATURE)
  File "rados.pyx", line 594, in rados.requires.wrapper.validate_func
  File "rados.pyx", line 582, in rados.requires.check_type
TypeError: data must be bytes

======================================================================
ERROR: test_dataset_clear (__main__.DatasetTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "test_dataset.py", line 57, in setUp
    self.dataset = self.connection_handle.create_dataset(
  File "/home/crc99971/github/rfi/DosNa/dosna/engines/cpu.py", line 28, in create_dataset
    dataset = self.instance.create_dataset(name, shape, dtype, fillvalue,
  File "/home/crc99971/github/rfi/DosNa/dosna/backends/ceph.py", line 79, in create_dataset
    self.ioctx.write(name, _SIGNATURE)
  File "rados.pyx", line 594, in rados.requires.wrapper.validate_func
  File "rados.pyx", line 582, in rados.requires.check_type
TypeError: data must be bytes

======================================================================
ERROR: test_del_non_existing_dataset (__main__.DatasetTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "test_dataset.py", line 57, in setUp
    self.dataset = self.connection_handle.create_dataset(
  File "/home/crc99971/github/rfi/DosNa/dosna/engines/cpu.py", line 28, in create_dataset
    dataset = self.instance.create_dataset(name, shape, dtype, fillvalue,
  File "/home/crc99971/github/rfi/DosNa/dosna/backends/ceph.py", line 79, in create_dataset
    self.ioctx.write(name, _SIGNATURE)
  File "rados.pyx", line 594, in rados.requires.wrapper.validate_func
  File "rados.pyx", line 582, in rados.requires.check_type
TypeError: data must be bytes

======================================================================
ERROR: test_existing (__main__.DatasetTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "test_dataset.py", line 57, in setUp
    self.dataset = self.connection_handle.create_dataset(
  File "/home/crc99971/github/rfi/DosNa/dosna/engines/cpu.py", line 28, in create_dataset
    dataset = self.instance.create_dataset(name, shape, dtype, fillvalue,
  File "/home/crc99971/github/rfi/DosNa/dosna/backends/ceph.py", line 79, in create_dataset
    self.ioctx.write(name, _SIGNATURE)
  File "rados.pyx", line 594, in rados.requires.wrapper.validate_func
  File "rados.pyx", line 582, in rados.requires.check_type
TypeError: data must be bytes

======================================================================
ERROR: test_get_non_existing_dataset (__main__.DatasetTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "test_dataset.py", line 57, in setUp
    self.dataset = self.connection_handle.create_dataset(
  File "/home/crc99971/github/rfi/DosNa/dosna/engines/cpu.py", line 28, in create_dataset
    dataset = self.instance.create_dataset(name, shape, dtype, fillvalue,
  File "/home/crc99971/github/rfi/DosNa/dosna/backends/ceph.py", line 79, in create_dataset
    self.ioctx.write(name, _SIGNATURE)
  File "rados.pyx", line 594, in rados.requires.wrapper.validate_func
  File "rados.pyx", line 582, in rados.requires.check_type
TypeError: data must be bytes

======================================================================
ERROR: test_map (__main__.DatasetTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "test_dataset.py", line 57, in setUp
    self.dataset = self.connection_handle.create_dataset(
  File "/home/crc99971/github/rfi/DosNa/dosna/engines/cpu.py", line 28, in create_dataset
    dataset = self.instance.create_dataset(name, shape, dtype, fillvalue,
  File "/home/crc99971/github/rfi/DosNa/dosna/backends/ceph.py", line 79, in create_dataset
    self.ioctx.write(name, _SIGNATURE)
  File "rados.pyx", line 594, in rados.requires.wrapper.validate_func
  File "rados.pyx", line 582, in rados.requires.check_type
TypeError: data must be bytes

======================================================================
ERROR: test_number_chunks (__main__.DatasetTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "test_dataset.py", line 57, in setUp
    self.dataset = self.connection_handle.create_dataset(
  File "/home/crc99971/github/rfi/DosNa/dosna/engines/cpu.py", line 28, in create_dataset
    dataset = self.instance.create_dataset(name, shape, dtype, fillvalue,
  File "/home/crc99971/github/rfi/DosNa/dosna/backends/ceph.py", line 79, in create_dataset
    self.ioctx.write(name, _SIGNATURE)
  File "rados.pyx", line 594, in rados.requires.wrapper.validate_func
  File "rados.pyx", line 582, in rados.requires.check_type
TypeError: data must be bytes

======================================================================
ERROR: test_number_chunks_slicing (__main__.DatasetTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "test_dataset.py", line 57, in setUp
    self.dataset = self.connection_handle.create_dataset(
  File "/home/crc99971/github/rfi/DosNa/dosna/engines/cpu.py", line 28, in create_dataset
    dataset = self.instance.create_dataset(name, shape, dtype, fillvalue,
  File "/home/crc99971/github/rfi/DosNa/dosna/backends/ceph.py", line 79, in create_dataset
    self.ioctx.write(name, _SIGNATURE)
  File "rados.pyx", line 594, in rados.requires.wrapper.validate_func
  File "rados.pyx", line 582, in rados.requires.check_type
TypeError: data must be bytes

======================================================================
ERROR: test_sequential_set (__main__.DatasetTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "test_dataset.py", line 57, in setUp
    self.dataset = self.connection_handle.create_dataset(
  File "/home/crc99971/github/rfi/DosNa/dosna/engines/cpu.py", line 28, in create_dataset
    dataset = self.instance.create_dataset(name, shape, dtype, fillvalue,
  File "/home/crc99971/github/rfi/DosNa/dosna/backends/ceph.py", line 79, in create_dataset
    self.ioctx.write(name, _SIGNATURE)
  File "rados.pyx", line 594, in rados.requires.wrapper.validate_func
  File "rados.pyx", line 582, in rados.requires.check_type
TypeError: data must be bytes

======================================================================
ERROR: test_slice_content (__main__.DatasetTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "test_dataset.py", line 57, in setUp
    self.dataset = self.connection_handle.create_dataset(
  File "/home/crc99971/github/rfi/DosNa/dosna/engines/cpu.py", line 28, in create_dataset
    dataset = self.instance.create_dataset(name, shape, dtype, fillvalue,
  File "/home/crc99971/github/rfi/DosNa/dosna/backends/ceph.py", line 79, in create_dataset
    self.ioctx.write(name, _SIGNATURE)
  File "rados.pyx", line 594, in rados.requires.wrapper.validate_func
  File "rados.pyx", line 582, in rados.requires.check_type
TypeError: data must be bytes

----------------------------------------------------------------------
Ran 10 tests in 0.238s

FAILED (errors=10)

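Every test fails in setUp at the same call, self.ioctx.write(name, _SIGNATURE) in dosna/backends/ceph.py. This suggests that rados has changed the type it accepts for data: under Python 3 it requires a bytes object rather than a str.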
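The failure mode can be reproduced without a Ceph cluster. The sketch below is only an illustration: `fake_write` is a hypothetical stand-in that mimics the type check reported in the traceback, not the real rados API, and the signature string is a made-up value rather than DosNa's actual `_SIGNATURE`.

```python
def fake_write(key, data):
    """Hypothetical stand-in for Ioctx.write; mimics only the bytes check."""
    if not isinstance(data, bytes):
        raise TypeError("data must be bytes")
    return 0


# Made-up value for illustration; not the real _SIGNATURE from DosNa.
signature = "DosNa Dataset"

# In Python 3 a text literal is str, which is a different type from bytes.
try:
    fake_write("name", signature)
    str_result = "accepted"
except TypeError as exc:
    str_result = str(exc)

# Encoding the text first produces bytes, which passes the check.
bytes_result = fake_write("name", signature.encode("utf-8"))

print(str_result)
print(bytes_result)
```

Under Python 2 the same literal would already be a byte string, which is consistent with the tests passing there.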

GMW99 commented 3 years ago

Python-Cephlibs used to require that the data be passed as a string, as shown here:

    def write(self, key, data, offset=0):
        """
        Write data to an object synchronously
        :param key: name of the object
        :type key: str
        :param data: data to write
        :type data: str
        :param offset: byte offset in the object to begin writing at
        :type offset: int
        :raises: :class:`TypeError`
        :raises: :class:`LogicError`
        :returns: int - 0 on success
        """
        self.require_ioctx_open()
        if not isinstance(key, str):
            raise TypeError('key must be a string')
        if not isinstance(data, str):
            raise TypeError('data must be a string')

The Python 3 rados binding, by contrast, requires the data as bytes:

    def write(self, key: str, data: bytes, offset: int = 0):
        """
        Write data to an object synchronously
        :param key: name of the object
        :param data: data to write
        :param offset: byte offset in the object to begin writing at
        :raises: :class:`TypeError`
        :raises: :class:`LogicError`
        :returns: int - 0 on success
        """
        self.require_ioctx_open()

        key_raw = cstr(key, 'key')
        cdef:
            char *_key = key_raw
            char *_data = data
            size_t length = len(data)
            uint64_t _offset = offset

        with nogil:
            ret = rados_write(self.io, _key, _data, length, _offset)
        if ret == 0:
            return ret
        elif ret < 0:
            raise make_ex(ret, "Ioctx.write(%s): failed to write %s"
                          % (self.name, key))
        else:
            raise LogicError("Ioctx.write(%s): rados_write \
returned %d, but should return zero on success." % (self.name, ret))

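Therefore the fix is to convert the data to bytes before passing it to ioctx.write, starting with the _SIGNATURE written in create_dataset in dosna/backends/ceph.py.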
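One possible shape for that conversion is a small helper applied to anything handed to `ioctx.write`. This is a minimal sketch, not the change that was actually merged: the name `to_bytes` and the UTF-8 default are assumptions made for illustration.

```python
def to_bytes(data, encoding="utf-8"):
    """Return data as bytes (hypothetical helper, not part of DosNa or rados)."""
    if isinstance(data, bytes):
        # Already bytes; this also covers Python 2 str, which is a byte string.
        return data
    if isinstance(data, (bytearray, memoryview)):
        # Other binary buffers are copied into an immutable bytes object.
        return bytes(data)
    if isinstance(data, str):
        # Python 3 text must be encoded explicitly.
        return data.encode(encoding)
    raise TypeError("cannot convert %s to bytes" % type(data).__name__)


print(to_bytes("signature"))
print(to_bytes(b"signature"))
```

With such a helper the failing call would become something like `self.ioctx.write(name, to_bytes(_SIGNATURE))`. Data read back from the object store is likely to arrive as bytes as well, so any comparison against a str signature would probably need the same conversion on one side.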