int-brain-lab / iblenv

Unified environment and Issue tracker for all IBL
MIT License

[Bug report] - TypeError: a bytes-like object is required, not 'str' #305

Open dervinism opened 2 years ago

dervinism commented 2 years ago

I am trying out ONE and ran a basic test similar to number 14 described in https://int-brain-lab.github.io/ONE/notebooks/one_quickstart.html. I ran the following code:

from one.api import ONE
one = ONE(base_url='http://localhost:8000', username='admin', password='admin', silent=True)
eid = '9c8bda9f-2319-4697-b762-2f719deab26d'
one.list_datasets(eid)

and get this error:

$ /home/user/anaconda3/envs/iblenv/bin/python /home/user/pythonCode/reproduceTypeError.py
Failed to load the remote cache file
Traceback (most recent call last):
  File "/home/user/pythonCode/reproduceTypeError.py", line 5, in <module>
    one.list_datasets(eid)
  File "/home/user/anaconda3/envs/iblenv/lib/python3.10/site-packages/one/util.py", line 169, in wrapper
    return method(self, *args, **kwargs)
  File "/home/user/anaconda3/envs/iblenv/lib/python3.10/site-packages/one/api.py", line 1390, in list_datasets
    _, datasets = util.ses2records(self.alyx.rest('sessions', 'read', id=eid))
  File "/home/user/anaconda3/envs/iblenv/lib/python3.10/site-packages/one/util.py", line 76, in ses2records
    datasets = pd.DataFrame(records).set_index(index).sort_index()
  File "/home/user/anaconda3/envs/iblenv/lib/python3.10/site-packages/pandas/core/frame.py", line 710, in __init__
    data = list(data)
  File "/home/user/anaconda3/envs/iblenv/lib/python3.10/site-packages/one/util.py", line 65, in _to_record
    file_path = urllib.parse.urlsplit(d['data_url'], allow_fragments=False).path.strip('/')
TypeError: a bytes-like object is required, not 'str'

The computer is running Ubuntu 20.04, and iblenv was installed following the instructions at https://github.com/int-brain-lab/iblenv. Could you please advise on how I could fix this issue? The method is supposed to accept strings, yet it complains about receiving one. Many thanks!

dervinism commented 2 years ago

util.py requires data_url which is None and that seems to be causing this error. How do I set data_url in the Alyx database? I don't see it under Datasets. There is one under Data Repositories but setting it does not have any effect on the data_url parameter in util.py

dervinism commented 2 years ago

Specifying the data_url directly in util.py worked. So the problem is how to set it properly in Alyx.
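For reference, the original TypeError can be reproduced with the standard library alone. The sketch below assumes the failing line in one/util.py behaves like the expression shown in the traceback; the helper names file_path_from_url and safe_file_path are hypothetical and not part of ONE. When data_url is None, urlsplit appears to treat the input as non-str and return a bytes-based result, so the subsequent .strip('/') call with a str argument fails.

```python
import urllib.parse


def file_path_from_url(data_url):
    """Hypothetical helper mirroring the expression shown in the traceback."""
    return urllib.parse.urlsplit(data_url, allow_fragments=False).path.strip('/')


def safe_file_path(data_url):
    """Hypothetical guard: skip datasets that have no data_url set."""
    if data_url is None:
        return None
    return file_path_from_url(data_url)


# A str URL works as expected.
print(file_path_from_url('http://localhost:8000/a/b.npy'))

# A None URL raises TypeError, as in the report.
try:
    file_path_from_url(None)
except TypeError as err:
    print('TypeError:', err)

# The guarded version returns None instead of raising.
print(safe_file_path(None))
```

This only illustrates why the message mentions bytes even though no bytes were passed; the actual fix is still to set data_url on the Alyx side.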

dervinism commented 2 years ago

However, running the following code:

object = 'spikes'
spikes = one.load_object(eid, object, attribute='amps')
spikes

gives a new error:

/home/user/anaconda3/envs/iblenv/lib/python3.10/site-packages/pandas/core/common.py:241: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray.
  result = np.asarray(values, dtype=dtype)
Traceback (most recent call last):
  File "/home/user/pythonCode/alyxRetrieveFiles.py", line 23, in <module>
    spikes = one.load_object(eids, object, attribute='amps')
  File "/home/user/anaconda3/envs/iblenv/lib/python3.10/site-packages/one/util.py", line 170, in wrapper
    return method(self, *args, **kwargs)
  File "/home/user/anaconda3/envs/iblenv/lib/python3.10/site-packages/one/util.py", line 154, in wrapper
    return method(self, eid, *args, **kwargs)
  File "/home/user/anaconda3/envs/iblenv/lib/python3.10/site-packages/one/api.py", line 795, in load_object
    files = self._check_filesystem(datasets, offline=offline)
  File "/home/user/anaconda3/envs/iblenv/lib/python3.10/site-packages/one/api.py", line 407, in _check_filesystem
    self._cache['datasets'].loc[(slice(None), i), 'exists'] = not rec['exists']
  File "/home/user/anaconda3/envs/iblenv/lib/python3.10/site-packages/pandas/core/indexing.py", line 712, in __setitem__
    indexer = self._get_setitem_indexer(key)
  File "/home/user/anaconda3/envs/iblenv/lib/python3.10/site-packages/pandas/core/indexing.py", line 661, in _get_setitem_indexer
    return self._convert_tuple(key)
  File "/home/user/anaconda3/envs/iblenv/lib/python3.10/site-packages/pandas/core/indexing.py", line 799, in _convert_tuple
    idx = self._convert_to_indexer(k, axis=i)
  File "/home/user/anaconda3/envs/iblenv/lib/python3.10/site-packages/pandas/core/indexing.py", line 1291, in _convert_to_indexer
    return self._get_listlike_indexer(key, axis)[1]
  File "/home/user/anaconda3/envs/iblenv/lib/python3.10/site-packages/pandas/core/indexing.py", line 1327, in _get_listlike_indexer
    keyarr, indexer = ax._get_indexer_strict(key, axis_name)
  File "/home/user/anaconda3/envs/iblenv/lib/python3.10/site-packages/pandas/core/indexes/base.py", line 5777, in _get_indexer_strict
    indexer = self.get_indexer_for(keyarr)
  File "/home/user/anaconda3/envs/iblenv/lib/python3.10/site-packages/pandas/core/indexes/base.py", line 5764, in get_indexer_for
    return self.get_indexer(target)
  File "/home/user/anaconda3/envs/iblenv/lib/python3.10/site-packages/pandas/core/indexes/base.py", line 3784, in get_indexer
    return self._get_indexer(target, method, limit, tolerance)
  File "/home/user/anaconda3/envs/iblenv/lib/python3.10/site-packages/pandas/core/indexes/base.py", line 3809, in _get_indexer
    indexer = self._engine.get_indexer(tgt_values)
  File "pandas/_libs/index.pyx", line 305, in pandas._libs.index.IndexEngine.get_indexer
  File "pandas/_libs/hashtable_class_helper.pxi", line 5247, in pandas._libs.hashtable.PyObjectHashTable.lookup
TypeError: unhashable type: 'slice'

k1o0 commented 2 years ago

> util.py requires data_url which is None and that seems to be causing this error. How do I set data_url in the Alyx database? I don't see it under Datasets. There is one under Data Repositories but setting it does not have any effect on the data_url parameter in util.py

Here's a link to our admin interface guide: https://docs.google.com/document/d/1cx3XLZiZRh3lUzhhR_p65BggEqTKpXHUDkUDagvf9Kc/edit?usp=sharing

You need to log into the admin interface, navigate to Data > Data repository, and add your data repository. Make sure the data_url field is set. In the Python ONE API, only HTTPS URLs are supported at the moment (you can also download via AWS boto3 from S3 buckets).
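To confirm that the setting has taken effect, one option is to inspect the session record returned by Alyx and list any datasets that still lack a data_url. This is a minimal sketch, assuming the record has a data_dataset_session_related list whose entries carry name and data_url keys; the helper datasets_without_url and the sample record are hypothetical.

```python
def datasets_without_url(session):
    """Return the names of datasets in a session record whose data_url is unset."""
    datasets = session.get('data_dataset_session_related', [])
    return [d.get('name') for d in datasets if not d.get('data_url')]


# Hypothetical record for illustration. With a live connection it would come
# from one.alyx.rest('sessions', 'read', id=eid), as in the traceback above.
session = {
    'data_dataset_session_related': [
        {'name': 'spikes.amps.npy',
         'data_url': 'https://example.org/repo/spikes.amps.npy'},
        {'name': 'spikes.times.npy', 'data_url': None},
    ]
}

missing = datasets_without_url(session)
print(missing)  # ['spikes.times.npy']
```

An empty list would suggest that every dataset in the session resolves to a repository with data_url set.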

Also make sure you're using the latest version of ONE:

import one
print(one.__version__)

You should be on version 1.10. You can update by running pip install -U ONE-api

dervinism commented 2 years ago

The version is 1.10. What form should the data_url field take? If I enter home/user/Data/defaultlab/Subjects, Alyx complains that it is not a valid URL. If I enter http://localhost:8000/admin/data/datarepository/01cc4148-8ee9-4f0c-b766-b194f7903e5e, it doesn't complain, but data_url is still None in Python.

dervinism commented 2 years ago

Setting data_url to http://localhost:8000/admin/data/datarepository/repositoryName worked. However, the TypeError: unhashable type: 'slice' remains.

dervinism commented 2 years ago

Any idea what could be the cause of this error? @k1o0

dervinism commented 2 years ago

It seems to be caused by updating the exists field in api.py, lines 406-407:

if update_exists:
    self._cache['datasets'].loc[(slice(None), i), 'exists'] = not rec['exists']
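One possible explanation, offered as an assumption and not confirmed anywhere in this thread, is that the cached datasets table has a single-level index while the line above expects a two-level (eid, id) MultiIndex. The sketch below uses made-up index labels to show that the same .loc pattern works on a MultiIndex and fails on a flat index. The exact exception type may vary with the pandas and Python versions, so it is caught broadly.

```python
import pandas as pd

# Two-level index: a (slice(None), label) key selects on the second level.
multi = pd.DataFrame(
    {'exists': [True, True, True]},
    index=pd.MultiIndex.from_tuples(
        [('eid-1', 'id-a'), ('eid-1', 'id-b'), ('eid-2', 'id-c')],
        names=['eid', 'id'],
    ),
)
multi.loc[(slice(None), 'id-b'), 'exists'] = False
print(multi['exists'].tolist())

# Single-level index: the same key is treated as a list of labels, and the
# slice object cannot be looked up as a label.
flat = pd.DataFrame(
    {'exists': [True, True]},
    index=pd.Index(['id-a', 'id-b'], name='id'),
)
flat_error = None
try:
    flat.loc[(slice(None), 'id-a'), 'exists'] = False
except Exception as err:  # the exception type depends on the versions in use
    flat_error = err
print(type(flat_error).__name__, flat_error)
```

If this is indeed the cause, checking self._cache['datasets'].index.nlevels would show whether the cache table has the index shape that the code expects.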

dervinism commented 2 years ago

I tried leaving the data_url parameter empty in the associated data repository, which gives the following error (similar to the one initially reported):

Failed to load the remote cache file
<one.util.LazyId object at 0x7fd529a47fd0>
[{'date': '2019-01-22',
  'lab': 'defaultlab',
  'number': 1,
  'project': 'Infraslow neural activity dynamics',
  'start_time': '2019-01-22T21:30:30',
  'subject': 'M190114AMD',
  'task_protocol': '',
  'url': 'http://localhost:8000/sessions/9c8bda9f-2319-4697-b762-2f719deab26d'},
 {'date': '2019-01-22',
  'lab': 'defaultlab',
  'number': 1,
  'project': 'Infraslow neural activity dynamics',
  'start_time': '2019-01-22T19:02:51',
  'subject': 'M190114AMD3',
  'task_protocol': '3',
  'url': 'http://localhost:8000/sessions/a1109fe5-3a1a-4c6f-8f22-4d992ab9aef2'}]
Traceback (most recent call last):
  File "/home/user/pythonCode/alyxRetrieveFiles.py", line 20, in <module>
    lists = one.list_datasets(eids[1])
  File "/home/user/anaconda3/envs/iblenv/lib/python3.10/site-packages/one/util.py", line 170, in wrapper
    return method(self, *args, **kwargs)
  File "/home/user/anaconda3/envs/iblenv/lib/python3.10/site-packages/one/api.py", line 1390, in list_datasets
    _, datasets = util.ses2records(self.alyx.rest('sessions', 'read', id=eid))
  File "/home/user/anaconda3/envs/iblenv/lib/python3.10/site-packages/one/util.py", line 77, in ses2records
    datasets = pd.DataFrame(records).set_index(index).sort_index()
  File "/home/user/anaconda3/envs/iblenv/lib/python3.10/site-packages/pandas/core/frame.py", line 710, in __init__
    data = list(data)
  File "/home/user/anaconda3/envs/iblenv/lib/python3.10/site-packages/one/util.py", line 65, in _to_record
    file_path = urllib.parse.urlsplit(d['data_url'], allow_fragments=False).path.strip('/')
TypeError: a bytes-like object is required, not 'str'