HDF5 - Githubissues

Jannatul-Ferdous-Srabonee commented 5 years ago

I am trying to do model training with HDF5. I am using Python 3.6. My code is:

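import contextlib

import h5py


def hdf5_handler(filename, mode="r"):
    # Make sure the file exists before opening it through the low-level API.
    h5py.File(filename, "a").close()
    # Build a file-access property list with the raw chunk cache disabled.
    propfaid = h5py.h5p.create(h5py.h5p.FILE_ACCESS)
    settings = list(propfaid.get_cache())
    settings[1] = 0
    settings[2] = 0
    propfaid.set_cache(*settings)
    with contextlib.closing(h5py.h5f.open(filename, fapl=propfaid)) as fid:
        return h5py.File(fid, mode)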

The function call:

hdf5 = hdf5_handler("./data/abide.hdf5".encode('utf-8'), "a".encode('utf-8'))

I get the following error:

Traceback (most recent call last):
  File "prepare_data.py", line 139, in <module>
    prepare_folds(hdf5, folds, pheno, derivatives, experiment="{derivative}_whole")
  File "prepare_data.py", line 83, in prepare_folds
    fold["train"] = ids[train_index].tolist()
  File "C:\Users\jfsra\AppData\Local\Programs\Python\Python36\lib\site-packages\h5py\_hl\group.py", line 385, in __setitem__
    ds = self.create_dataset(None, data=obj, dtype=base.guess_dtype(obj))
  File "C:\Users\jfsra\AppData\Local\Programs\Python\Python36\lib\site-packages\h5py\_hl\group.py", line 136, in create_dataset
    dsid = dataset.make_new_dset(self, shape, dtype, data, **kwds)
  File "C:\Users\jfsra\AppData\Local\Programs\Python\Python36\lib\site-packages\h5py\_hl\dataset.py", line 118, in make_new_dset
    tid = h5t.py_create(dtype, logical=1)
  File "h5py\h5t.pyx", line 1630, in h5py.h5t.py_create
  File "h5py\h5t.pyx", line 1652, in h5py.h5t.py_create
  File "h5py\h5t.pyx", line 1713, in h5py.h5t.py_create
TypeError: No conversion path for dtype: dtype('<U16')

meneguzzi commented 5 years ago

I reckon we need a lot more context here.

Did you download the dataset?
This code was originally developed in Python 2.7. Are you sure there are no problems with the other libraries we use to load data?

Jannatul-Ferdous-Srabonee commented 5 years ago

Yes, I have downloaded the dataset and also made some changes for Python 3.6.

(Email reply to Felipe Meneguzzi's comment of Wed, 10 Apr 2019, 10:35 pm; the quoted comment appears above.)

meneguzzi commented 5 years ago

This seems like an issue with the encoding at some point, but I'd need to debug your code to see if it works. Are you sure this is not an issue with the other libraries?

ShekharDewan commented 5 years ago

Running into a similar issue. What Jannatul seems to have done is change the encoding of the file in the read function, and I am wondering whether a different modification would work:

(tf-gpu) E:\ABIDE\Acerta-abide\acerta-abide>python prepare_data.py \

Traceback (most recent call last):
  File "prepare_data.py", line 128, in <module>
    hdf5 = hdf5_handler("./data/abide.hdf5", "a")
  File "E:\ABIDE\Acerta-abide\acerta-abide\utils.py", line 50, in hdf5_handler
    with contextlib.closing(h5py.h5f.open(filename, fapl=propfaid)) as fid:
  File "h5py\_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py\_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py\h5f.pyx", line 72, in h5py.h5f.open
TypeError: expected bytes, str found

ShekharDewan commented 5 years ago

I was able to fix the above issue with the following modification:

Original: hdf5 = hdf5_handler("./data/abide.hdf5", "a")

Modified:

hdf5 = hdf5_handler( bytes("./data/abide.hdf5", encoding = 'utf-8'), "a")

But this resulted in the following error, which is very similar to what the author posted:

(tf-gpu) E:\ABIDE\Acerta-abide\acerta-abide>python prepare_data.py --whole --male --threshold --folds 10 cc200

Preparing whole dataset
Traceback (most recent call last):
  File "prepare_data.py", line 139, in <module>
    prepare_folds(hdf5, folds, pheno, derivatives, experiment="{derivative}_whole")
  File "prepare_data.py", line 83, in prepare_folds
    fold["train"] = ids[train_index].tolist()
  File "C:\ProgramData\Anaconda3\envs\tf-gpu\lib\site-packages\h5py\_hl\group.py", line 385, in __setitem__
    ds = self.create_dataset(None, data=obj, dtype=base.guess_dtype(obj))
  File "C:\ProgramData\Anaconda3\envs\tf-gpu\lib\site-packages\h5py\_hl\group.py", line 136, in create_dataset
    dsid = dataset.make_new_dset(self, shape, dtype, data, **kwds)
  File "C:\ProgramData\Anaconda3\envs\tf-gpu\lib\site-packages\h5py\_hl\dataset.py", line 118, in make_new_dset
    tid = h5t.py_create(dtype, logical=1)
  File "h5py\h5t.pyx", line 1630, in h5py.h5t.py_create
  File "h5py\h5t.pyx", line 1652, in h5py.h5t.py_create
  File "h5py\h5t.pyx", line 1713, in h5py.h5t.py_create
TypeError: No conversion path for dtype: dtype('<U16')
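
The earlier `expected bytes, str found` error comes from the low-level `h5py.h5f.open` call, which in the traceback above rejects a `str` filename under Python 3. As a rough sketch, the conversion could live in one small helper instead of at each call site. The name `as_bytes_path` is hypothetical and not part of the repository; `os.fsencode` is from the Python standard library.

```python
import os


def as_bytes_path(path):
    """Return `path` as bytes, leaving bytes input unchanged.

    Hypothetical helper (not in acerta-abide): os.fsencode accepts str,
    bytes, or os.PathLike and always returns bytes.
    """
    return os.fsencode(path)


# Both spellings yield a bytes filename.
print(as_bytes_path("./data/abide.hdf5"))
print(as_bytes_path(b"./data/abide.hdf5"))
```

One possible use, untested against the repository, is to pass `as_bytes_path(filename)` only to `h5py.h5f.open` inside `hdf5_handler`, so callers can keep passing an ordinary string. This addresses the filename error only, not the `dtype('<U16')` error.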

ShekharDewan commented 5 years ago

@meneguzzi Tagging you in case github is not notifying correctly.

meneguzzi commented 4 years ago

Sorry about the long time to respond, I've been extremely busy with the teaching semester. So, your change did not improve running the algorithm, right? If you did solve the issue, can you try doing a pull request?

Siddharth-Shrivastava7 commented 4 years ago

In the 'prepare_data.py' file, change the fold assignments in the prepare_folds function to:

            fold['train'] = [ind.encode('utf8') for ind in ids[train_index]]
            fold['valid'] = [indv.encode('utf8') for indv in ids[valid_index]]
            fold["test"] = [indt.encode('utf8') for indt in ids[test_index]]

It works!
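
The `No conversion path for dtype: dtype('<U16')` error in the tracebacks earlier in this thread appears when the subject IDs reach h5py as NumPy Unicode strings. The change in this comment encodes each ID to bytes first. Here is a minimal sketch of that conversion, using a plain Python list in place of the real `ids` array. The helper name `encode_ids` and the sample IDs are made up for illustration.

```python
def encode_ids(ids, encoding="utf8"):
    """Encode each str ID to bytes; pass bytes through unchanged.

    Hypothetical helper mirroring the list comprehensions in the fix.
    """
    return [i.encode(encoding) if isinstance(i, str) else i for i in ids]


# Made-up sample IDs standing in for ids[train_index].
sample = ["0050002", "0050003"]
print(encode_ids(sample))  # [b'0050002', b'0050003']
```

With such a helper the three assignments could read `fold["train"] = encode_ids(ids[train_index])`, and likewise for `valid` and `test`. Values read back from the file will then likely be bytes, so code that compares them with `str` IDs may need a matching `.decode('utf8')`.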

lsa-pucrs / acerta-abide

HDF5 #8