giesselmann / STRique

Nanopore raw signal repeat detection pipeline
MIT License
45 stars, 10 forks

Error in index #35

HLHsieh closed this issue 1 year ago

HLHsieh commented 1 year ago

Hi there,

I followed the instructions to install STRique in a separate virtual environment and verified that everything works.

I then ran the index command as follows:

python3 $script index --recursive --out_prefix ${output_folder} ${input_folder}/ > ${output_folder}/${myseq}.fofn

I got the following error message:

[ERROR] Failed to open /scratch/stimulated_test/C9ORF72_c1_deep_simulator_read_5x/fast5/signal_23610_2d253b4e-3df3-4f1f-8a73-0be882763c81.fast5, skip file for indexing
Traceback (most recent call last):
  File "/home/bin/STRique/scripts/STRique.py", line 1030, in <module>
    main()
  File "/home/bin/STRique/scripts/STRique.py", line 890, in __init__
    getattr(self, args.command)(sys.argv[2:])
  File "/home/bin/STRique/scripts/STRique.py", line 899, in index
    for record in fast5Index.fast5Index.index(args.input, recursive=args.recursive, output_prefix=args.out_prefix, tmp_prefix=args.tmp_prefix):
  File "/home/venv/STR/lib/python3.8/site-packages/STRique-0.4.2-py3.8-linux-x86_64.egg/STRique_lib/fast5Index.py", line 175, in index
    ID = fast5Index.get_ID_single(input_file)
  File "/home/venv/STR/lib/python3.8/site-packages/STRique-0.4.2-py3.8-linux-x86_64.egg/STRique_lib/fast5Index.py", line 65, in get_ID_single
    return str(f5["/Raw/" + s.rpartition('/')[0]].attrs['read_id'], 'utf-8')
TypeError: decoding str is not supported

I therefore removed 'utf-8' on line 65 of fast5Index.py, so that the line reads:

return str(f5["/Raw/" + s.rpartition('/')[0]].attrs['read_id'])

That fixed the problem.

Do you have any suggestions or comments on this change? Do I need to remove the other 'utf-8' arguments in this file as well?

Thank you!

Best Regards, Hsin

giesselmann commented 1 year ago

Hi, technically this means that the 'read_id' attribute from the HDF5 file is no longer returned as a bytes object in Python. The reason is unclear to me; it could be a version change in Python itself or in the HDF5 library.
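
As a rough illustration of the point above, here is a minimal sketch of a version-tolerant way to read the attribute. The helper name decode_read_id is hypothetical and not part of STRique, and the sample ID is just a fragment of the one in the traceback. It assumes the attribute value arrives either as bytes (the case the original code expects) or as str (the case that triggers the TypeError).

```python
def decode_read_id(value):
    """Return a read ID as str, whether the HDF5 layer returned bytes or str.

    Hypothetical helper for illustration; not part of STRique.
    """
    if isinstance(value, bytes):
        # Older behaviour: the attribute comes back as raw bytes.
        return value.decode('utf-8')
    # Newer behaviour: the attribute is already a str (or a str subclass).
    return str(value)


# str() with an encoding argument only accepts bytes-like input,
# which is why the original line fails once the attribute is a str.
try:
    str('2d253b4e', 'utf-8')
    raised = False
except TypeError:
    raised = True

as_bytes = decode_read_id(b'2d253b4e')
as_str = decode_read_id('2d253b4e')
print(raised, as_bytes, as_str)
```

Note that simply dropping 'utf-8' works only as long as the attribute is a str: calling str() on a bytes object without an encoding returns its printable representation (for example "b'2d253b4e'") rather than the decoded text, so a type check like the one above may be the safer change if both library versions need to be supported.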