giesselmann / STRique

Nanopore raw signal repeat detection pipeline
MIT License

Error STRique count #32

Closed Cesco16 closed 1 year ago

Cesco16 commented 2 years ago

Hi @giesselmann, I think your tool is very interesting and I would like to use it. I tried the Docker version of the tool, but I ran into some errors that I do not fully understand. My data are as follows:

fast5_pass: directory containing 484 .fast5 files produced by MinKNOW
my_sample.bam: reads aligned with minimap2
my_config.tsv: TSV file with my own regions of interest

I am running it on a PC with Windows 10.

I first started the Docker container with the command:

docker run -it --mount type=bind,source=$(pwd),target=/host/users/lenovo/desktop giesselmann/strique

Then I ran the indexing step:

python3 app/scripts/STRique.py index --recursive host/users/lenovo/desktop/my_sample/fast5_pass > host/users/lenovo/desktop/my_sample/fast5_pass/reads.fofn

When I ran the counting step:

cat host/users/lenovo/desktop/my_sample/my_sample.bam | python3 app/scripts/STRique.py count host/users/lenovo/desktop/my_sample/fast5_pass/reads.fofn app/models/r9_4_450bps.model host/users/lenovo/desktop/my_config.tsv > host/users/lenovo/desktop/my_sample/result.tsv

I got the following error many times:

[PID 61] [WARNING] Factory: Unexpected error in Worker, proceeding wiht remaining reads.
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/STRique-0.4.2-py3.6-linux-x86_64.egg/STRique_lib/fast5Index.py", line 81, in __get_raw__
    signal = fp[os.path.join(offset, 'Raw', s)][()]
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "/usr/local/lib/python3.6/dist-packages/h5py/_hl/dataset.py", line 787, in __getitem__
    self.id.read(mspace, fspace, arr, mtype, dxpl=self._dxpl)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5d.pyx", line 192, in h5py.h5d.DatasetID.read
  File "h5py/_proxy.pyx", line 112, in h5py._proxy.dset_rw
OSError: Can't read data (can't open directory: /usr/local/hdf5/lib/plugin)

Followed by:

Could not retrieve bc752507-c038-4be3-bf31-93983f4a7ad6 from file host/users/lenovo/desktop/my_sample/fast5_pass/FAO49405_pass_c7cb835e_326.fast5.

And, after those, many errors like:

10.11.2022 19:47:25 [PID 90] [ERROR] Detector: Error parsing alignment b734757f-310d-4599-aa25-f8fae95a25c1 4 0 0 * 0 0 TGCCTTCTAGTTTCAGTTACATCCATGCTCTATCTTCTGCTGGGATTACGGCATGACACACTTAAACATTTTCTTTATTTTTAATATGTTTCTTTCTTCTTCTTCTTCTTCTTTTTTTTTTTTTTTTTTTGTATTTTTAGTAGATATGGGTTTCACCATGTTGGCCAGGATAGTCTTGAACTCCTGACCTCAGGTGATCCACCTGCCTAGGCCTCCCAAAGTGCTGAGATTATGGGCGTGAGCTACCGCGTCCTGCCAGGAAATCCATTTTCTAAGTCTAACTTTTAAGCACTGTACCTTAATCCCTGAAGC ''('&(/)&%%'','&'++,&%$$$$%&31/0...+))3821.--(%$&*+&'(*(%%%%)4:9;887775,(&&%%&''(%$%,,3576---.1225570-4,022369@ADECDB;?55:?CGCG20@>6)&$%')-3D0///48;?5+,--0.''%&((8@?>52./+()((,18;<>AE>8445)'&&(+.../@A<95542352/-07781((()+++,54((((55:==4348846617:;=<:989/4.-,+,5))7>?7@@?<<=;54)'''''&&'()('&$$ rl:i:141

How can I deal with such errors? Thank you very much in advance!
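In the logged record above, the second field is 4, which in the SAM format is the flag bit for an unmapped read. The following is a minimal sketch, assuming (this is not confirmed anywhere in the thread) that the "Error parsing alignment" messages come from such unmapped records and that the input is tab-separated SAM text. The function name `mapped_records` is hypothetical and not part of STRique.

```python
UNMAPPED = 0x4  # SAM flag bit: read is unmapped


def mapped_records(lines):
    """Yield SAM header lines and records whose flag lacks the unmapped bit.

    Assumption: `lines` is an iterable of tab-separated SAM text lines.
    """
    for line in lines:
        if line.startswith("@"):
            yield line  # header lines are passed through unchanged
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # fewer than the 11 mandatory SAM fields, skip
        try:
            flag = int(fields[1])
        except ValueError:
            continue  # flag field is not an integer, skip
        if not flag & UNMAPPED:
            yield line


# Small in-memory demonstration with one mapped and one unmapped record.
sample = [
    "@HD\tVN:1.6\n",
    "read1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII\n",
    "read2\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII\n",
]
for record in mapped_records(sample):
    print(record, end="")
```

The same filtering can be done with `samtools view -F 4`, which excludes records that have the unmapped bit set.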

giesselmann commented 2 years ago

Hi, it looks like the VBZ compression plugin is not found. There might be ways to install it on the host, but I'm not sure how to do that under Windows + Docker. I would recommend installing the Windows Subsystem for Linux (WSL) from Microsoft and setting up a basic Ubuntu for STRique (reinstall from source there). You can then follow the help/hints in #29. Sorry I can't be of more help; I'm in a new job and can't maintain this package any longer.
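One way to check whether a plugin is visible is to look in the directories HDF5 searches for filter plugins. This is a minimal sketch using only the standard library. It assumes the search path comes from the `HDF5_PLUGIN_PATH` environment variable, falls back to the directory named in the error message, and assumes the plugin file has "vbz" in its name. The helper `find_vbz_plugins` is hypothetical.

```python
import os
from pathlib import Path

# Directory named in the OSError above; used when HDF5_PLUGIN_PATH is unset.
DEFAULT_PLUGIN_DIR = "/usr/local/hdf5/lib/plugin"


def find_vbz_plugins(plugin_path=None):
    """Return files with 'vbz' in their name from the plugin directories.

    Assumption: the VBZ plugin library can be recognised by its file name.
    """
    if plugin_path is None:
        plugin_path = os.environ.get("HDF5_PLUGIN_PATH", DEFAULT_PLUGIN_DIR)
    hits = []
    for directory in plugin_path.split(os.pathsep):
        if not directory:
            continue  # ignore empty path entries
        candidate = Path(directory)
        if candidate.is_dir():
            hits.extend(
                sorted(p for p in candidate.iterdir() if "vbz" in p.name.lower())
            )
    return hits


found = find_vbz_plugins()
print(found if found else "no vbz plugin found on the HDF5 plugin path")
```

If nothing is found, pointing `HDF5_PLUGIN_PATH` at the directory that holds the installed plugin before running the count step is the usual next thing to try.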

Cesco16 commented 1 year ago

Hi @giesselmann,

thank you for the suggestion! I am now using WSL (Ubuntu 20.04). I followed all the steps on the installation page and, before using STRique on my data, created a conda environment with the ont_vbz_hdf_plugin. I ran index and count; both worked and successfully created the output file.

But then I ran into another problem. When I tried to run STRique again, I got this error message:

Traceback (most recent call last):
  File "STRique/scripts/STRique.py", line 49, in <module>
    from STRique_lib import fast5Index, pyseqan
ModuleNotFoundError: No module named 'STRique_lib'

Any ideas on how to solve this error? Thank you very much!
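A `ModuleNotFoundError` that appears after a previously working run often means the script is being started by a different Python interpreter or environment than the one the package was installed into. The thread does not confirm that this is the cause here. As a minimal diagnostic sketch, the snippet below prints which interpreter is running and whether it can locate `STRique_lib`. The helper name `module_available` is hypothetical.

```python
import importlib.util
import sys


def module_available(name):
    """Return True if the current interpreter can locate a top-level module."""
    return importlib.util.find_spec(name) is not None


# Run this with the same `python3` that is used to launch STRique.py.
print("interpreter:", sys.executable)
print("STRique_lib importable:", module_available("STRique_lib"))
```

If the second line reports `False`, comparing `sys.executable` with the interpreter of the environment where STRique was installed shows whether the wrong environment is active.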