hiruna72 / squigualiser

Visualise and analyse nanopore (ONT) raw signals
https://hiruna72.github.io/squigualiser/
MIT License

Problems about plot #18

Closed JeremyQuo closed 1 year ago

JeremyQuo commented 1 year ago

Thanks for your help. Recently I have been using squigualiser to check my sequencing data, and I found a bug in the nucleotide labels when plotting.

Here is my command:

squigualiser plot -f final.fastq -s file.blow5 -a out.paf -o new_read -r 0eaa68b9-5989-44ca-8e00-52eba6ba3ccb --rna --region 1-100

When I changed the region to 1-200, something went wrong. The signal is correct and extends to 200 bases, but the base labels are wrong. Because the direction is 5' to 3', the last 100 bases on the x-axis should be the same for 200-1 and 100-1. Instead, the first 100 base labels are the same in both plots, so they are attached to different signals.

I guess this is because the first x-axis position is always labelled with the last base in the sequence. That is correct when plotting the whole sequence, but not when plotting a region. So I think a new way of labelling the nucleotides is required, for example slicing with something like sequence[-200:-1].
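To illustrate the labelling suggestion, here is a minimal sketch. It is not squigualiser's actual code: `region_labels` is a hypothetical helper, and it assumes the region is 1-based and inclusive and that bases are drawn from the highest position down (the 200-1 style axis described above). Note that a slice such as sequence[-200:-1] would leave out the final base, so the sketch slices by the region bounds and then reverses.

```python
def region_labels(sequence: str, start: int, end: int) -> list[tuple[int, str]]:
    """Return (position, base) pairs for a 1-based inclusive region,
    highest position first (hypothetical helper, for illustration only)."""
    if not 1 <= start <= end <= len(sequence):
        raise ValueError(f"region {start}-{end} is outside 1-{len(sequence)}")
    # Slice the requested region first, then reverse it, so each label stays
    # tied to its own position rather than to the end of the full read.
    bases = sequence[start - 1:end]
    positions = range(start, end + 1)
    return list(zip(reversed(positions), reversed(bases)))


seq = "ACGUACGUAC"  # toy sequence, not taken from the thread
wide = region_labels(seq, 1, 6)
narrow = region_labels(seq, 1, 3)
print(wide)
print(narrow)
```

With this approach, the labels for positions 3-1 are identical whether the region is 1-3 or 1-6, which is the behaviour the report expects for 1-100 versus 1-200.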

hiruna72 commented 1 year ago

Hi @JeremyQuo,

Thank you for reporting this bug. I fixed a couple of bugs recently on the dev branch, and this sounds like one of them.

Can you please try the latest dev? I recommend cloning a new copy and using conda to create an env with python3.9 (instructions are available in the README on the dev branch).

JeremyQuo commented 1 year ago

Thanks for the information. I tried the latest version on the dev branch. It fixed the previous bug when I set the region to 1-100 or 1-200. But I found another bug when I set a region that covers the last part of my read.

For example, my read's query length is 10000 and the length of the move table is 9996, which means the first 4 bases will be dropped and the signal will only cover 9996 bases. So when I set 1-100, it returns 100-5, which is correct.

When I set 9950-10000, it should return 10000-9950, and the first base plotted should be the last base in my read. But the title shows 9996-9950, and the first base is not the last base in my read: its index is -4, not -1.

I don't know whether my understanding is correct or if there is any problem with the Squigualiser plot.

Thanks

JeremyQuo commented 1 year ago

Or, put another way: if it starts from 1+4, it should end at 9996+4.
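The arithmetic in the two comments above can be written out as a short sketch. This encodes the reporter's reading only (the first bases of the read have no signal, so the covered interval is shifted by the number of dropped bases); it is not confirmed as how squigualiser computes regions, and `covered_region` is a hypothetical helper.

```python
def covered_region(query_len: int, n_dropped: int, start: int, end: int) -> tuple[int, int]:
    """Clip a 1-based inclusive region to the bases that have signal.

    Assumption: the first n_dropped bases have no signal, so the covered
    interval is n_dropped + 1 .. query_len.
    """
    first_covered = n_dropped + 1
    clipped_start = max(start, first_covered)
    clipped_end = min(end, query_len)
    if clipped_start > clipped_end:
        raise ValueError("region does not overlap the covered bases")
    return clipped_start, clipped_end


query_len, move_len = 10000, 9996
n_dropped = query_len - move_len  # 4 bases without signal
low = covered_region(query_len, n_dropped, 1, 100)
high = covered_region(query_len, n_dropped, 9950, 10000)
print(low, high)
```

Under this reading, region 1-100 is clipped to 5-100 (plotted as 100-5) and region 9950-10000 stays 9950-10000 (plotted as 10000-9950), rather than ending at 9996.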

hiruna72 commented 1 year ago

Hi @JeremyQuo,

I think I get what you are saying. However, I cannot reproduce the bug with my test data. Can you please send me a minimal dataset and the commands to reproduce this bug? I really appreciate your support.

JeremyQuo commented 1 year ago

Sorry for the late reply. Here is my test data. It should cover bases 6-620, but things don't seem right. When I set the region to 500-650, it only covers 500-615. When I set the region to 1-100, a bug appears. Here are the command and its output:

squigualiser plot -f final.fastq -s file.blow5 -a file.paf -o result --rna --region 1-300

sequence file: final.fastq
alignment file: file.paf
signal file: file.blow5
Info: Signal to read method using PAF ...
plot region: 56-300 read_id: 562eeb47-2b86-4fc7-abfc-5dce62f511ed
BokehUserWarning: ColumnDataSource's columns must be of the same length. Current lengths: ('line_color', 8315), ('x', 8000), ('y', 8000)
BokehUserWarning: ColumnDataSource's columns must be of the same length. Current lengths: ('fill_color', 8315), ('line_color', 8315), ('x', 8000), ('y', 8000)
BokehUserWarning: ColumnDataSource's columns must be of the same length. Current lengths: ('fill_color', 8315), ('hatch_color', 8315), ('line_color', 8315), ('x', 8000), ('y', 8000)
plot region: 1-300 read_id: 562eeb47-2b86-4fc7-abfc-5dce62f511ed
Traceback (most recent call last):
  File "/home/zhguo/Program/anaconda3/envs/venv3/bin/squigualiser", line 33, in <module>
    sys.exit(load_entry_point('squigualiser==0.0.1', 'console_scripts', 'squigualiser')())
  File "/home/zhguo/Program/anaconda3/envs/venv3/lib/python3.9/site-packages/squigualiser-0.0.1-py3.9.egg/src/__init__.py", line 30, in main
  File "/home/zhguo/Program/anaconda3/envs/venv3/lib/python3.9/site-packages/squigualiser-0.0.1-py3.9.egg/src/plot.py", line 853, in run
  File "/home/zhguo/Program/anaconda3/envs/venv3/lib/python3.9/site-packages/squigualiser-0.0.1-py3.9.egg/src/plot.py", line 160, in plot_function
IndexError: index 9144 is out of bounds for axis 0 with size 8000

I don't know why this happens. Many thanks if you can help.

Attachment: test_data.zip
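The warnings above report colour columns of length 8315 next to x and y columns of length 8000, and the final IndexError is an index (9144) past the end of an 8000-element array. As a debugging aid, a mismatch like this can be caught before any plotting call. The following is a minimal sketch in plain Python; `check_column_lengths` is a hypothetical helper and is not part of squigualiser or Bokeh.

```python
def check_column_lengths(columns: dict[str, list]) -> None:
    """Raise ValueError if the named columns do not all have the same length."""
    lengths = {name: len(values) for name, values in columns.items()}
    if len(set(lengths.values())) > 1:
        raise ValueError(f"columns differ in length: {lengths}")


# Toy columns that mimic the lengths reported in the first warning.
columns = {
    "x": [0] * 8000,
    "y": [0] * 8000,
    "line_color": ["red"] * 8315,
}
try:
    check_column_lengths(columns)
    mismatch = None
except ValueError as err:
    mismatch = str(err)
print(mismatch)
```

Failing early like this points at the step that built the per-sample colours, instead of surfacing later as a warning or an out-of-bounds index.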

JeremyQuo commented 1 year ago

I suppose the index problem comes from the region option, which follows different logic from the case with no region option. I think the bug I hit is caused by an insertion; there seems to be a problem in how insertions are handled.

hiruna72 commented 1 year ago

@JeremyQuo,

Thanks a lot. I will get back to you asap.

hiruna72 commented 1 year ago

Hi @JeremyQuo,

Thank you for reporting this bug. I have updated reform.py to support different stride values. Hopefully, this should have resolved the issue. I tested with your dataset as well. Please use the latest dev commit.

hiruna72 commented 1 year ago

Hi @JeremyQuo,

Could you please let me know if this issue is resolved? I will close the issue for now. Feel free to reopen it.