hiruna72 / squigualiser

Visualise and analyse nanopore (ONT) raw signals
https://hiruna72.github.io/squigualiser/
MIT License
108 stars 2 forks source link

Squigualiser will not accept slow5 file created by squigulator #52

Closed jsthv closed 8 months ago

jsthv commented 9 months ago

Squigualiser would not accept a slow5 file for an ideal signal produced by the current version of squigulator:

'. At src/slow5.c:732R] Malformed slow5 header. Bad minor version '0
[slow5_init::ERROR] Parsing slow5 header of file 'BC1_54mer_ideal_signal_290mer.slow5' failed. At src/slow5.c:153
./squigualiser: line 35: 2386 Segmentation fault squigualiser "$@"

This is the header to the file in question.

slow5_version 0.2.0

num_read_groups 1

@asic_id asic_id_0
@exp_start_time 2022-07-20T00:00:00Z
@experiment_type genomic_dna
@flow_cell_id FAN00000
@run_id run_0
@sequencing_kit sqk-lsk114

char uint32_t double double double double uint64_t int16_t char* double int32_t uint8_t uint64_t

read_id read_group digitisation offset range sampling_rate len_raw_signal raw_signal channel_number median_before read_number start_mux start_time

S1_1!Barcode!0!278!+ 0 8192 13.380569 1536.598389 4000 2700

hasindu2008 commented 9 months ago

That's strange. Could you copy paste the commandline you used on squigulator so I can try?

hasindu2008 commented 9 months ago

Also could you please run slow5tools quickcheck on the input slow5 file so we can see if something is wrong with the slow5 file?

If it is not too big, you can share the file with us to investigate.
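Alongside slow5tools quickcheck, a quick look at the first header line can be sketched in Python. This is only an illustration and not part of slow5tools or squigualiser: the function names are hypothetical, and the sketch assumes the text SLOW5 layout in which the first line is `#slow5_version`, a tab, and a three-part version. It also flags a carriage return in that line, which the garbled error message in the report hints at but which nothing in this thread confirms.

```python
from typing import List


def check_slow5_version_line(first_line: bytes) -> List[str]:
    """Return a list of problems found in the first line of a text SLOW5 file."""
    problems = []
    # A carriage return usually means the file was saved with Windows line endings.
    if b"\r" in first_line:
        problems.append("carriage return found in the version line")
    text = first_line.decode("ascii", errors="replace").rstrip("\r\n")
    key, sep, version = text.partition("\t")
    if key != "#slow5_version" or not sep:
        problems.append("first line is not '#slow5_version<TAB>x.y.z'")
        return problems
    parts = version.split(".")
    if len(parts) != 3 or not all(p.isdigit() for p in parts):
        problems.append("version %r is not three dot-separated integers" % version)
    return problems


def check_slow5_file(path: str) -> List[str]:
    """Read only the first line of the file (hypothetical helper) and check it."""
    with open(path, "rb") as handle:
        return check_slow5_version_line(handle.readline())
```

An empty list means the version line looks well formed; it says nothing about the rest of the file, which is what slow5tools quickcheck is for.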

jsthv commented 9 months ago

Hasindu,

Thanks for looking into this. I have attached all the files I used for this run. The slow5 file was produced with -ideal and -full-contigs. I reran everything, and now get the following errors:

@.***:~/squigualiser$ ./squigualiser plot -f BC1_290mer.fa -s BC1_290mer_out.slow5 -a BC1_290mer_align.sam -o /home/taylor/squigualiser --tag_name "BC1"
sequence file: BC1_290mer.fa
alignment file: BC1_290mer_align.sam
signal file: BC1_290mer_out.slow5
Info: Signal to reference method using SAM/BAM ...
Traceback (most recent call last):
  File "/home/taylor/squigualiser/bin/squigualiser", line 8, in
    sys.exit(main())
  File "/home/taylor/squigualiser/lib/python3.8/site-packages/src/init.py", line 56, in main
    args.func(args)
  File "/home/taylor/squigualiser/lib/python3.8/site-packages/src/plot.py", line 784, in run
    if args_ref_start < reference_start + 1:
TypeError: '<' not supported between instances of 'NoneType' and 'int'


jsthv commented 9 months ago

Hasindu,

Sorry for taking up your time. I have gotten your program to work. I finally read the instructions more carefully and realized that I had included a SAM alignment file and not a PAF file (I only paid attention to the -a ). I also had not included a read file, which I have now produced with squigulator. There still seems to be an error concerning an index file (see below) that might be related to an issue with assignment.

See the attached pdf, which shows that the alignment of the nucleotides based on the files created by squigulator appears to be 6 nucleotides off. I used the read file created by squigulator. The sequence above is the one that I am working with, along with another sequence containing a 6-mer deletion of AATTAG. I used squigulator to simulate both ideal signals and squigualiser to align the sequence with the signal. What I find, however, is that the alignment created by squigualiser is off by 6 nucleotides (I also found this to be true simply by reading the sam file and comparing it to the signal). The part of the signal that disappears is not the one assigned to AATTAG. Somehow an offset of 6 nucleotides has not been incorporated.

How do I get squigualiser to align the signal properly with the sequence in the future? Is there something that I am missing? Sorry, I am new to all this. I am working with DNA photomodifications and trying to understand how they affect nanopore signals.

sequence file: BC1_278mer_read.fa
alignment file: BC1_278mer_align.paf
signal file: BC1_278mer_out.slow5
Info: Signal to read method using PAF ...
plot region: 1-278
read_id: S1_1!Barcode!0!278!+
[slow5_idx_init::INFO] Index file not found. Creating an index at 'BC1_278mer_out.slow5.idx'.
output file: /home/taylor/squigualiser/S1_1!Barcode!0!278!+_BC1minus6.html



hiruna72 commented 9 months ago

Hello @jsthv,

  1. Which version of squigualiser are you using (squigualiser --version)? The error regarding 'NoneType' and 'int' was fixed in the latest release; it occurred when the user did not provide a region. Could you please check whether you have the latest version (v0.5.1) installed?
  2. The message about the missing slow5 index file is only a warning. The index is created automatically when it is not found, so the message can be ignored.
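The `TypeError` in the traceback is what Python raises when `None` is compared with an integer, which fits a region that was never supplied. A minimal sketch of one way to guard against it is shown below. The function `resolve_plot_start` and its arguments are hypothetical and are not taken from squigualiser's `plot.py`; the actual fix in the release may differ, and the 0-based `reference_start` convention is an assumption.

```python
from typing import Optional


def resolve_plot_start(requested_start: Optional[int], reference_start: int) -> int:
    """Pick a 1-based start position for plotting (hypothetical helper).

    requested_start: 1-based start of a user-supplied region, or None if no
        region was given on the command line.
    reference_start: start of the alignment on the reference, assumed 0-based.
    """
    first_aligned = reference_start + 1  # convert the assumed 0-based start to 1-based
    if requested_start is None:
        # No region supplied: fall back to the first aligned base instead of
        # comparing None with an int.
        return first_aligned
    # A region was supplied: never start before the first aligned base.
    return max(requested_start, first_aligned)
```

With this kind of guard, a missing region falls back to the start of the alignment rather than raising an exception.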

Since you replied by email, I guess the files were not attached. Could you reply on GitHub and attach the two html files (with and without the deletion)? If you can attach the two .fa files, I can reproduce the results on my end as well.

jsthv commented 9 months ago

Squigulator v0.2.2, Squigualiser v0.3.0. I think these were the precompiled binaries available on GitHub, which I downloaded recently. I have attached the two FASTA files in question, a 290mer and a 278mer, as a zip file. The 290mer is a dimer of a 145mer containing the AATTAG sequence. squigulator.zip

hiruna72 commented 9 months ago

Hello @jsthv,

Thanks for the files.

There are a couple of ways to achieve the objective.

The easiest is to add Ns to fill the deletions in the 278 fasta file and generate the plots.

The best method is to use a .vcf file and generate a reference. Then use that reference with squigulator to simulate reads with variants (https://github.com/hasindu2008/squigulator#examples).

I wrote a script to perform the first method. The script and the plots can be found in the attached zip file. jsthv.zip
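For readers without the zip file, the first method can be sketched roughly as follows. This is not the attached script: `pad_deletions` is a hypothetical helper, and the deletion coordinates are assumed to be known already as 0-based positions in the shortened sequence.

```python
from typing import Iterable, Tuple


def pad_deletions(sequence: str, deletions: Iterable[Tuple[int, int]]) -> str:
    """Insert runs of 'N' where bases were deleted (hypothetical helper).

    deletions: (position, length) pairs, where position is the 0-based index
        in the shortened sequence at which the deleted bases used to start.
    """
    padded = sequence
    # Work from the right so earlier positions stay valid after each insertion.
    for position, length in sorted(deletions, reverse=True):
        if not 0 <= position <= len(sequence):
            raise ValueError("position %d is outside the sequence" % position)
        padded = padded[:position] + "N" * length + padded[position:]
    return padded
```

For example, padding two 6-base deletions in a 278-base sequence gives a 290-base sequence, so the padded sequence has the same length as the 290mer.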

Let me know if there are many such sequences you are analysing. Then I can write a proper script.


hiruna72 commented 8 months ago

Hi @jsthv,

Hope your issues are resolved. Please feel free to reopen the issue should you need further help.