sihaohuanguc / NanoSPA

GNU General Public License v2.0

No results after running alignment step #2

Open Dongxu-Zheng opened 4 months ago

Dongxu-Zheng commented 4 months ago

Hi, I am working on m6A identification in my ONT data. I installed NanoSPA according to the instructions. I can run the test data successfully and get the output files in the plus_strand folder. However, I can't get any results from my own data: the job simply exits without reporting any error. The fastq files were basecalled by Guppy and are in fastq.gz format. I also tried concatenating them into one fastq.gz file, but that still didn't work.

I used the human genome downloaded from GENCODE. The software versions I used are: tensorflow 2.12.0, samtools 1.3.1, minimap2 2.28-r1209.

Here is the log file from when I submitted my job:

2024-04-20 15:58:29.005887: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: SSE4.1 SSE4.2 AVX AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
[M::mm_idx_gen::70.473*1.87] collected minimizers
[M::mm_idx_gen::99.855*2.20] sorted minimizers
[M::main::99.855*2.20] loaded/built the index for 194 target sequence(s)
[M::mm_mapopt_update::101.251*2.18] mid_occ = 2053
[M::mm_idx_stat] kmer size: 14; skip: 5; is_hpc: 0; #seq: 194
[M::mm_idx_stat::101.983*2.17] distinct minimizers: 65621209 (23.06% are singletons); average occurrences: 15.289; average spacing: 3.090; total length: 3099750718
[M::main] Version: 2.28-r1209
[M::main] CMD: minimap2 -ax splice -uf -k14 ./reference/GRCh38.primary_assembly.genome.fa ./test/temp/all_pass_fastq.fastq
[M::main] Real time: 102.263 sec; CPU: 221.678 sec; Peak RSS: 18.438 GB

Thanks for any help and suggestions.

Cheers, Dongxu

eltonjrv commented 4 months ago

Hello there! I'm facing the same issue with "nanospa alignment": no error messages are printed and no alignment results are generated. Hope someone can shed some light on this soon! Thanks

eltonjrv commented 4 months ago

Oops! My bad! Just realized that files must NOT be gzipped. It worked after I gunzipped the fastq files. Cheers, Elton
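For anyone hitting the same silent failure, here is a minimal Python sketch of the fix described above: detect gzipped inputs and write them out as one uncompressed FASTQ before running the alignment step. The function names and file paths are illustrative assumptions and are not part of NanoSPA.

```python
import gzip
import shutil
from pathlib import Path

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def is_gzipped(path):
    """Return True if the file starts with the gzip magic bytes."""
    with open(path, "rb") as handle:
        return handle.read(2) == GZIP_MAGIC


def write_plain_fastq(input_paths, out_path):
    """Concatenate FASTQ files into one uncompressed FASTQ.

    Gzipped inputs are decompressed on the fly; plain inputs are copied as-is.
    (Hypothetical helper, not a NanoSPA function.)
    """
    out_path = Path(out_path)
    with open(out_path, "wb") as out:
        for path in input_paths:
            opener = gzip.open if is_gzipped(path) else open
            with opener(path, "rb") as src:
                shutil.copyfileobj(src, out)
    return out_path
```

Checking the magic bytes rather than the file extension also catches gzipped files that were renamed to `.fastq`. The resulting plain FASTQ can then be used as the input for "nanospa alignment".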

Dongxu-Zheng commented 1 month ago

Hi Elton, it works for me as well. Thanks for sharing it with me. By the way, I was wondering if you aligned your fastq reads to a genome or a transcriptome?

Dongxu

eltonjrv commented 1 month ago

Hi Dongxu, I aligned against the reference genome. Best, Elton

Dongxu-Zheng commented 1 month ago

Thanks for your quick reply. I tried both the transcriptome and the genome. The transcriptome alignment yielded more m6A sites than the genome alignment. Also, when I aligned the reads to the genome, I lost the information about which transcript each site belongs to, since NanoSPA doesn't take an annotation file as input. Perhaps I should open a new thread to discuss this.

Cheers, Dongxu

eltonjrv commented 1 month ago

You can use bedtools intersect (https://bedtools.readthedocs.io/en/latest/content/tools/intersect.html) to identify which transcript each genomic m6A position belongs to. Best, Elton
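To illustrate what that overlap step does, here is a small pure-Python sketch of the same idea. It is not bedtools itself, and the record layout (chromosome, start, end, strand, transcript id) and function names are assumptions made for the example. With bedtools, the equivalent would be roughly `bedtools intersect -a sites.bed -b transcripts.bed -wa -wb -s`, where both file names are placeholders.

```python
from collections import defaultdict


def build_index(transcripts):
    """Group (chrom, start, end, strand, transcript_id) records by chromosome."""
    index = defaultdict(list)
    for chrom, start, end, strand, tx_id in transcripts:
        index[chrom].append((start, end, strand, tx_id))
    return index


def transcripts_at(index, chrom, pos, strand=None):
    """Return ids of transcripts whose span covers the 0-based position `pos`.

    Intervals are BED-style: 0-based start, exclusive end.
    If `strand` is given, only transcripts on that strand are reported.
    """
    return [
        tx_id
        for start, end, tx_strand, tx_id in index.get(chrom, [])
        if start <= pos < end and (strand is None or strand == tx_strand)
    ]
```

A site that falls inside several overlapping isoforms is reported for all of them, so this kind of intersection assigns a site to a locus rather than to a single isoform. Using exon-level intervals instead of whole-transcript spans would avoid matching sites to introns.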

Dongxu-Zheng commented 1 month ago

Thanks. I want isoform-level m6A modifications. If I map the reads to the genome, I only know which locus carries the modification. I will use the transcriptome and compare the results from NanoSPA and m6Anet. Anyway, many thanks for your reply!

Cheers, Dongxu