GoekeLab / xpore

Identification of differential RNA modifications from nanopore direct RNA sequencing
https://xpore.readthedocs.io/
MIT License

data problems data.json #141

Closed Tako-liu closed 2 years ago

Tako-liu commented 2 years ago

Hi! Thanks for such excellent software. I have some doubts about the dataprep output for Arabidopsis; maybe this is not actually a problem. The generated data.json file looks like this:

{"AT1G64460.1":{"21":{"ACTTT":[NaN,NaN]},"22":{"CTTTG":[NaN,NaN]},"23":{"TTTGC":[NaN,NaN]},"24":{"TTGCC":[NaN,NaN]},"25":{"TGCCT":[NaN]},"26":{"GCCTT":[NaN,NaN]},"27":{"CCTTT":[NaN,NaN]},"28":{"CTTTC":[NaN,NaN]},"29":{"TTTCT":[NaN,NaN]}

The whole file looks like this. I would like to know whether this is correct and whether it will affect the downstream analysis.

The reference files I used:

gtf: Arabidopsis_thaliana.TAIR10.52.gtf
fasta: Arabidopsis_thaliana.TAIR10.cdna.all.fa

Here are the commands I used:

xpore dataprep --eventalign reads-ref.eventalign.txt --out_dir arabidopsis_transcriptome_dataprep --n_processes 32

nanopolish eventalign --reads reads.fasta --bam reads-ref.sorted.bam --genome /Arabidopsis_thaliana.TAIR10.cdna.all.fa --scale-events > reads-ref.eventalign.txt

Here are the first ten lines of reads-ref.eventalign.txt:

contig position reference_kmer read_index strand event_index event_level_mean event_stdv event_length model_kmer model_mean model_stdv standardized_level
AT5G26800.1 54 CCGAG 0 t 3 94.57 3.734 0.00432 CCGAG 95.86 3.99 -0.27
AT5G26800.1 54 CCGAG 0 t 4 92.50 2.848 0.00266 CCGAG 95.86 3.99 -0.71
AT5G26800.1 54 CCGAG 0 t 5 96.28 3.657 0.00498 CCGAG 95.86 3.99 0.09
AT5G26800.1 55 CGAGG 0 t 6 111.03 9.270 0.00730 CGAGG 111.04 5.69 -0.00
AT5G26800.1 56 GAGGA 0 t 7 121.67 5.279 0.00232 GAGGA 112.67 7.84 0.97
AT5G26800.1 57 AGGAA 0 t 8 109.18 7.980 0.00332 AGGAA 117.46 3.17 -2.22
AT5G26800.1 58 GGAAA 0 t 9 121.25 8.279 0.01096 GGAAA 121.47 5.56 -0.03
AT5G26800.1 59 GAAAA 0 t 10 106.17 1.483 0.00465 GAAAA 106.42 2.68 -0.08
AT5G26800.1 60 AAAAG 0 t 11 103.29 2.825 0.00531 AAAAG 101.72 2.68 0.50

I hope to get your answer, and thank you for taking the time to answer my question despite your busy schedule.

yuukiiwa commented 2 years ago

Hi @Tako-liu,

You will have to rerun nanopolish eventalign with --signal-index included. Here is the example command from xpore's documentation:

nanopolish eventalign --reads <PATH/TO/FASTQ_FILE> \
--bam <PATH/TO/BAM_FILE> \
--genome <PATH/TO/FASTA_FILE> \
--signal-index \
--scale-events \
--summary <PATH/TO/summary.txt> \
--threads 32 > <PATH/TO/eventalign.txt>
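Before rerunning xpore dataprep, it may help to confirm that the regenerated eventalign file really carries the signal-index columns. The sketch below is only an illustration, not part of xpore or nanopolish: it assumes the file is tab-separated and that --signal-index appends columns named start_idx and end_idx, and the helper name missing_signal_index_columns is hypothetical.

```python
import io

# Assumption: these are the column names nanopolish appends for --signal-index.
SIGNAL_INDEX_COLUMNS = ("start_idx", "end_idx")


def missing_signal_index_columns(handle):
    """Return the signal-index columns absent from an eventalign header line.

    `handle` is any text file object positioned at the start of the file.
    Assumes a tab-separated header (hypothetical helper, not an xpore API).
    """
    header = handle.readline().rstrip("\n").split("\t")
    return [col for col in SIGNAL_INDEX_COLUMNS if col not in header]


# Header as posted in this issue, i.e. produced without --signal-index.
posted_header = "\t".join([
    "contig", "position", "reference_kmer", "read_index", "strand",
    "event_index", "event_level_mean", "event_stdv", "event_length",
    "model_kmer", "model_mean", "model_stdv", "standardized_level",
]) + "\n"

print(missing_signal_index_columns(io.StringIO(posted_header)))
# prints ['start_idx', 'end_idx']
```

For a real file, the same function can be called on open("reads-ref.eventalign.txt"); an empty list would suggest the header includes both columns.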

Thanks!

Best wishes, Yuk Kei

Tako-liu commented 2 years ago

Hi @yuukiiwa,

Your advice works! Thanks for the help!