novoalab / EpiNano

Detection of RNA modifications from Oxford Nanopore direct RNA sequencing reads (Liu*, Begik* et al., Nature Comm 2019)
GNU General Public License v2.0

Epinano Variants result #104

Closed yuxinPenny closed 2 years ago

yuxinPenny commented 2 years ago

The #Ref column in the Epinano_Variants results looks weird: it contains numbers rather than chromosome names.

(base) yuxin@bio-sg-1:~$ head /home/share/yuxin/2021Fall/DATA/hm_tmp/HMEC_WT/bam/HMEC_WT_g.minus_strand.per.site.csv
Ref,pos,base,strand,cov,q_mean,q_median,q_std,mis,ins,del
0,24738,C,-,1,11.00000,11.00000,0.00000,1.00000,0.00000,0.00000
0,24739,T,-,1,11.00000,11.00000,0.00000,1.00000,0.00000,0.00000
0,24740,G,-,1,12.00000,12.00000,0.00000,1.00000,0.00000,0.00000
0,24741,C,-,1,6.00000,6.00000,0.00000,1.00000,0.00000,0.00000
0,24742,T,-,1,6.00000,6.00000,0.00000,1.00000,0.00000,0.00000
0,24743,G,-,1,6.00000,6.00000,0.00000,1.00000,0.00000,0.00000
0,24744,A,-,1,12.00000,12.00000,0.00000,1.00000,0.00000,0.00000
0,24745,A,-,1,0.00000,0.00000,0.00000,0.00000,0.00000,1.00000
0,24746,G,-,1,0.00000,0.00000,0.00000,0.00000,0.00000,1.00000

(base) yuxin@bio-sg-1:~$ tail /home/share/yuxin/2021Fall/DATA/hm_tmp/HMEC_WT/bam/HMEC_WT_g.minus_strand.per.site.csv
60,134253345,G,-,4,12.00000,13.00000,4.58258,1.00000,0.00000,0.00000
60,134253346,G,-,4,14.75000,16.50000,7.29298,1.00000,0.00000,0.00000
60,134253347,C,-,4,8.75000,8.50000,3.03109,1.00000,0.00000,0.00000
60,134253348,T,-,4,7.75000,9.00000,3.69966,1.00000,0.00000,0.00000
60,134253349,G,-,4,7.25000,7.00000,1.78536,1.00000,0.00000,0.00000
60,134253350,T,-,4,6.00000,6.50000,1.87083,1.00000,0.00000,0.00000
60,134253351,G,-,2,5.00000,5.00000,0.00000,1.00000,0.00000,0.00000
60,134253352,C,-,1,6.00000,6.00000,0.00000,1.00000,0.00000,0.00000
60,134253353,A,-,1,4.00000,4.00000,0.00000,1.00000,0.00000,0.00000
60,134253354,G,-,1,5.00000,5.00000,0.00000,1.00000,0.00000,0.00000

The reference file I used seems normal:

(base) yuxin@bio-sg-1:~$ grep 'chr' /home/share/yuxin/2021Fall/DATA/hm_tmp/hg38.fa
chr1
chr10
chr11
chr11_KI270721v1_random
chr12
chr13
chr14
chr14_GL000009v2_random
chr14_GL000225v1_random
chr14_KI270722v1_random
chr14_GL000194v1_random
chr14_KI270723v1_random
chr14_KI270724v1_random
chr14_KI270725v1_random
chr14_KI270726v1_random
chr15

Can you give any clues?

Huanle commented 2 years ago

Hi @yuxinPenny, can you check the reference IDs in your SAM/BAM file to confirm that those weird ones are not there?
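One way to run that check is to look at the `@SQ` lines of the alignment header, since each one carries a reference name in its `SN:` field. The snippet below is only a rough sketch: it assumes the header text has already been obtained (for example with `samtools view -H`), and the function name `sam_reference_names` and the example header are made up for illustration.

```python
def sam_reference_names(header_text):
    """Return the SN (sequence name) of every @SQ line in a SAM header."""
    names = []
    for line in header_text.splitlines():
        fields = line.split("\t")
        if fields[0] != "@SQ":
            continue  # only @SQ lines describe reference sequences
        for field in fields[1:]:
            if field.startswith("SN:"):
                names.append(field[3:])
                break
    return names


# Made-up header for illustration; real text would come from the BAM file.
example_header = (
    "@HD\tVN:1.6\tSO:coordinate\n"
    "@SQ\tSN:chr1\tLN:248956422\n"
    "@SQ\tSN:chr2\tLN:242193529\n"
    "@PG\tID:minimap2\tPN:minimap2\n"
)
print(sam_reference_names(example_header))
```

If the names listed here are all ordinary chromosome names, the odd values in the per-site table were not inherited from the alignment itself.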

yuxinPenny commented 2 years ago

Yes, I am sure. Below is part of my BAM file. I also tested another tool (nanoRMS, which uses an edited version of EpiNano), and its result is normal.

6b5b9582-dd83-40c8-99b8-30899b07e97b 16 chr1 14403 1 69S16M3D5M3D23M1D4M1D4M3D25M5D7M3D5M1D6M1D4M1D16M1D11M6D7M1I29M2D4M1D6M1D24M1I10M1I14M1D14M3D6M3D3M2D2M1I2M1D6M1D9M1D14M1I9M1D15M1I18M1I6M2D5M1D16M3D2M1I14M1I16M939S * 0 0 CATGGGAATTGAGGATGTAGGGAGGATGGTGGGTTGAATATGATGTTATAGGGTATGGGATGGGAATTTGTTTCTGCTCAGTTCTTTGATTGAGCCGTTTTCTCTGGAAGCCTTTAAAGCATGGCGCAAGCTGGGTGGAGCCGTCCTTGAGCAGCATAAACAGTCCCGCCCAGCTGTGTGGCCTTAGCCAGCCTTCTTCAAGACCGGTCTCCACACAGTGCTGGTTCCGTCACCCTCCAAGGAGTAGGTCTGAGCAGCTTGTCCTGTGCCGGAGCCAGTGTCAGAGCAACGGCCAAGTCTGGGTCTGGGGAAGTCGGTAGCTCCTAGATTCCCAGCGTCCTCGTCCTCCTTCTGCCTGTGCCGCTGCGGTGGCGAACGGTGGTGGGATGAAGTCCGGTCACGGGCAAGGCTCCTCCGGGCCCAGCCAGCCCAAGGTCCTTTTCCCAGTGATGCCTTGCGCTCGACCAGCTTGTTGATCCGGTCAAGCCACCAGTGCCGGCTCTCACCAGCTCCTGCTCCACTTCTTCTTCTCCAGCTACCTATGCTGCGCAGCTGCTGGCCTTGCGCCGATGCCCCAGCTTGTGGCGGATGGACTCCAGTAGCGAGAGTGGCCCAGGCCACCGTGGGGTCGCGCCACTTCCCTGGGAGCTCCTGACTGAAAGGTACGCGCTGCTGCTGCTGTCGTCCCGCCTGCGGCGCCTTGGCCCGATTTGCGAGGGCCGCGGTGGTTGAAGGTGGGAGTAGGGGTGCACTGGCCGCTGAGATCGGGTGTGAAGGGGCGGTGGGGGAAGGATGTTACCCATCTTGAGTGGTCTTGAGAGGCTCGGCCAGCTTCAGTGTGGGCAGTTCCGGAATGGGCTGGACGGGATTGCTGGGCCCAGGTCGGCAATGACATGAAAGGTCGTTGGCAATGCCGGACAGGTCAGGGCAGGTAGGATGGAACATCAATCTCAGGCACCTGGCCCAGGTCTGGCACGTTGTAGTTCTCTCTCTGGGACCTACAGTTCCAGCTCTCTCTTGCTGATGGTGCAAGGGATCGTCAACAGCTTCTCTGTCGGTCTCTCCGCCCAGCATCGGCGGGTCTTTGTTACAGCACCAGTCAGGGGTCCAGGAAGACATACTTCTTCTTCAGGTTCTGTGATAAAGAGCAACAGAAATTTGCGCCAGAGCTGATAGCTGGGAAGACCCCAAAGTCCTCTTCTGCATCGTCAGTGGCTCTGCTCGGTGCTCAACGCACAGCGGTGGTCCTCAGCCAGCTTCTCCTGCAGGCCTGCCCGTCCAGGCGGTGCTTGCTCTGGATCCTGTGCGGGGGCGTCTTTCTGCAGGCCAGGGTCCTGGGCACCCGGTAAGGAGCCAATTCCTGCAGGCGCCTGGAGCAGGGTACTTGGCACTGA 
###/8(,0/$''##'&$/)&&%&%&/$%%$%%$'%&$$%&&%)(%%$('$#$$)-$%'0/'.3()66+)'(()-.//+.,(%&)%%%-0,699)',+),,(+(+,('$((&$&%$(+)))##')'-0-$&(+3,+)0')&(&(')%#$%'-,(&&23'(,.,+,/.+,,).-/)()''1&(-(+,&&&'%%%&/(,++-,12.++)%$-..9443+')-/0$'),)&&((+1.''&%#3--(3'&2)))%,%-)%%+''&&+21&104583.)1+(((+)+()*5:-%%%%''%'&$')#,%',)..+2/-'+++-/),)+-+(''&&(%&+%'$)%))()&''%('''&&%)+'+(-/,$#%%'%'+&','(%+45()&('((--)01,,(.0+&0&'(44+)%1,.,.8-%+($$$$&%1+/-)'&'%('&.2/-,)/(-+)/(.)'&(&%$(+/.16+&/07/-+(-).($()&-$%'&)&#%)(,-.++)(-2/0,'%+*$$%(.+--,,,,-&)))6-3+,&%$&(20.('46,,++('$%$(/-31./4,)(&'%%%#%&&'(00112.($+(0,+520&11/(+)()',.'*,)27)-2;/,(-0"&()/'(&$'%#'''()(-,++--,',)',.,&&')$$%&))+2,230709,$%&&&%$',5+$$%-,(('((%%',&1/-&''%/1-)()+)-%(%#(,&+''/.("%#$.+4<<,.02)+690$$%($%#&$'49.,.--,'(&%'2%%)$'++)''()'('&()''''&($,/))(--+++,'&''3,++0%''1-$&'$%'/6(+,'%)&&&(''(&$'/))'%&%%)&&,(,(,')'%&),/0).//63.+*)5+1)&%,,),,/-.+1.-3(()+,02.-01(,),.,+(-(((1(')(-*)&))/.*'()+0+37,(($#$%'.0%$"#&))'''%)'&('.%'(,+260&$(&'&)),..1/+))(((('&%''()+-'37-3-2+(%##&+++28,+-,/32440+(%%&$3:8(.:.20)&%(6-,++--+)+'$,-.282-(&$%''(('./+&&'$))))%#%#'%&$'&'(+++,,3:+,&%%1.(0'))0-+++)($+-//.+)(&%,+(((&&&()-2..,+(''(&)(&&%$)'$(&&'&&&&((','+&'&-)()(/)/)')&')(()-,'1,001/./.,6,)-0,('('69=.++)'))&)+,..-%'41).-)0<3-+03&%$#'%0((,&&(.-+())-16.,/&#')&.(,4))&&(&)()((&'& NM:i:86 ms:i:342 AS:i:342 nn:i:0 tp:A:P cm:i:9 s1:i:83 s2:i:82 de:f:0.1456 SA:Z:chr1,187380,-,813S187M13D394S,6,35;chr2,113595377,+,1S132M8D1261S,1,17; rl:i:0

Huanle commented 2 years ago

Can you provide the relevant EpiNano command, @yuxinPenny? Thanks.

yuxinPenny commented 2 years ago

java -jar /home/share/yuxin/miniconda3/envs/bioinfo/share/picard-2.23.8-0/picard.jar \
    CreateSequenceDictionary -R /home/share/yuxin/2021Fall/DATA/hm_tmp/hg38.fa \
    -O /home/share/yuxin/2021Fall/DATA/hm_tmp/hg38.dict

python3 /home/share/yuxin/EpiNano/Epinano_Variants.py \
    -n 10 -R /home/share/yuxin/2021Fall/DATA/hm_tmp/hg38.fa \
    -b /home/share/yuxin/2021Fall/DATA/hm_tmp/HMEC_WT/bam/HMEC_WT_g.bam \
    -s /home/share/yuxin/jvarkit/dist/sam2tsv.jar --type g

yuxinPenny commented 2 years ago

I found the problem: I should use the sam2tsv.jar that comes with the EpiNano package, rather than the one in my own jvarkit directory.

Huanle commented 2 years ago

yup, you are right! @yuxinPenny
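For anyone hitting the same symptom, a quick sanity check is to compare the first column of the per-site CSV against the sequence names in the reference FASTA. This is a minimal sketch, not part of EpiNano: the helper names `fasta_names` and `unexpected_refs` are hypothetical, it assumes the reference name is the first CSV column, and the tiny in-memory inputs stand in for the real files.

```python
import csv
import io


def fasta_names(fasta_lines):
    """Collect sequence names: the first word after '>' on each header line."""
    names = set()
    for line in fasta_lines:
        if line.startswith(">"):
            parts = line[1:].split()
            if parts:
                names.add(parts[0])
    return names


def unexpected_refs(per_site_csv, known_names):
    """Return first-column values of a per-site CSV that are not known names."""
    reader = csv.reader(per_site_csv)
    next(reader, None)  # skip the header row
    seen = {row[0] for row in reader if row}
    return sorted(seen - set(known_names))


# Tiny stand-ins for the real reference and per-site table.
fasta = io.StringIO(">chr1\nACGT\n>chr10 some description\nACGT\n")
table = io.StringIO(
    "Ref,pos,base,strand,cov,q_mean,q_median,q_std,mis,ins,del\n"
    "0,24738,C,-,1,11.00000,11.00000,0.00000,1.00000,0.00000,0.00000\n"
    "60,134253345,G,-,4,12.00000,13.00000,4.58258,1.00000,0.00000,0.00000\n"
)
print(unexpected_refs(table, fasta_names(fasta)))
```

An empty list means every reference value in the table is a known sequence name. Anything else, such as the `0` and `60` seen above, suggests that the sam2tsv output being parsed does not have the column layout Epinano_Variants.py expects.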