zhqingit / giremi

GIREMI is a method that can identify RNA editing sites using one RNA-seq data set without requiring genome sequence data.

error:Can't find snv in ss!i #9

Open Angelven opened 8 years ago

Angelven commented 8 years ago

Hi Qing.

I made an input.vcf containing about 660,000 SNVs, about 15% of which are in dbSNP.

I got an error while running giremi:

             [mpileup] 1 samples in 1 input files
             error:Can't find snv in ss!i

Does this mean that an SNV was not found in the .bam file?

I also made a small input.vcf from the large one, containing about 20,000 SNVs, and GIREMI seems to run well on it.

Best. Li

Angelven commented 8 years ago

To find the SNV causing the error, I made a smaller input.vcf (3301 SNVs) and ran "strace ./giremi". The error occurred again:

 munmap(0x2ac0bca65000, 2289664)         = 0
 munmap(0x2ac0bcc94000, 1146880)         = 0
 munmap(0x2ac0bb97c000, 17731584)        = 0
 close(4)                                = 0
 munmap(0x2ac0bcdac000, 135536640)       = 0
 fstat(1, {st_mode=S_IFREG|0600, st_size=251, ...}) = 0
 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2ac0bb97c000
 write(1, "coor:235473079\tchr:chr1coor:2354"..., 4096) = 4096
 write(2, "error:Can't find snv in ss!i\n", 29error:Can't find snv in ss!i
 ) = 29
 --- SIGSEGV (Segmentation fault) @ 0 (0) ---
 +++ killed by SIGSEGV (core dumped) +++

So I thought SNV chr1:235473079 might be the troublemaker. I then extracted only 1500 SNVs (SNV chr1:235473079 and the SNVs surrounding it) from the 3301-SNV list, and that runs well:

 munmap(0x2b9a071a7000, 4096)            = 0
 write(1, "43258455\tchr:chr1coor:243258455\t"..., 4096) = 4096
 rt_sigaction(SIGINT, {SIG_IGN, [], SA_RESTORER, 0x30894326a0}, {SIG_DFL, [], 0}, 8) = 0
 rt_sigaction(SIGQUIT, {SIG_IGN, [], SA_RESTORER, 0x30894326a0}, {SIG_DFL, [], 0}, 8) = 0
 rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
 clone(child_stack=0, flags=CLONE_PARENT_SETTID|SIGCHLD, parent_tidptr=0x7fff29ce8078) = 29120
 wait4(29120, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 29120
 rt_sigaction(SIGINT, {SIG_DFL, [], SA_RESTORER, 0x30894326a0}, NULL, 8) = 0
 rt_sigaction(SIGQUIT, {SIG_DFL, [], SA_RESTORER, 0x30894326a0}, NULL, 8) = 0
 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
 --- SIGCHLD (Child exited) @ 0 (0) ---
 write(2, "Analysis DONE!\n", 15Analysis DONE!
)        = 15
 write(1, "ore '--args finput=\"./GoldenSet/"..., 239) = 239
 exit_group(1)                           = ?

I also got the .res file, so it seems giremi can find this SNV in the .bam file. I wonder how this error occurs and what I should do to fix it.
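The narrowing-down step described above (keeping the suspect SNV plus its neighbours) could be scripted roughly as follows. This is only a sketch: the function name `snv_window` is hypothetical, and it assumes the input file is tab-delimited with the chromosome in column 1 and the coordinate in column 2, which may not match the exact layout of a given GIREMI input file.

```python
def snv_window(lines, chrom, pos, size):
    """Return up to `size` consecutive SNV records centred on chrom:pos.

    Assumption: tab-delimited records, chromosome in column 1 and
    coordinate in column 2; lines starting with '#' are headers.
    """
    records = [ln for ln in lines if ln.strip() and not ln.startswith("#")]

    # Locate the record of interest by its first two columns.
    target = None
    for i, ln in enumerate(records):
        fields = ln.rstrip("\n").split("\t")
        if fields[0] == chrom and int(fields[1]) == pos:
            target = i
            break
    if target is None:
        raise ValueError(f"{chrom}:{pos} not found in input")

    # Take a window that starts half a window before the target.
    start = max(0, target - size // 2)
    return records[start:start + size]
```

Calling `snv_window(open("input.vcf"), "chr1", 235473079, 1500)` would then give a subset comparable to the 1500-SNV list used in the test above.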

zhqingit commented 8 years ago

Hi Li,

In my experience, this error is caused by a bam file in which some reads map to more than one chromosome. I think you can try testing for this.

Best, Qing
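One way to test for this is to look for read names that appear on more than one chromosome in the alignments. The sketch below is not part of GIREMI; the function name `multi_chrom_reads` is hypothetical, and it assumes the input is SAM-format text (for example the output of `samtools view -h input.bam`). It treats both "the same read name aligned to different chromosomes" and "a mate placed on another chromosome" as suspicious, which is an interpretation of the description above rather than a confirmed definition.

```python
from collections import defaultdict

UNMAPPED = 0x4  # SAM flag bit: the read itself is unmapped


def multi_chrom_reads(sam_lines):
    """Return {read name: set of chromosomes} for reads seen on >1 chromosome."""
    chroms = defaultdict(set)
    for line in sam_lines:
        if line.startswith("@"):  # skip SAM header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:  # not a complete alignment record
            continue
        qname, flag, rname, rnext = fields[0], int(fields[1]), fields[2], fields[6]
        if flag & UNMAPPED or rname == "*":
            continue
        chroms[qname].add(rname)
        # RNEXT is "=" when the mate is on the same reference, "*" when unknown.
        if rnext not in ("=", "*"):
            chroms[qname].add(rnext)
    return {name: c for name, c in chroms.items() if len(c) > 1}
```

If the returned dictionary is non-empty, those read names are candidates for removal when re-filtering the bam file.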

Angelven commented 8 years ago

Hi Qing,

You were right. I split my input.vcf by chromosome into 24 files, and each of them runs well.

Because giremi first calculates MI from the full set of SNVs, splitting my input is not appropriate, so I will re-filter my input.bam file instead.

Thank you.

Best, Li
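For anyone repeating the per-chromosome check described in this comment, a minimal sketch follows. The function name `split_by_chromosome` is hypothetical, and it assumes a tab-delimited input with the chromosome name in the first column and optional header lines starting with `#`. As noted above, this is useful only as a diagnostic, since the MI estimate is meant to come from the full SNV set.

```python
from collections import defaultdict


def split_by_chromosome(lines):
    """Group SNV records by chromosome (first tab-separated column).

    Returns {chromosome: [header lines..., record lines...]}, with the
    original record order preserved within each chromosome.
    """
    lines = list(lines)  # allow any iterable, e.g. an open file
    header = [ln for ln in lines if ln.startswith("#")]
    groups = defaultdict(list)
    for ln in lines:
        if ln.startswith("#") or not ln.strip():
            continue
        chrom = ln.split("\t", 1)[0]
        groups[chrom].append(ln)
    # Prepend the shared header to every per-chromosome list.
    return {chrom: header + recs for chrom, recs in groups.items()}
```

Each value can then be written to its own file and passed to giremi separately to see which chromosomes, if any, still trigger the error.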