lpantano / seqcluster

small RNA analysis from NGS data
http://seqcluster.readthedocs.io
MIT License

miraligner Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 1 #18

Closed hmkim closed 8 years ago

hmkim commented 8 years ago

$ cat .command.sh

#!/bin/bash -ue

/BiO/BioTools/bcbio/data/anaconda/bin/miraligner -Xms705m -Xmx4500m -freq -sub 1 -trim 3 -add 3 -s hsa -i SRR950876_trimmed.fq.gz-collapse -db /BiO/BioTools/bcbio/data/genomes/Hsapiens/hg19/srnaseq -o ./

$ cat .command.sh | sh
Format is not tabular,guessing fasta
species found
Go to mapping...
Mismatches: 1
Trimming: 3
Addition: 3
Species: hsa
Fri Jun 17 01:34:43 KST 2016

Reading reads
Number of reads to be mapped: 374831
Searching in precursors
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 1
    at miraligner.tools.getFreq(tools.java:122)
    at miraligner.map.readseq(map.java:315)
    at miraligner.Main.main(Main.java:85)

$ head SRR950876_trimmed.fq.gz-collapse

1-1
AAAAAAAAAAAAAAAAAAAAAAA
2-1
AAAAAAAAAAAAAAAAAAAAAAAAA
3-1
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
4-1
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
5-1
AAAAAAAAAAAAAAAAAAAAAAAGG

Could you check it? Thanks.

lpantano commented 8 years ago

Hi, sorry about that. Any chance you can send me the fastq so I can debug locally and fix the error?


lpantano commented 8 years ago

Hi, I see the error. You are using the -freq flag, which is only used when the names in the fasta file carry the count values in this format: seq_index_xcounts. If they don't, you just need to remove -freq from the command line.

cheers
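For anyone hitting the same exception, a quick way to check a file before running with -freq is to scan its header lines. This is only a minimal sketch: the pattern below reads the "seq_index_xcounts" description literally, and both the regular expression and the function name `find_bad_headers` are assumptions for illustration, not anything taken from miraligner itself, which may accept other prefixes.

```python
import re

# Assumed pattern, read literally from "seq_index_xcounts";
# miraligner itself may be more permissive about the prefix.
HEADER_RE = re.compile(r"^>seq_\d+_x\d+$")


def find_bad_headers(lines):
    """Return FASTA header lines that do not look like >seq_<index>_x<count>."""
    return [
        line.strip()
        for line in lines
        if line.startswith(">") and not HEADER_RE.match(line.strip())
    ]


# Small in-memory example; an open file handle can be passed in the same way.
example = [">seq_116705_1", "CTCGGGTA", ">seq_116706_x3", "CTATGTAATCGA"]
print(find_bad_headers(example))  # prints ['>seq_116705_1']
```

If the list comes back empty, the names at least match the described format; if not, either rename the reads or drop -freq.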

hmkim commented 8 years ago

Thanks for your reply.

I changed the names to the 'seq_index_xcounts' format, but the error remains.

For example,

>seq_116705_1
CTCGGGTA
>seq_116706_3
CTATGTAATCGA

Does the -freq flag need the quality scale per read?

And how does the -freq flag work in seqcluster?

lpantano commented 8 years ago

It needs to be: >seq_116705_x1

You are missing the 'x' character before the number. This should work. Can you try it?

-freq just says that the read name carries count information, that's all. The output will then have a column with that number. It is needed if you want to use the output with the isomiRs package, but for nothing else. If you only want to annotate, you can ignore it.
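As a rough illustration of the fix, the headers can be rewritten so that the trailing number gets the 'x' prefix. This is a sketch under one stated assumption: the last underscore-separated field of the name is the read count. The helper names `add_x_prefix` and `read_count` are hypothetical and not part of seqcluster or miraligner.

```python
import re

# Assumption: a trailing _<digits> field in the header is the read count.
COUNT_RE = re.compile(r"_(\d+)$")


def add_x_prefix(header):
    """Turn >seq_116705_1 into >seq_116705_x1; leave _x<count> names alone."""
    return COUNT_RE.sub(r"_x\1", header.rstrip("\n"))


def read_count(header):
    """Extract the count from a name that ends in _x<count>."""
    match = re.search(r"_x(\d+)$", header.strip())
    if match is None:
        raise ValueError(f"no _x<count> suffix in {header!r}")
    return int(match.group(1))


fixed = add_x_prefix(">seq_116705_1")
print(fixed, read_count(fixed))  # prints >seq_116705_x1 1
```

Names that already end in `_x<count>` pass through unchanged, so the rewrite is safe to run twice on the same file.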


hmkim commented 8 years ago
>seq_1_x1
AAAAAAAAAAAAAAAAAAAAAAA
>seq_2_x1
AAAAAAAAAAAAAAAAAAAAAAAAA
>seq_3_x1
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
>seq_4_x1
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
>seq_5_x1
AAAAAAAAAAAAAAAAAAAAAAAGG
>seq_6_x1
AAAAAAAAAAAAAAAAAAAAATAAAATTCCCTTGCCCCAATAATATC
>seq_7_x1
AAAAAAAAAAAAGTAAGGTCTGTATCTGCTGACCCCACCCTTCTCCGAAG
>seq_8_x1
AAAAAAAAAAACTTTTACGGATCTGGCTTCTGAGA
>seq_9_x1
AAAAAAAAAAAGTGCACTCTGCCTATACACC
>seq_10_x1
AAAAAAAAAAATCACTGAACCC
>seq_11_x1
AAAAAAAAAAATTCCACCACGTTCCCGTGG
>seq_12_x1
AAAAAAAAAAGATTGTGGGGC
...

I tried it with the data above, but it still has a problem.

Could you check again please?

Thanks.

lpantano commented 8 years ago

What error do you get with this?

I don't get any error if I use those names and sequences.

hmkim commented 8 years ago

Sorry, my mistake. It works well.

Thanks to @lpantano