lpantano / seqbuster

pipeline for the analysis of small RNA data

An issue with miraligner #25

Open wangjingshen opened 5 years ago

wangjingshen commented 5 years ago

Hi, the details of the issue are:

$ java -jar miraligner.jar -sub 1 -trim 3 -add 3 -s hsa -i test/test.fa -db DB -o a
Format is not tabular,guessing fasta
species found
Go to mapping...
Mismatches: 1
Trimming: 3
Addition: 3
Species: hsa
...
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 1
    at miraligner.tools.getFreq(tools.java:122)
    at miraligner.map.readseq(map.java:296)
    at miraligner.Main.main(Main.java:99)

The test.fa file was downloaded from https://github.com/lpantano/seqbuster/tree/miraligner/miraligner/test and my Java version is:

openjdk version "1.8.0_151"
OpenJDK Runtime Environment (build 1.8.0_151-b12)
OpenJDK 64-Bit Server VM (build 25.151-b12, mixed mode)

Thank you!

wangjingshen commented 5 years ago

Sorry, I have since updated Java:

java version "11.0.1" 2018-10-16 LTS
Java(TM) SE Runtime Environment 18.9 (build 11.0.1+13-LTS)
Java HotSpot(TM) 64-Bit Server VM 18.9 (build 11.0.1+13-LTS, mixed mode)

This gives a new error:

Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: Index 1 out of bounds for length 1
    at miraligner.tools.getFreq(tools.java:122)
    at miraligner.map.readseq(map.java:296)
    at miraligner.Main.main(Main.java:99)

Thanks a lot!

lpantano commented 5 years ago

Hi,

Sorry about the issue.

Can you double-check that the binary comes from here:

https://github.com/lpantano/seqbuster/raw/miraligner/modules/miraligner/miraligner.jar

And can you use this file as the test:

https://github.com/lpantano/seqbuster/blob/miraligner/validator/sim.21.hsa.fa

It should work with Java 1.8.

Cheers

wangjingshen commented 5 years ago

Hi, thank you for your reply. I'm sorry for the late reply; my lab's server was not working.

The test file now works:

Reading reads
Nov 15, 2018 7:34:42 PM miraligner.map readseq
INFO: Number of reads to be mapped: 17858
Nov 15, 2018 7:34:42 PM miraligner.map readseq
INFO: Searching in precursors
Nov 15, 2018 7:34:44 PM miraligner.map readseq
INFO: Thu Nov 15 19:34:44 CST 2018
Nov 15, 2018 7:34:44 PM miraligner.map readseq
INFO: Num reads annotated: 17183

However, the data I need to process (SRX856896) still triggers the issue. I used trim_galore (https://www.bioinformatics.babraham.ac.uk/projects/trim_galore/) to cut the adapter, then converted the FASTQ file to FASTA. The FASTA file is 791M; could that be a cause of the issue?

The details of the issue are:

$ java -jar miraligner.jar -sub 1 -trim 3 -add 3 -s hsa -i SRX856896.fa -db DB -o test
Reading reads
Nov 15, 2018 5:14:00 PM miraligner.map readseq
INFO: Number of reads to be mapped: 8341908
Nov 15, 2018 5:14:00 PM miraligner.map readseq
INFO: Searching in precursors
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 1
    at miraligner.tools.getFreq(tools.java:122)
    at miraligner.map.readseq(map.java:296)
    at miraligner.Main.main(Main.java:99)

Thanks a lot

lpantano commented 5 years ago

Hi,

That is a memory problem. There are different ways to solve it, but normally you collapse your data first. You can follow these instructions:

https://seqcluster.readthedocs.io/collapse.html

You end up with something like this:

seq_index_x100 SEQUENCE

where x100 means the sequence occurs 100 times in your data.
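As a rough illustration of what collapsing means, here is a minimal Python sketch that merges identical reads and writes the copy number into the name. It is not the seqcluster implementation; the linked instructions remain the recommended route. The function names (`read_fasta_sequences`, `collapse`, `parse_count`), the `seq_<index>_x<count>` numbering scheme, and the leading `>` on each header are assumptions made for the example.

```python
from collections import Counter


def read_fasta_sequences(lines):
    """Yield one sequence per FASTA record, joining wrapped sequence lines."""
    chunks = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if chunks:
                yield "".join(chunks)
            chunks = []
        else:
            chunks.append(line)
    if chunks:
        yield "".join(chunks)


def collapse(lines):
    """Return (name, sequence) pairs, one per unique sequence, most abundant first."""
    counts = Counter(read_fasta_sequences(lines))
    return [
        (f"seq_{index}_x{count}", seq)  # hypothetical naming: seq_<index>_x<count>
        for index, (seq, count) in enumerate(counts.most_common(), start=1)
    ]


def parse_count(name):
    """Recover the copy number from a name such as 'seq_1_x100'."""
    _, sep, count = name.rpartition("_x")
    if not sep or not count.isdigit():
        raise ValueError(f"no _x<count> suffix in {name!r}")
    return int(count)


example = [
    ">r1", "TGAGGTAGTAGGTTGTATAGTT",
    ">r2", "TGAGGTAGTAGGTTGTATAGTT",
    ">r3", "ACGTACGT",
]
for name, seq in collapse(example):
    # The ">" FASTA marker is an assumption; the thread shows only the name and sequence.
    print(f">{name}\n{seq}")
```

Collapsing keeps one record per distinct sequence, so the number of records to hold in memory drops from the total read count to the number of unique sequences. `parse_count` shows how the count can be read back from a name, and that a name without an `_x<count>` suffix carries no count to read.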

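To increase the memory, add -Xms750m -Xmx16g right after java -jar (before miraligner.jar), so that they are read as JVM options rather than as miraligner arguments.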
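To make the option order concrete, here is a small Python sketch that assembles the command line with the heap settings in place. The helper name `miraligner_command` and its parameters are hypothetical; the miraligner flags are the ones used earlier in this thread. The sketch puts the heap options before `-jar`, the conventional spot; the key point is that they come before the jar path, because everything after the jar path is passed to the program itself.

```python
import shlex


def miraligner_command(jar, reads, db, out, species="hsa",
                       min_heap="750m", max_heap="16g"):
    """Build the argv list; JVM heap options must precede the jar path."""
    return [
        "java",
        f"-Xms{min_heap}",  # initial heap size
        f"-Xmx{max_heap}",  # maximum heap size
        "-jar", jar,
        "-sub", "1", "-trim", "3", "-add", "3",
        "-s", species,
        "-i", reads,
        "-db", db,
        "-o", out,
    ]


cmd = miraligner_command("miraligner.jar", "SRX856896.fa", "DB", "test")
print(shlex.join(cmd))
```

The list can be handed to `subprocess.run(cmd, check=True)` to launch the job. The 16g ceiling only helps if the machine actually has that much memory available.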

I hope this helps

cheers

On Thu, Nov 15, 2018 at 12:31 AM kianaknight notifications@github.com wrote:

Hi, thank you for your reply. I'm sorry for my late reply; my lab's server was not working. The test file works, but a new issue has arisen.

My raw data is SRX856896.sra. I then used trim_galore (https://www.bioinformatics.babraham.ac.uk/projects/trim_galore/) to cut the adapter. The commands are:

trim_galore SRX856896.fastq --length 10
java -jar miraligner.jar -sub 1 -trim 3 -add 3 -s hsa -i SRX856896_trimmed.fq -db DB -o test

The details of the issue are:

Reading reads
Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded
    at java.util.Arrays.copyOfRange(Arrays.java:3664)
    at java.lang.StringBuffer.toString(StringBuffer.java:669)
    at java.util.regex.Matcher.replaceAll(Matcher.java:959)
    at java.lang.String.replace(String.java:2240)
    at miraligner.tools.getseq(tools.java:87)
    at miraligner.map.readseq(map.java:86)
    at miraligner.Main.main(Main.java:99)

Thanks

wangjingshen commented 5 years ago

Hi, I got it working. Thanks a lot for your kind reply!