zhaoyanswill / RAPSearch2

Reduced Alphabet based Protein similarity Search

RapSearch2 v2.12 paired .aln files cannot be read by MEGAN 5.3.0 #5

Open MDSharma opened 10 years ago

MDSharma commented 10 years ago

Hello,

I will be raising this issue with the MEGAN authors anyway, but wanted to run it by you first. I have successfully used v2.12 to run paired-end blast searches against the nr database. However, when I try to import the pir.aln or the pir.m8 files into MEGAN 5.3.0, it crashes, citing what seem to be format-related issues. Please see the error messages below:

Executing: import blastFile=xxxx_data/rap_results/LM_4601/LM_4601.rap.pir.aln meganFile=xxxx_data/rap_results/LM_4601/LM_4601.rap.pir.aln.rma minScore=50.0 maxExpected=1.0 topPercent=10 minSupport=5 minComplexity=0.0 useMinimalCoverageHeuristic=false useSeed=false useCOG=false useKegg=true paired=false useIdentityFilter=false textStoragePolicy=0 blastFormat=RapSearch mapping='Taxonomy:GI_MAP=true,KEGG:GI_MAP=true';
Importing data:
Importing data: 0 reads file(s), 1 blast file(s)
Input format: RapSearch
TextStoragePolicy: Embed matches and reads in MEGAN file
Will stream through reads, not load them into memory (assumes that reads occur in same order in BLAST and FASTA files)
Processing LM_4601.rap.pir.aln
Processing LM_4601.rap.pir.aln
Processing RapSearch file(s)
Parse error: java.io.IOException: Failed to parse 'bits=' in: >M02023:17:000000000-A53F2:1:1101:16405:1365[5] vs gi|488585939|ref|XP_004478316.1| bits=28.4906[28.4906] logSum(E-value)=0.149291 log(E-value)=3.03[3.03] identity=75%[75]% aln-len=16[16] mismatch=4[4] gap-openings=0[0] nFrame=1[4]
Parse error: java.io.IOException: Token '>' not found at start of line: Query: 44 ELEKDDLGYLVEEISK 91 89 ELEKDDLGYLVEEISK 42
Parse error: java.io.IOException: Token '>' not found at start of line: ELE ++LGYL EEISK ELE ++LGYL EEISK
Parse error: java.io.IOException: Token '>' not found at start of line: Sbjct: 563 ELESEELGYLAEEISK 578 563 ELESEELGYLAEEISK 578
Parse error: java.io.IOException: Failed to parse 'bits=' in: >M02023:17:000000000-A53F2:1:1101:16405:1365[5] vs gi|565835107|ref|WP_023918686.1| bits=27.335[27.335] logSum(E-value)=0.809676 log(E-value)=3.38[3.38] identity=84.6154%[84.6154]% aln-len=13[13] mismatch=2[2] gap-openings=0[0] nFrame=2[5]

Any thoughts on this?

Cheers, MD

zhaoyanswill commented 10 years ago

Hi,

It looks like MEGAN wants the .m8 file rather than the .aln file. Thanks!

Sincerely, Yongan

MDSharma commented 10 years ago

It can't be just that. All single-read .aln files work with MEGAN. The .pir.aln (paired) files generate the messages I shared earlier. Thoughts?

zhaoyanswill commented 10 years ago

Hi,

The paired alignment file uses a different format so that it can show the information for both alignments. I'm not sure whether MEGAN supports it! Thanks!

Sincerely, Yongan
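To illustrate the format difference: in the paired header line from the error log at the top of this thread, most fields carry a second value in square brackets (for example `bits=28.4906[28.4906]`), which is plausibly what MEGAN's `bits=` parsing trips over. The Python sketch below splits such a header into (value, bracketed value) pairs. The function name `parse_paired_header` and the regular expression are illustrative assumptions inferred only from that one sample line, not from a documented RAPSearch2 format, and the thread does not say which alignment each value belongs to.

```python
import re

# Matches "key=value" optionally followed by "[second]", e.g. "bits=28.4906[28.4906]".
# Assumption: this pattern is inferred from the single header line quoted in the
# thread, not from a RAPSearch2 format specification.
FIELD = re.compile(r"(\S+?)=([^\s\[]+)(?:\[([^\]]*)\])?")


def parse_paired_header(line):
    """Return {field: (value, bracketed_value_or_None)} for one header line."""
    return {m.group(1): (m.group(2), m.group(3)) for m in FIELD.finditer(line)}


# Header line copied from the MEGAN error message in the first post.
header = (
    ">M02023:17:000000000-A53F2:1:1101:16405:1365[5] vs "
    "gi|488585939|ref|XP_004478316.1| bits=28.4906[28.4906] "
    "logSum(E-value)=0.149291 log(E-value)=3.03[3.03] identity=75%[75]% "
    "aln-len=16[16] mismatch=4[4] gap-openings=0[0] nFrame=1[4]"
)
fields = parse_paired_header(header)
print(fields["bits"])    # ('28.4906', '28.4906')
print(fields["nFrame"])  # ('1', '4')
```

A parser that expects a plain number after `bits=` would see `28.4906[28.4906]` instead, which is consistent with the "Failed to parse 'bits='" message, though only the MEGAN authors can confirm that this is the cause.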

MDSharma commented 10 years ago

Hello,

I took the R1 and R2 files from a paired-end run and merged them into a single R12 file, then ran this through Rapsearch2 to get a .aln and a .m8 file. Every single time I try to load either of these files in MEGAN 5.3, I get this error:

Parse error: java.io.IOException: Token '>' not found at start of line: M02023:25:000000000-A6UEU:1:1101:16095:1392

The line quoted in the error will of course differ depending on the sequence ID. I have verified that the source FASTA files do have a “>” at the start of each header line, e.g.:

>M02023:25:000000000-A6UEU:1:1101:17158:1732 1:N:0:23
GAATGGAATGGAATGGGGTGGAATAGAATGGAGTGGAGTGC

However, the .aln files generated via Rapsearch2.18 do not seem to have the “>” at the start of each new line:

M02023:25:000000000-A6UEU:1:1101:16095:1392

Any thoughts?

Best, MD

Dr. M D Sharma Associate Research Fellow Centre for Ecology & Conservation College of Life and Environmental Sciences University of Exeter Cornwall Campus Penryn TR10 9EZ

zhaoyanswill commented 10 years ago

Hi,

Thank you! I just noticed this bug. I'll fix it as soon as possible.

You may want to try a previous version of RAPSearch2, or write a script that adds '>' at the start of the first line of each alignment record in the .aln file.

Sincerely, Yongan
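A minimal Python sketch of the second workaround (prefixing the header lines with '>') might look like the following. It assumes, based only on the header line quoted in the error log earlier in this thread, that a header line can be recognised by containing both ` vs ` and `bits=`. The helper names `is_header_line` and `add_missing_markers` are hypothetical, and the heuristic should be checked against your own .aln output before relying on it.

```python
import sys


def is_header_line(line):
    # Assumption: alignment headers look like "<query> vs <subject> bits=...".
    # Adjust this test if your .aln headers are laid out differently.
    return " vs " in line and "bits=" in line


def add_missing_markers(lines):
    """Yield lines unchanged, except headers lacking a leading '>' get one."""
    for line in lines:
        if is_header_line(line) and not line.startswith(">"):
            yield ">" + line
        else:
            yield line


# Only runs when called as: python fix_aln.py input.aln fixed.aln
if __name__ == "__main__" and len(sys.argv) == 3:
    with open(sys.argv[1]) as src, open(sys.argv[2], "w") as dst:
        dst.writelines(add_missing_markers(src))
```

The script name `fix_aln.py` is arbitrary. Lines that already start with '>' are left alone, so running it on output from an unaffected RAPSearch2 version should not change the file.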

MDSharma commented 10 years ago

No worries.. glad to have found it.. I was going mental thinking that I was doing something bizarre!

zhaoyanswill commented 10 years ago

Sorry about that! :D

Sincerely, Yongan

On 7/16/2014 11:45 PM, MD Sharma wrote:

No worries.. glad to have found it.. I was going mental thinking that I was doing something bizarre!

Dr. M D Sharma Associate Research Fellow Centre for Ecology & Conservation College of Life and Environmental Sciences University of Exeter Cornwall Campus Penryn TR10 9EZ

M.D.Sharma@Exeter.ac.ukhttps://legacy.exeter.ac.uk/owa/redir.aspx?C=2550a33f6b904272a7b1147002360f76&URL=mailto%3aM.D.Sharma%40Exeter.ac.uk

http://www.publicationslist.org/MD.Sharmahttps://legacy.exeter.ac.uk/owa/redir.aspx?C=2550a33f6b904272a7b1147002360f76&URL=http%3a%2f%2fwww.publicationslist.org%2fMD.Sharma

http://www.researcherid.com/rid/F-8530-2013


On 17 July 2014 04:42, zhaoyanswill wrote:

Hi,

Thank you! I just noticed this bug. I'll fix it as soon as possible.

You may want to try a previous version of RAPSearch2, or write a script to add '>' at the start of the first line of every five lines of the .aln file.

Sincerely, Yongan
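The workaround suggested above (adding the missing '>' with a script) could be sketched roughly as follows. This is only an illustration, not the fix that went into RAPSearch2: the function names `fix_aln_lines` and `fix_aln_file` are made up here, and the rule for spotting a header line (it contains both " vs " and "bits=") is an assumption based on the header lines quoted in this thread rather than on a format specification. It also only restores the '>' marker; it does not touch the bracketed paired values such as `bits=28.4906[28.4906]`, which MEGAN appears to reject separately.

```python
def fix_aln_lines(lines):
    """Yield the given lines, adding '>' to alignment header lines that lack it.

    Assumption: a header line is any line containing both ' vs ' and 'bits=',
    as in the examples quoted in this thread. All other lines pass through.
    """
    for line in lines:
        looks_like_header = " vs " in line and "bits=" in line
        if looks_like_header and not line.startswith(">"):
            yield ">" + line
        else:
            yield line


def fix_aln_file(src_path, dst_path):
    """Write a copy of the .aln file at src_path with header markers restored."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        dst.writelines(fix_aln_lines(src))
```

It would be called along the lines of `fix_aln_file("sample.aln", "sample.fixed.aln")`, where both file names are placeholders. Matching on the header content rather than counting lines keeps the sketch from depending on every record having exactly the same number of lines.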

On 7/16/2014 10:52 PM, MD Sharma wrote:

Hello,

I took R1 and R2 files from a paired end read and merged them into a single R12 file. Then ran this through Rapsearch2 to get a .aln and a .m8 file. Every single time I try to load either of these files in MEGAN 5.3, I keep getting this error:

Parse error: java.io.IOException: Token '>' not found at start of line: M02023:25:000000000-A6UEU:1:1101:16095:1392

Obviously the sequence ID in the message differs from line to line. I have verified that the source fasta files do have a “>” at the start, e.g.:

>M02023:25:000000000-A6UEU:1:1101:17158:1732 1:N:0:23
GAATGGAATGGAATGGGGTGGAATAGAATGGAGTGGAGTGC

However, the .aln files generated via Rapsearch2.18 do not seem to have the “>” at the start of each new line: M02023:25:000000000-A6UEU:1:1101:16095:1392

Any thoughts?

Best, MD


On 17 June 2014 19:41, zhaoyanswill wrote:

Hi,

The paired alignment file uses a different format, so that it can show information for both alignments. I'm not sure whether MEGAN supports it! Thanks!

Sincerely, Yongan

On 6/17/2014 1:33 AM, MD Sharma wrote:

It can't be just that.. All single read .aln files work with MEGAN.. The .pir.aln (paired) files generate the message I shared earlier. Thoughts?

zhaoyanswill <notifications@github.com> wrote:

Hi,

It looks like MEGAN wants the .m8 file rather than the .aln file. Thanks!

Sincerely, Yongan

On 6/16/2014 6:47 PM, MD Sharma wrote:

Hello,

I will be raising this issue with the MEGAN authors anyway, but wanted to make sure I ran this by you first. I have successfully used v2.12 to run paired end blast searches against the nr database; however, when I try to import the pir.aln or the pir.m8 files into MEGAN 5.3.0, it crashes, citing what seem to be format-related issues. Please see the error messages below:

Executing: import blastFile=xxxx_data/rap_results/LM_4601/LM_4601.rap.pir.aln meganFile=xxxx_data/rap_results/LM_4601/LM_4601.rap.pir.aln.rma minScore=50.0 maxExpected=1.0 topPercent=10 minSupport=5 minComplexity=0.0 useMinimalCoverageHeuristic=false useSeed=false useCOG=false useKegg=true paired=false useIdentityFilter=false textStoragePolicy=0 blastFormat=RapSearch mapping='Taxonomy:GI_MAP=true,KEGG:GI_MAP=true';
Importing data:
Importing data: 0 reads file(s), 1 blast file(s)
Input format: RapSearch
TextStoragePolicy: Embed matches and reads in MEGAN file
Will stream through reads, not load them into memory (assumes that reads occur in same order in BLAST and FASTA files)
Processing LM_4601.rap.pir.aln
Processing LM_4601.rap.pir.aln
Processing RapSearch file(s)
Parse error: java.io.IOException: Failed to parse 'bits=' in: >M02023:17:000000000-A53F2:1:1101:16405:1365[5] vs gi|488585939|ref|XP_004478316.1| bits=28.4906[28.4906] logSum(E-value)=0.149291 log(E-value)=3.03[3.03] identity=75%[75]% aln-len=16[16] mismatch=4[4] gap-openings=0[0] nFrame=1[4]
Parse error: java.io.IOException: Token '>' not found at start of line: Query: 44 ELEKDDLGYLVEEISK 91 89 ELEKDDLGYLVEEISK 42
Parse error: java.io.IOException: Token '>' not found at start of line: ELE ++LGYL EEISK ELE ++LGYL EEISK
Parse error: java.io.IOException: Token '>' not found at start of line: Sbjct: 563 ELESEELGYLAEEISK 578 563 ELESEELGYLAEEISK 578
Parse error: java.io.IOException: Failed to parse 'bits=' in: >M02023:17:000000000-A53F2:1:1101:16405:1365[5] vs gi|565835107|ref|WP_023918686.1| bits=27.335[27.335] logSum(E-value)=0.809676 log(E-value)=3.38[3.38] identity=84.6154%[84.6154]% aln-len=13[13] mismatch=2[2] gap-openings=0[0] nFrame=2[5]

Any thoughts on this?

Cheers, MD
