Trinotate / Trinotate.github.io

web documentation for Trinotate

Issue with parsing names of blast hits in the final report table #34

Closed jcoludar closed 4 years ago

jcoludar commented 4 years ago

Hi guys, I have made a few test runs of the Trinotate pipeline on several venom gland transcriptome datasets, and I am having trouble getting anything meaningful out of the top BLAST hit columns (custom and Sprot). Most of the hits are blank, and the ones that are filled in look like this: COX1_RHISA^COX1_RHISA^Q:1-85,H:20-3^Cytochrome%ID^E:0^RecName: The subsequent columns then pick up the rest of the hit, one piece per column: Full=Cytochrome c oxidase subunit 1;^Eukaryota;

The last one is the column that is supposed to contain "custom database blastx".

Would highly appreciate if you have any suggestions on how to fix that.

Cheers, Ivan

brianjohnhaas commented 4 years ago

Hi,

It sounds like there's something corrupt in the way the blast header info is being reported for the custom database. Is there something peculiar about how the headers of those entries exist in the fasta file that's being searched? I'm wondering if there are tabs or something embedded in the fasta headers that are being propagated to the report.
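One way to check the tab hypothesis above is to scan the FASTA headers directly. The following is a minimal sketch, not part of Trinotate; the function name `headers_with_tabs` and the example records are made up for illustration.

```python
import io

def headers_with_tabs(fasta_lines):
    """Return (line_number, header) pairs for FASTA headers that contain a tab."""
    flagged = []
    for line_number, line in enumerate(fasta_lines, start=1):
        if line.startswith(">") and "\t" in line:
            flagged.append((line_number, line.rstrip("\n")))
    return flagged

# Hypothetical two-record FASTA; the second header has an embedded tab.
example = io.StringIO(
    ">seq1 clean header\nMKT\n"
    ">seq2\theader with a tab\nMLV\n"
)
bad = headers_with_tabs(example)
for line_number, header in bad:
    print(f"line {line_number}: {header!r}")
```

For a real database, pass an open file handle for the custom FASTA file instead of the in-memory example.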

In any case, if you want to share your sqlite database with me privately, I can look into the issue some more.

bhaas@broadinstitute.org

best,

~brian

--

Brian J. Haas The Broad Institute http://broadinstitute.org/~bhaas http://broad.mit.edu/~bhaas

briandiamondage-sema4 commented 4 years ago

Hi Ivan,

Thanks for sharing. I generated the report from your sqlite database, and the tab-delimited fields appear to be where they should be. If you're importing the file into MS Excel or a similar program, be sure to have it split on tabs only; then all the columns will line up.
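To illustrate why the delimiter matters, here is a minimal sketch that reads a tab-delimited table with Python's `csv` module. The two-column layout and the row contents are hypothetical, not taken from an actual report; the `^` separator inside the hit field follows the example quoted earlier in this thread.

```python
import csv
import io

# Hypothetical report: an ID column, then an annotation field that contains
# spaces, commas, colons and semicolons, but no tabs.
text = (
    "#gene_id\tannotation\n"
    "gene1\tHIT_A^HIT_A^Q:1-85,H:3-20^RecName: Full=Some protein;^Eukaryota; Metazoa\n"
)

# Split on tabs only; QUOTE_NONE stops quote characters from being interpreted.
reader = csv.reader(io.StringIO(text), delimiter="\t", quoting=csv.QUOTE_NONE)
header = next(reader)
rows = list(reader)

# Splitting the same row on any whitespace breaks the annotation apart.
naive = text.splitlines()[1].split()

# Within one field, the pieces of a hit are separated by '^'.
subfields = rows[0][1].split("^")

print(len(rows[0]), len(naive), subfields)
```

With tab-only splitting the annotation stays in one column, while whitespace splitting scatters it across several, which is the kind of misalignment a spreadsheet import with extra delimiters enabled can produce.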

Let's take it from there to see what changes might be needed beyond that.

best,

~brian

jcoludar commented 4 years ago

Hi Brian,

Thanks a lot for looking into the sqlite database. I made sure that the table was treated as tab-delimited; however, it seems that something went wrong when the table was first built from the database. I've rerun it, and it produced a properly parsed table (at half the size).

If you don't mind me asking another question: is there a way to use several custom databases with Trinotate, without tricking it by uploading search results from different databases as customdb blastp and customdb blastx?

Regards, Ivan

brianjohnhaas commented 4 years ago

Hi Ivan,

I'm glad it looks ok now. You should be able to upload as many custom databases as you want. They should end up as additional columns in the report.

best,

~brian

jcoludar commented 4 years ago

Hi Brian,

Thanks a lot! So just loading them with LOAD_custom_blast with different custom_db names would work? Sounds really nice ).

Regards, Ivan

brianjohnhaas commented 4 years ago

Exactly. Let me know if it gives you any trouble.
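As a rough sketch of what loading several custom databases could look like, the snippet below builds one `LOAD_custom_blast` command per search result. The flag names (`--outfmt6`, `--prog`, `--dbtype`) are assumptions recalled from the Trinotate documentation and should be checked against your installed version; the file names and database names are hypothetical.

```python
def load_custom_blast_commands(sqlite_db, searches):
    """Build one LOAD_custom_blast command per (outfmt6 file, program, db name)."""
    commands = []
    for outfmt6, prog, db_name in searches:
        commands.append([
            "Trinotate", sqlite_db, "LOAD_custom_blast",
            "--outfmt6", outfmt6,   # assumed flag: BLAST tabular output file
            "--prog", prog,         # assumed flag: blastp or blastx
            "--dbtype", db_name,    # assumed flag: name for this custom database
        ])
    return commands

# Hypothetical result files and database names.
searches = [
    ("toxins.blastp.outfmt6", "blastp", "toxins_db"),
    ("toxins.blastx.outfmt6", "blastx", "toxins_db"),
    ("housekeeping.blastp.outfmt6", "blastp", "housekeeping_db"),
]
commands = load_custom_blast_commands("Trinotate.sqlite", searches)
for cmd in commands:
    print(" ".join(cmd))
```

Each list can be passed to `subprocess.run(cmd, check=True)` to execute it. Giving each database its own name is what should yield separate columns in the report, as described above.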
