ISUgenomics / bioinformatics-workbook

Bioinformatics Workbook repository
https://bioinformaticsworkbook.org
MIT License

Braker2: Acquiring Transcript/EST data for a de novo genome #64

Closed DustinSokolowski closed 2 years ago

DustinSokolowski commented 2 years ago

Hello!

Thank you for the great bioinformatics workbook. I've been using the Braker tutorial as inspiration to annotate my own de novo genome and it's been quite helpful thus far.

I am trying to generate the transcript/EST information as part of the (required?) input for Braker2. Is transcript/EST information an entirely different data type from RNA-seq, so that this step must be skipped if I don't have it? Or can it be generated from the RNA-seq data?

I was thinking that if it can be generated, then perhaps I'd generate a GTF file with StringTie or genome-guided Trinity, pull the FASTA sequences from that with bedtools, and then re-align the FASTA to the genome. That said, I may be way off base and missing something obvious.

If you have any advice here then that would be very valuable!

Best, Dustin

remkv6 commented 2 years ago

Hi Dustin,

The ESTs/transcripts are not required, though using UniProt (manually curated) proteins will improve your prediction. You can absolutely skip it. Good luck!

Rick
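
For readers following the thread, here is a minimal sketch of what skipping the transcript/EST input can look like in practice. It only assembles a `braker.pl` argument list and does not run anything. The `--genome`, `--bam`, and `--prot_seq` options are BRAKER's own; the helper `braker_command` and every file name are hypothetical placeholders, and it assumes the RNA-seq reads have already been aligned to the genome and sorted into a BAM file. Whether both evidence types can be combined in one run, and which extra options that needs, depends on the BRAKER version, so check the README for the installed release.

```python
from typing import List, Optional


def braker_command(genome: str,
                   bam: Optional[str] = None,
                   proteins: Optional[str] = None) -> List[str]:
    """Assemble a braker.pl argument list (hypothetical helper; nothing is run).

    genome   -- genome assembly FASTA
    bam      -- RNA-seq reads aligned to the genome (sorted BAM), optional
    proteins -- protein FASTA, e.g. curated UniProt sequences, optional
    No transcript/EST file appears anywhere in the command.
    """
    # Design choice of this sketch: require at least one evidence source.
    if bam is None and proteins is None:
        raise ValueError("provide RNA-seq alignments, proteins, or both")
    cmd = ["braker.pl", f"--genome={genome}"]
    if bam is not None:
        cmd.append(f"--bam={bam}")            # RNA-seq evidence
    if proteins is not None:
        cmd.append(f"--prot_seq={proteins}")  # protein evidence
    return cmd


# Placeholder file names, for illustration only.
rnaseq_run = braker_command("genome.fa", bam="rnaseq.bam")
protein_run = braker_command("genome.fa", proteins="uniprot_sprot.fasta")
print(" ".join(rnaseq_run))
print(" ".join(protein_run))
```

The resulting list can be passed to `subprocess.run` once the paths point at real files; the command is kept as a list so that paths containing spaces need no shell quoting.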

DustinSokolowski commented 2 years ago

Hey!

Thanks so much for clearing it up!

Best, Dustin